Docs
A compact quickstart for launch: auth, base URL, integration examples, and model naming guidance for the Agumbe LLM gateway.
Create a gateway API key from the Tokens page, or use your signed-in session for first-run testing in the playground.
Use an app-scoped API key when one app should always use the same guardrails, or a tenant-scoped API key when you need to choose the app per request.
Call the Agumbe gateway endpoints at https://api.agumbe.ai and use /api/v1/llm/models to inspect supported models.
Track wallet health, requests, and basic spend inside the Billing, Requests, and Dashboard pages.
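For the models lookup mentioned above, a minimal sketch of the request might look like the following. The helper name buildModelsRequest and the HttpRequest type are hypothetical, and the sketch assumes the models endpoint accepts the same bearer API key as the rest of the gateway. The response format is not described in this quickstart, so the usage comment only logs whatever JSON comes back.

```typescript
// Hypothetical helper: describes a GET request to the gateway's models endpoint.
// Assumption: the endpoint accepts the same bearer key as other gateway calls.
type HttpRequest = {
  url: string;
  method: "GET";
  headers: Record<string, string>;
};

const GATEWAY_BASE_URL = "https://api.agumbe.ai";

function buildModelsRequest(apiKey: string): HttpRequest {
  if (apiKey.trim() === "") {
    throw new Error("An Agumbe API key is required");
  }
  return {
    url: `${GATEWAY_BASE_URL}/api/v1/llm/models`,
    method: "GET",
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Usage (any runtime with a global fetch, e.g. Node 18+):
//   const req = buildModelsRequest(apiKey);
//   const res = await fetch(req.url, { method: req.method, headers: req.headers });
//   console.log(await res.json()); // response shape is not documented here
```

Keeping the request description separate from the fetch call makes it easy to check the URL and header in a unit test without touching the network.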
Bearer token
Send the header Authorization: Bearer AGUMBE_API_KEY, substituting your gateway API key, for direct API access from your application. App-scoped API keys always use the bound app’s guardrails. Tenant-scoped API keys let you choose the app per request.
Session auth
The console can also proxy requests with your authenticated browser session for a fast first-run experience before you mint a dedicated API key.
Guardrails stance
Guardrails are opt-in and app-level first. The MVP surface covers prompt injection, PII, secrets, denied topics, groundedness, allowed models, max tokens, and rate limits.
Tenant-scoped API key
Use one key across multiple apps. Choose which app’s guardrails apply to each request by sending agumbe_guardrails_app_id in the request body.
App-scoped API key
Bind the key to one app. That app’s guardrails are applied automatically on every request, and the key cannot be used for another app.
Tenant-scoped key example
Choose the app per request by sending agumbe_guardrails_app_id in the JSON request body.
{
  "model": "gpt-5.2",
  "messages": [
    { "role": "user", "content": "Hello" }
  ],
  "agumbe_guardrails_app_id": "app_support"
}
App-scoped key example
The bound app is used automatically. No per-request app id is required.
{
  "model": "gpt-5.2",
  "messages": [
    { "role": "user", "content": "Hello" }
  ]
}

curl https://api.agumbe.ai/api/v1/llm/chat/completions \
  -H "Authorization: Bearer $AGUMBE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      { "role": "system", "content": "You are Agumbe AI." },
      { "role": "user", "content": "Summarize this support ticket in one sentence." }
    ],
    "max_completion_tokens": 180
  }'

Lock the request shape in the playground
Use the playground to settle on the model, prompt structure, token ceiling, and guardrail policy you want for the app.
Choose the right API key type
Use an app-scoped API key when one app should always use the same guardrails. Use a tenant-scoped API key when your system needs to choose the app per request.
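In practice, the choice of key type mostly changes one field in the request body. The sketch below shows one way to build the body for either case. The names buildChatBody, ChatMessage, and ChatRequestBody are hypothetical; only model, messages, and agumbe_guardrails_app_id come from the examples in this quickstart.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type ChatRequestBody = {
  model: string;
  messages: ChatMessage[];
  // Only sent with tenant-scoped keys; app-scoped keys use the bound app.
  agumbe_guardrails_app_id?: string;
};

// Hypothetical helper: pass appId only when calling with a tenant-scoped key.
function buildChatBody(
  model: string,
  messages: ChatMessage[],
  appId?: string
): ChatRequestBody {
  const body: ChatRequestBody = { model, messages };
  if (appId !== undefined) {
    body.agumbe_guardrails_app_id = appId;
  }
  return body;
}

// Tenant-scoped key: the app is chosen per request.
const tenantBody = buildChatBody(
  "gpt-5.2",
  [{ role: "user", content: "Hello" }],
  "app_support"
);

// App-scoped key: no app id is sent.
const appBody = buildChatBody("gpt-5.2", [{ role: "user", content: "Hello" }]);
```

Omitting the field entirely for app-scoped keys, rather than sending an empty value, keeps the body identical to the app-scoped example shown earlier.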
Call the gateway from your backend
Treat Agumbe as your single LLM base URL. Keep provider switching, allowed models, rate limits, and app-level guardrails behind that one integration.
Operate through usage and requests
Use Dashboard, Billing, Requests, and Guardrails to watch latency, errors, token usage, and policy outcomes after the app goes live.
import OpenAI from "openai";

// Point the OpenAI SDK at the Agumbe gateway instead of the default base URL.
const client = new OpenAI({
  apiKey: process.env.AGUMBE_API_KEY!,
  baseURL: "https://api.agumbe.ai/api/v1/llm"
});

export async function runSupportSummary(ticketText: string) {
  const response = await client.chat.completions.create({
    model: "gpt-5.2",
    messages: [
      { role: "system", content: "You summarize support tickets." },
      { role: "user", content: ticketText }
    ],
    max_completion_tokens: 220
  });
  // Fall back to an empty string if the response carries no message content.
  return response.choices?.[0]?.message?.content ?? "";
}

Prefer backend-to-gateway traffic
For production systems, call the gateway from your API, worker, or service layer. That keeps API keys off end-user devices and gives you one place to add retries, fallbacks, and audit logging.
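One way to centralize retries in that backend layer is a small wrapper around the gateway call. This is a generic sketch, not part of the Agumbe API: withRetries, backoffDelayMs, and the RetryOptions fields are hypothetical names, and the wrapper retries on any thrown error, which a real service would likely narrow to timeouts and retryable status codes.

```typescript
type RetryOptions = {
  maxAttempts: number; // total attempts, including the first call
  baseDelayMs: number; // delay before the second attempt
  sleep?: (ms: number) => Promise<void>; // injectable for tests
};

// Exponential backoff: baseDelayMs, then 2x, 4x, ... for attempts 1, 2, 3, ...
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

async function withRetries<T>(
  call: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  if (options.maxAttempts < 1) {
    throw new Error("maxAttempts must be at least 1");
  }
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await call();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final failure.
      if (attempt < options.maxAttempts) {
        await sleep(backoffDelayMs(attempt, options.baseDelayMs));
      }
    }
  }
  throw lastError;
}

// Usage with any function that returns a promise, for example:
//   const summary = await withRetries(() => runSupportSummary(ticketText), {
//     maxAttempts: 3,
//     baseDelayMs: 250,
//   });
```

Injecting sleep keeps the wrapper testable without real delays, and the same wrapper is a natural place to add audit logging around each attempt.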
Scope API keys by app or environment
Create separate API keys for staging, production, and high-sensitivity apps so guardrails, billing visibility, and future limits map cleanly to each workload.
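A backend can make that separation explicit by resolving the key from the deployment environment at startup. The sketch below is one possible arrangement; the environment names, the variable names AGUMBE_API_KEY_STAGING and AGUMBE_API_KEY_PRODUCTION, and the helper resolveApiKey are all assumptions for illustration.

```typescript
type Environment = "staging" | "production";

// Hypothetical variable names; use whatever your secret store provides.
const KEY_VARIABLE_BY_ENV: Record<Environment, string> = {
  staging: "AGUMBE_API_KEY_STAGING",
  production: "AGUMBE_API_KEY_PRODUCTION",
};

// Looks up the key for one environment and fails fast if it is missing,
// so a staging deployment can never silently fall back to a production key.
function resolveApiKey(
  environment: Environment,
  env: Record<string, string | undefined>
): string {
  const variable = KEY_VARIABLE_BY_ENV[environment];
  const key = env[variable];
  if (!key) {
    throw new Error(`Missing ${variable} for ${environment}`);
  }
  return key;
}

// Usage in Node: const apiKey = resolveApiKey("production", process.env);
```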
Use the console as the control plane
The console is where teams manage auth, tokens, request visibility, billing, and guardrails. Your product code should just call the gateway and rely on those controls centrally.
gpt-5.2
gpt-4.1-mini
text-embedding-3-small