Docs

A compact quickstart for launch: auth, base URL, integration examples, and model naming guidance for the Agumbe LLM gateway.

Quickstart

The smallest believable path from account to first successful request.

Create a gateway API key from the Tokens page, or use your signed-in session for first-run testing in the playground.

Use an app-scoped API key when one app should always use the same guardrails, or a tenant-scoped API key when you need to choose the app per request.

Call the Agumbe gateway endpoints at https://api.agumbe.ai and use /api/v1/llm/models to inspect supported models.

Track wallet health, requests, and basic spend inside the Billing, Requests, and Dashboard pages.
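
As a rough sketch of the third step, the request to the models endpoint could be assembled as below. The helper name buildModelsRequest and the GatewayRequest type are hypothetical. The response shape of /api/v1/llm/models is not described on this page, so the sketch only builds the request and leaves parsing to the caller.

```typescript
// Base URL and models path are taken from the quickstart steps above.
const AGUMBE_BASE_URL = "https://api.agumbe.ai";
const MODELS_PATH = "/api/v1/llm/models";

// Hypothetical shape for a request that has been built but not yet sent.
interface GatewayRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
}

// Hypothetical helper: builds the models request with a bearer API key.
function buildModelsRequest(apiKey: string): GatewayRequest {
  if (apiKey.trim() === "") {
    throw new Error("An Agumbe API key is required");
  }
  return {
    url: `${AGUMBE_BASE_URL}${MODELS_PATH}`,
    method: "GET",
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}
```

The result can be handed to any HTTP client, for example fetch(req.url, { method: req.method, headers: req.headers }).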

Auth notes

Use a signed-in session for first-run testing in the console, or a bearer API key for backend and production traffic.

Bearer token

Use Authorization: Bearer AGUMBE_API_KEY for direct API access from your application. App-scoped API keys always use the bound app’s guardrails. Tenant-scoped API keys let you choose the app per request.

Session auth

The console can also proxy requests with your authenticated browser session for a fast first-run experience before you mint a dedicated API key.

Guardrails stance

Guardrails are opt-in and app-level first. The MVP surface covers prompt injection, PII, secrets, denied topics, groundedness, allowed models, max tokens, and rate limits.

API key types

Choose how app guardrails are resolved when your backend calls the gateway.

Tenant-scoped API key

Use one key across multiple apps. Choose the app’s guardrails per request by sending agumbe_guardrails_app_id.

App-scoped API key

Bind the key to one app. That app’s guardrails are applied automatically on every request, and the key cannot be used for another app.

How guardrails are applied

The app owns the guardrail policy. The API key determines how the app is resolved.

Tenant-scoped key example

Choose the app per request by sending agumbe_guardrails_app_id.

{
  "model": "gpt-5.2",
  "messages": [
    { "role": "user", "content": "Hello" }
  ],
  "agumbe_guardrails_app_id": "app_support"
}

App-scoped key example

The bound app is used automatically. No per-request app id is required.

{
  "model": "gpt-5.2",
  "messages": [
    { "role": "user", "content": "Hello" }
  ]
}

If an app-scoped key is used with a different app id, the gateway rejects the request with 403 app_mismatch.
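
One way to keep the two key types apart in backend code is to build the request body from the key scope, so an app-scoped key never sends an app id that could trigger 403 app_mismatch. This is a minimal sketch: the names KeyScope, ChatRequestBody, and buildChatBody are hypothetical, and only the agumbe_guardrails_app_id field comes from the examples above.

```typescript
type KeyScope = "tenant" | "app";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  agumbe_guardrails_app_id?: string;
}

// Hypothetical helper: includes the app id only for tenant-scoped keys.
function buildChatBody(
  scope: KeyScope,
  model: string,
  messages: ChatMessage[],
  appId?: string,
): ChatRequestBody {
  const body: ChatRequestBody = { model, messages };
  if (scope === "tenant" && appId !== undefined) {
    // Tenant-scoped key: the app is chosen per request.
    body.agumbe_guardrails_app_id = appId;
  }
  // App-scoped key: the bound app applies automatically, so no app id is sent.
  return body;
}
```

Calling buildChatBody("tenant", "gpt-5.2", messages, "app_support") reproduces the tenant-scoped example, while any call with scope "app" omits the field even if an app id is passed by mistake.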

Gateway request examples

Use the same base URL for chat completions and embeddings.

curl https://api.agumbe.ai/api/v1/llm/chat/completions \
  -H "Authorization: Bearer $AGUMBE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      { "role": "system", "content": "You are Agumbe AI." },
      { "role": "user", "content": "Summarize this support ticket in one sentence." }
    ],
    "max_completion_tokens": 180
  }'
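
The page mentions embeddings but shows only a chat completions call. The sketch below shows what an embeddings request might look like under two loudly stated assumptions: that the path is /api/v1/llm/embeddings (inferred from the OpenAI-compatible layout, not confirmed here), and that the body uses the OpenAI-style model and input fields. The helper name buildEmbeddingsRequest is hypothetical.

```typescript
// Base URL from this page. The "/embeddings" suffix is an ASSUMPTION based on
// the OpenAI-compatible layout; check the gateway reference before relying on it.
const LLM_BASE_URL = "https://api.agumbe.ai/api/v1/llm";
const ASSUMED_EMBEDDINGS_PATH = "/embeddings";

interface EmbeddingsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds (but does not send) an embeddings request.
function buildEmbeddingsRequest(apiKey: string, input: string[]): EmbeddingsRequest {
  return {
    url: `${LLM_BASE_URL}${ASSUMED_EMBEDDINGS_PATH}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // ASSUMPTION: "model" and "input" follow the OpenAI-compatible embeddings shape.
    body: JSON.stringify({ model: "text-embedding-3-small", input }),
  };
}
```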

After the playground

What developers should do once a request works and the team is ready to integrate it into a real codebase.

Lock the request shape in the playground

Use the playground to settle on the model, prompt structure, token ceiling, and guardrail policy you want for the app.

Choose the right API key type

Use an app-scoped API key when one app should always use the same guardrails. Use a tenant-scoped API key when your system needs to choose the app per request.

Call the gateway from your backend

Treat Agumbe as your single LLM base URL. Keep provider switching, allowed models, rate limits, and app-level guardrails behind that one integration.

Operate through usage and requests

Use Dashboard, Billing, Requests, and Guardrails to watch latency, errors, token usage, and policy outcomes after the app goes live.

Integration code examples

Server-side examples using a dedicated Agumbe API key with an OpenAI-compatible request shape.

import OpenAI from "openai";

// Point the OpenAI SDK at the Agumbe gateway instead of the default OpenAI endpoint.
const client = new OpenAI({
  apiKey: process.env.AGUMBE_API_KEY!,
  baseURL: "https://api.agumbe.ai/api/v1/llm"
});

// Summarize a support ticket; returns "" if the response carries no content.
export async function runSupportSummary(ticketText: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-5.2",
    messages: [
      { role: "system", content: "You summarize support tickets." },
      { role: "user", content: ticketText }
    ],
    // Cap the length of the generated summary.
    max_completion_tokens: 220
  });

  return response.choices?.[0]?.message?.content ?? "";
}

How teams should use the gateway

Use the gateway as the stable integration layer between your application code and the underlying model providers.

Prefer backend-to-gateway traffic

For production systems, call the gateway from your API, worker, or service layer. That keeps API keys off end-user devices and gives you one place to add retries, fallbacks, and audit logging.
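
A backend wrapper for retries might look like the sketch below. It assumes that 429 and 5xx responses are worth retrying, which this page does not state, and every name in it (isRetryableStatus, backoffDelayMs, withRetries) is hypothetical. The operation is passed in as a function so the same wrapper can sit around any gateway call.

```typescript
// ASSUMPTION: 429 and 5xx are transient. The docs do not list retryable statuses.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff: baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 4000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Runs the operation until it returns a 2xx status, a non-retryable status,
// or maxAttempts is reached. The sleep function is injectable for tests.
async function withRetries<T>(
  operation: () => Promise<{ status: number; value: T }>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const result = await operation();
    if (result.status >= 200 && result.status < 300) {
      return result.value;
    }
    const isLastAttempt = attempt + 1 >= maxAttempts;
    if (isLastAttempt || !isRetryableStatus(result.status)) {
      throw new Error(`Gateway request failed with status ${result.status}`);
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

Treating 403 as non-retryable matters here: a rejection such as app_mismatch will not succeed on a second attempt, so the wrapper surfaces it immediately.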

Scope API keys by app or environment

Create separate API keys for staging, production, and high-sensitivity apps so guardrails, billing visibility, and future limits map cleanly to each workload.
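
In application code, per-environment keys can be resolved from configuration rather than hard-coded. The sketch below is illustrative only: the variable names AGUMBE_API_KEY_STAGING and AGUMBE_API_KEY_PRODUCTION and the helper resolveApiKey are hypothetical and should be replaced with whatever your secret store provides.

```typescript
type Environment = "staging" | "production";

// Hypothetical variable names, one key per environment.
const KEY_VARIABLES: Record<Environment, string> = {
  staging: "AGUMBE_API_KEY_STAGING",
  production: "AGUMBE_API_KEY_PRODUCTION",
};

// Hypothetical helper: looks up the key for an environment and fails loudly
// if it is missing, so staging never silently falls back to a production key.
function resolveApiKey(
  environment: Environment,
  env: Record<string, string | undefined>,
): string {
  const variable = KEY_VARIABLES[environment];
  const key = env[variable];
  if (key === undefined || key === "") {
    throw new Error(`Missing ${variable} for ${environment}`);
  }
  return key;
}
```

In a Node.js service the second argument would typically be process.env.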

Use the console as the control plane

The console is where teams manage auth, tokens, request visibility, billing, and guardrails. Your product code should just call the gateway and rely on those controls centrally.

Model naming examples

Use the model ids returned from /api/v1/llm/models. Agumbe supports OpenAI-compatible aliases as well as catalog-backed model ids.

gpt-5.2

gpt-4.1-mini

text-embedding-3-small
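
A service can check its configured model id against the gateway's list at startup instead of discovering a typo on the first request. This sketch assumes you have already extracted an array of id strings from the /api/v1/llm/models response, whose exact shape is not documented here. The helper name requireSupportedModel is hypothetical.

```typescript
// Hypothetical helper: returns the configured id if the gateway lists it,
// otherwise throws so the misconfiguration is caught early.
function requireSupportedModel(
  configured: string,
  supportedIds: readonly string[],
): string {
  if (supportedIds.indexOf(configured) === -1) {
    throw new Error(`Model "${configured}" is not in the gateway's model list`);
  }
  return configured;
}

// Example ids copied from the list above; a real list comes from the endpoint.
const exampleIds = ["gpt-5.2", "gpt-4.1-mini", "text-embedding-3-small"];
const chatModel = requireSupportedModel("gpt-5.2", exampleIds);
```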