Spendra’s gateway is designed to feel like an OpenAI-compatible endpoint while enforcing Spendra policies before provider spend occurs. OpenAI-compatible Responses, Files, and Chat Completions routes proxy to OpenAI; provider-specific Chat Completions routes can proxy to additional upstream providers.
Authentication
Gateway requests authenticate with a Spendra scoped key:
Authorization: Bearer spk_live_<key_id>_<secret>
The secret is shown only once, at key creation. Store it in your client or agent secret manager, not in source control.
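Because the key follows the spk_live_<key_id>_<secret> shape, a client can log the key ID without ever logging the secret. A minimal sketch, assuming the key ID itself contains no underscore; the parseScopedKey helper is illustrative and not part of Spendra:

```typescript
// Illustrative helper: split a Spendra scoped key into its parts so the
// key ID can be logged safely while the secret stays out of logs.
// Assumes the key_id segment contains no underscore.
interface ScopedKey {
  env: string;    // environment tag, e.g. "live"
  keyId: string;  // safe to log
  secret: string; // never log or commit
}

function parseScopedKey(token: string): ScopedKey {
  const match = /^spk_([a-z]+)_([^_]+)_(.+)$/.exec(token);
  if (!match) throw new Error("not a Spendra scoped key");
  return { env: match[1], keyId: match[2], secret: match[3] };
}
```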
OpenAI SDK setup
Existing OpenAI integrations can keep the OpenAI SDK unchanged and swap only the API key and base URL:
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SPENDRA_API_KEY,
  baseURL: "https://api.spendra.example/v1",
});
Use the API hostname from your Spendra deployment. In production, prefer a stable internal or customer-owned domain instead of an ephemeral platform URL.
Provider-specific chat
Spendra supports provider-specific Chat Completions routing at:
POST /v1/providers/{provider}/chat/completions
Supported provider IDs for this route are openai, openrouter, google, vertexai, azure, and anthropic.
Example:
curl https://api.spendra.example/v1/providers/openrouter/chat/completions \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: openrouter-chat-001" \
-d '{
"model": "openai/gpt-4.1-mini",
"messages": [
{ "role": "user", "content": "Run a governed provider chat test." }
]
}'
Model scopes are provider-aware. OpenAI keys can use bare OpenAI model IDs such as gpt-4.1-mini; non-OpenAI chat keys must grant {provider}/{model}, such as openrouter/openai/gpt-4.1-mini, azure/gpt-4.1-mini, or vertexai/google/gemini-2.0-flash.
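One way to read the scope rule above: OpenAI grants are bare model IDs, while every other provider's grant is the model ID prefixed with the provider ID. A sketch of that matching as a standalone check; the scopeAllows function and the flat grant list are assumptions for illustration, not Spendra's implementation:

```typescript
// Sketch: provider-aware model scope check.
// OpenAI chat keys grant bare model IDs (e.g. "gpt-4.1-mini");
// other providers grant "{provider}/{model}" (e.g. "openrouter/openai/gpt-4.1-mini").
function scopeAllows(grants: string[], provider: string, model: string): boolean {
  const required = provider === "openai" ? model : `${provider}/${model}`;
  return grants.includes(required);
}
```

A key granted ["gpt-4.1-mini", "openrouter/openai/gpt-4.1-mini"] can call gpt-4.1-mini on OpenAI and openai/gpt-4.1-mini via OpenRouter, but not gpt-4.1-mini via Azure.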
Idempotency
Send an idempotency key with gateway requests when your client can retry:
idempotency-key: request-2026-05-06-0001
Spendra uses idempotency across request intake, reservation, settlement, spend event creation, ledger booking, and outbox processing. Replayed requests must never double-book ledger entries or double-increment budget counters.
If a retry arrives while the first request is still pending, Spendra returns HTTP 409 with idempotency_in_progress. If the original request already settled, Spendra returns HTTP 409 with settlement metadata and idempotency_replay_unavailable because prompt and response bodies are not retained for replay. Failed reservations are not replayed.
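A retrying client can branch on the two 409 outcomes described above: back off and retry while the first attempt is pending, and treat a settled replay as success without expecting the LLM body. A sketch of that branching; the classifyRetry function and action names are illustrative, only the HTTP status and error codes come from the docs:

```typescript
// Sketch: interpret a gateway response seen during a client retry.
// "idempotency_in_progress"        -> first attempt still pending; back off and retry.
// "idempotency_replay_unavailable" -> original already settled; treat as success,
//                                     but only settlement metadata is returned,
//                                     never the original prompt/response body.
type RetryAction = "wait_and_retry" | "settled_no_body" | "pass_through";

function classifyRetry(status: number, code?: string): RetryAction {
  if (status === 409 && code === "idempotency_in_progress") return "wait_and_retry";
  if (status === 409 && code === "idempotency_replay_unavailable") return "settled_no_body";
  return "pass_through"; // any other response is handled normally
}
```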
Responses API
curl https://api.spendra.example/v1/responses \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: setup-test-001" \
-d '{
"model": "gpt-4.1-mini",
"input": "Run a Spendra gateway setup test.",
"max_output_tokens": 120
}'
The gateway preserves the provider-native request and response shape for the original request where possible. Spendra records metadata needed for policy, reservation, settlement, audit, and ledgering without storing prompt or response bodies. A settled idempotency replay does not call the provider or book spend again, but it also does not replay the original LLM body.
curl https://api.spendra.example/v1/responses \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: image-test-001" \
-d '{
"model": "gpt-4.1-mini",
"input": [{
"role": "user",
"content": [
{ "type": "input_image", "image_url": "https://example.com/chart.png", "detail": "high" },
{ "type": "input_text", "text": "Summarize the chart." }
]
}]
}'
curl https://api.spendra.example/v1/responses \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: pdf-test-001" \
-d '{
"model": "gpt-4.1-mini",
"input": [{
"role": "user",
"content": [
{ "type": "input_file", "filename": "report.pdf", "file_data": "data:application/pdf;base64,..." },
{ "type": "input_text", "text": "List the key financial controls." }
]
}]
}'
Files API
Upload a file through Spendra when your client uses OpenAI file IDs:
curl https://api.spendra.example/v1/files \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-F purpose="user_data" \
-F file="@report.pdf"
Then reference the returned file ID in a Responses request:
curl https://api.spendra.example/v1/responses \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: file-id-test-001" \
-d '{
"model": "gpt-4.1-mini",
"input": [{
"role": "user",
"content": [
{ "type": "input_file", "file_id": "file_abc123" },
{ "type": "input_text", "text": "Summarize this document." }
]
}]
}'
Spendra stores file metadata only. Uploaded file bytes pass through to OpenAI and are not retained by Spendra.
Chat Completions API
curl https://api.spendra.example/v1/chat/completions \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: chat-test-001" \
-d '{
"model": "gpt-4.1-mini",
"messages": [
{ "role": "user", "content": "Run a Spendra chat gateway setup test." }
]
}'
The base /v1/chat/completions route proxies to OpenAI. Use /v1/providers/{provider}/chat/completions when routing chat requests to OpenRouter, Google Gemini API, Vertex AI, Azure OpenAI, or Anthropic.
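The route choice above can be expressed as a tiny helper; the chatRoute function is illustrative, the paths and provider IDs are from this page:

```typescript
// Sketch: pick the gateway chat path for a provider.
// The base route proxies to OpenAI; every other supported provider
// goes through the provider-specific route.
const CHAT_PROVIDERS = ["openai", "openrouter", "google", "vertexai", "azure", "anthropic"];

function chatRoute(provider: string): string {
  if (!CHAT_PROVIDERS.includes(provider)) {
    throw new Error(`unsupported provider: ${provider}`);
  }
  return provider === "openai"
    ? "/v1/chat/completions"
    : `/v1/providers/${provider}/chat/completions`;
}
```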
Spendra also exposes governed tool-call routing:
curl https://api.spendra.example/v1/tools/web-search/call \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: tool-test-001" \
-d '{
"input": {
"query": "quarterly AI usage"
}
}'
Tool availability depends on the organization’s tool allowlist and the scoped key permissions.
Blocking behavior
Spendra blocks before the upstream provider call when:
- The key is invalid, inactive, expired, or revoked.
- The key scope does not allow the requested provider, model, tool, project, or actor.
- No explicit hard-cap policy permits the spend.
- An applicable hard cap cannot reserve enough budget.
Blocked requests create request and audit metadata, but no spend event or booked ledger entry.
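The checks above all run before any provider call. A sketch of that pre-flight sequence as a pure function; every type and reason string here is invented for illustration and does not reflect Spendra's real request model:

```typescript
// Sketch: pre-flight gate mirroring the documented block conditions,
// evaluated in order. Shapes and reason codes are illustrative.
interface GateInput {
  keyActive: boolean;          // false for invalid/inactive/expired/revoked keys
  scopeAllowsRequest: boolean; // provider, model, tool, project, and actor all in scope
  hardCapPolicyExists: boolean;
  capCanReserve: boolean;      // enough remaining budget to reserve
}

function preflight(g: GateInput): { allowed: boolean; reason?: string } {
  if (!g.keyActive) return { allowed: false, reason: "key_unusable" };
  if (!g.scopeAllowsRequest) return { allowed: false, reason: "scope_denied" };
  if (!g.hardCapPolicyExists) return { allowed: false, reason: "no_hard_cap_policy" };
  if (!g.capCanReserve) return { allowed: false, reason: "budget_exhausted" };
  return { allowed: true }; // only now does the upstream provider call happen
}
```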
Streaming
Streaming requests pass provider chunks through without buffering the full response. Spendra settles with exact usage when the provider returns authoritative usage. If a stream disconnects before usage is available, Spendra records estimated or missing usage confidence and can later reconcile it.
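The settlement rule for streams can be sketched as: take the provider's authoritative usage when the stream delivered it, otherwise fall back to an estimate recorded with lower confidence for later reconciliation. The shapes below are assumptions, not Spendra's internal model:

```typescript
// Sketch: decide settlement usage after a stream ends.
// Usage and confidence shapes are illustrative.
interface Usage { inputTokens: number; outputTokens: number }

function settleStream(
  providerUsage: Usage | null, // authoritative usage from the provider, if received
  estimated: Usage             // gateway-side estimate accumulated during the stream
): { usage: Usage; confidence: "exact" | "estimated" } {
  return providerUsage
    ? { usage: providerUsage, confidence: "exact" }
    : { usage: estimated, confidence: "estimated" }; // reconciled later if possible
}
```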