Spendra’s gateway is designed to feel like an OpenAI-compatible endpoint while enforcing Spendra policies before provider spend occurs. OpenAI-compatible Responses, Files, and Chat Completions routes proxy to OpenAI; provider-specific Chat Completions routes can proxy to additional upstream providers.
Authentication
Gateway requests authenticate with a Spendra scoped key:
Authorization: Bearer spk_live_<key_id>_<secret>
The secret is shown only once, at key creation. Store it in your client or agent secret manager, not in source control.
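Because the key follows the spk_live_<key_id>_<secret> shape, a client can log the key ID without ever logging the secret. A minimal sketch, assuming the key ID itself contains no underscore; the parseScopedKey helper is illustrative and not part of Spendra:

```typescript
// Illustrative helper: split a Spendra scoped key into its parts so the
// key ID can be logged safely while the secret stays out of logs.
// Assumes the key_id segment contains no underscore.
interface ScopedKey {
  env: string;    // environment tag, e.g. "live"
  keyId: string;  // safe to log
  secret: string; // never log or commit
}

function parseScopedKey(token: string): ScopedKey {
  const match = /^spk_([a-z]+)_([^_]+)_(.+)$/.exec(token);
  if (!match) throw new Error("not a Spendra scoped key");
  return { env: match[1], keyId: match[2], secret: match[3] };
}
```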
OpenAI SDK setup
Existing OpenAI integrations can keep the OpenAI SDK unchanged and swap only the API key and base URL:
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SPENDRA_API_KEY,
  baseURL: "https://api.spendra.example/v1",
});
Use the API hostname from your Spendra deployment. In production, prefer a stable internal or customer-owned domain instead of an ephemeral platform URL.
Provider-specific chat
Spendra supports provider-specific Chat Completions routing at:
POST /v1/providers/{provider}/chat/completions
Supported provider IDs for this route are openai, openrouter, google, vertexai, azure, and anthropic.
Example:
curl https://api.spendra.example/v1/providers/openrouter/chat/completions \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: openrouter-chat-001" \
-d '{
"model": "openai/gpt-4.1-mini",
"messages": [
{ "role": "user", "content": "Run a governed provider chat test." }
]
}'
Model scopes are provider-aware. OpenAI keys can use bare OpenAI model IDs such as gpt-4.1-mini; non-OpenAI chat keys must grant {provider}/{model}, such as openrouter/openai/gpt-4.1-mini, azure/gpt-4.1-mini, or vertexai/google/gemini-2.0-flash.
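One way to read the scope rule above: OpenAI grants are bare model IDs, while every other provider's grant is the model ID prefixed with the provider ID. A sketch of that matching as a standalone check; the scopeAllows function and the flat grant list are assumptions for illustration, not Spendra's implementation:

```typescript
// Sketch: provider-aware model scope check.
// OpenAI chat keys grant bare model IDs (e.g. "gpt-4.1-mini");
// other providers grant "{provider}/{model}" (e.g. "openrouter/openai/gpt-4.1-mini").
function scopeAllows(grants: string[], provider: string, model: string): boolean {
  const required = provider === "openai" ? model : `${provider}/${model}`;
  return grants.includes(required);
}
```

A key granted ["gpt-4.1-mini", "openrouter/openai/gpt-4.1-mini"] can call gpt-4.1-mini on OpenAI and openai/gpt-4.1-mini via OpenRouter, but not gpt-4.1-mini via Azure.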
Idempotency
Send an idempotency key with gateway requests when your client can retry:
idempotency-key: request-2026-05-06-0001
Spendra uses idempotency across request intake, reservation, settlement, spend event creation, ledger booking, and outbox processing. Replayed requests must never double-book ledger entries or double-increment budget counters.
If a retry arrives while the first request is still pending, Spendra returns HTTP 409 with idempotency_in_progress. If the original request already settled, Spendra returns HTTP 409 with settlement metadata and idempotency_replay_unavailable because prompt and response bodies are not retained for replay. Failed reservations are not replayed.
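A retrying client can branch on the two 409 outcomes described above: back off and retry while the first attempt is pending, and treat a settled replay as success without expecting the LLM body. A sketch of that branching; the classifyRetry function and action names are illustrative, only the HTTP status and error codes come from the docs:

```typescript
// Sketch: interpret a gateway response seen during a client retry.
// "idempotency_in_progress"        -> first attempt still pending; back off and retry.
// "idempotency_replay_unavailable" -> original already settled; treat as success,
//                                     but only settlement metadata is returned,
//                                     never the original prompt/response body.
type RetryAction = "wait_and_retry" | "settled_no_body" | "pass_through";

function classifyRetry(status: number, code?: string): RetryAction {
  if (status === 409 && code === "idempotency_in_progress") return "wait_and_retry";
  if (status === 409 && code === "idempotency_replay_unavailable") return "settled_no_body";
  return "pass_through"; // any other response is handled normally
}
```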
Responses API
curl https://api.spendra.example/v1/responses \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: setup-test-001" \
-d '{
"model": "gpt-4.1-mini",
"input": "Run a Spendra gateway setup test.",
"max_output_tokens": 120
}'
The gateway preserves the provider-native request and response shape for the original request where possible. Spendra records metadata needed for policy, reservation, settlement, audit, and ledgering without storing prompt or response bodies. A settled idempotency replay does not call the provider or book spend again, but it also does not replay the original LLM body.
curl https://api.spendra.example/v1/responses \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: image-test-001" \
-d '{
"model": "gpt-4.1-mini",
"input": [{
"role": "user",
"content": [
{ "type": "input_image", "image_url": "https://example.com/chart.png", "detail": "high" },
{ "type": "input_text", "text": "Summarize the chart." }
]
}]
}'
curl https://api.spendra.example/v1/responses \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: pdf-test-001" \
-d '{
"model": "gpt-4.1-mini",
"input": [{
"role": "user",
"content": [
{ "type": "input_file", "filename": "report.pdf", "file_data": "data:application/pdf;base64,..." },
{ "type": "input_text", "text": "List the key financial controls." }
]
}]
}'
Files API
Upload a file through Spendra when your client uses OpenAI file IDs:
curl https://api.spendra.example/v1/files \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-F purpose="user_data" \
-F file="@report.pdf"
Then reference the returned file ID in a Responses request:
curl https://api.spendra.example/v1/responses \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: file-id-test-001" \
-d '{
"model": "gpt-4.1-mini",
"input": [{
"role": "user",
"content": [
{ "type": "input_file", "file_id": "file_abc123" },
{ "type": "input_text", "text": "Summarize this document." }
]
}]
}'
Spendra stores file metadata only. Uploaded file bytes pass through to OpenAI and are not retained by Spendra.
Chat Completions API
curl https://api.spendra.example/v1/chat/completions \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: chat-test-001" \
-d '{
"model": "gpt-4.1-mini",
"messages": [
{ "role": "user", "content": "Run a Spendra chat gateway setup test." }
]
}'
The base /v1/chat/completions route proxies to OpenAI. Use /v1/providers/{provider}/chat/completions when routing chat requests to OpenRouter, Google Gemini API, Vertex AI, Azure OpenAI, or Anthropic.
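The route choice above can be expressed as a tiny helper; the chatRoute function is illustrative, the paths and provider IDs are from this page:

```typescript
// Sketch: pick the gateway chat path for a provider.
// The base route proxies to OpenAI; every other supported provider
// goes through the provider-specific route.
const CHAT_PROVIDERS = ["openai", "openrouter", "google", "vertexai", "azure", "anthropic"];

function chatRoute(provider: string): string {
  if (!CHAT_PROVIDERS.includes(provider)) {
    throw new Error(`unsupported provider: ${provider}`);
  }
  return provider === "openai"
    ? "/v1/chat/completions"
    : `/v1/providers/${provider}/chat/completions`;
}
```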
Spendra also exposes governed tool-call routing:
curl https://api.spendra.example/v1/tools/web-search/call \
-H "authorization: Bearer $SPENDRA_API_KEY" \
-H "content-type: application/json" \
-H "idempotency-key: tool-test-001" \
-d '{
"input": {
"query": "quarterly AI usage"
}
}'
Tool availability depends on the organization’s tool allowlist and the scoped key permissions.
Blocking behavior
Spendra blocks before the upstream provider call when:
- The key is invalid, inactive, expired, or revoked.
- The key scope does not allow the requested provider, model, tool, project, or actor.
- No explicit hard-cap policy permits the spend.
- An applicable hard cap cannot reserve enough budget.
Blocked requests create request and audit metadata, but no spend event or booked ledger entry.
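The checks above all run before any provider call. A sketch of that pre-flight sequence as a pure function; every type and reason string here is invented for illustration and does not reflect Spendra's real request model:

```typescript
// Sketch: pre-flight gate mirroring the documented block conditions,
// evaluated in order. Shapes and reason codes are illustrative.
interface GateInput {
  keyActive: boolean;          // false for invalid/inactive/expired/revoked keys
  scopeAllowsRequest: boolean; // provider, model, tool, project, and actor all in scope
  hardCapPolicyExists: boolean;
  capCanReserve: boolean;      // enough remaining budget to reserve
}

function preflight(g: GateInput): { allowed: boolean; reason?: string } {
  if (!g.keyActive) return { allowed: false, reason: "key_unusable" };
  if (!g.scopeAllowsRequest) return { allowed: false, reason: "scope_denied" };
  if (!g.hardCapPolicyExists) return { allowed: false, reason: "no_hard_cap_policy" };
  if (!g.capCanReserve) return { allowed: false, reason: "budget_exhausted" };
  return { allowed: true }; // only now does the upstream provider call happen
}
```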
Streaming
Streaming requests pass provider chunks through without buffering the full response. Spendra settles with exact usage when the provider returns authoritative usage. If a stream disconnects before usage is available, Spendra records estimated or missing usage confidence and can later reconcile it.
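The settlement rule for streams can be sketched as: take the provider's authoritative usage when the stream delivered it, otherwise fall back to an estimate recorded with lower confidence for later reconciliation. The shapes below are assumptions, not Spendra's internal model:

```typescript
// Sketch: decide settlement usage after a stream ends.
// Usage and confidence shapes are illustrative.
interface Usage { inputTokens: number; outputTokens: number }

function settleStream(
  providerUsage: Usage | null, // authoritative usage from the provider, if received
  estimated: Usage             // gateway-side estimate accumulated during the stream
): { usage: Usage; confidence: "exact" | "estimated" } {
  return providerUsage
    ? { usage: providerUsage, confidence: "exact" }
    : { usage: estimated, confidence: "estimated" }; // reconciled later if possible
}
```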