# UniGateway LLM + Agent Guide

UniGateway is an OpenAI-compatible unified AI gateway. Single endpoint, multiple providers/models.

Brand positioning: unified AI gateway, OpenAI-compatible multi-provider gateway.

Base API URL: https://api.unigateway.ai/v1
Docs Home: https://unigateway.ai/docs
Models: https://unigateway.ai/models
Pricing: https://unigateway.ai/pricing
LLM Full Context: https://unigateway.ai/llms-full.txt

## Published docs index

- [UniGateway Overview](https://unigateway.ai/docs/overview)
- [Quickstart](https://unigateway.ai/docs/quickstart)
- [Account & API Keys](https://unigateway.ai/docs/account-and-api-keys)
- [Usage Plans & Pricing](https://unigateway.ai/docs/usage-plans-and-pricing)
- [Authentication](https://unigateway.ai/docs/authentication)
- [OpenAI Chat Completions API](https://unigateway.ai/docs/chat-completions)
- [Model Selection and Fallback](https://unigateway.ai/docs/model-selection-and-fallback)
- [OpenAI Responses API](https://unigateway.ai/docs/openai-responses-api)
- [Anthropic Messages API](https://unigateway.ai/docs/anthropic-messages-api)
- [Gemini generateContent API](https://unigateway.ai/docs/gemini-generate-content)
- [Images Overview](https://unigateway.ai/docs/images-overview)
- [Image Generation and Editing](https://unigateway.ai/docs/image-generation)
- [OpenAI Images API](https://unigateway.ai/docs/openai-images-api)
- [Gemini Images API](https://unigateway.ai/docs/gemini-images-api)
- [Video Generation](https://unigateway.ai/docs/video-generation-landing)
- [Sora Video Generation](https://unigateway.ai/docs/sora-overview)
- [Seedance](https://unigateway.ai/docs/seedance-overview)
- [Create Task](https://unigateway.ai/docs/seedance-create-task)
- [Query Task](https://unigateway.ai/docs/seedance-task-query)
- [Asset Libraries](https://unigateway.ai/docs/seedance-asset-libraries)
- [OpenAI Embeddings API](https://unigateway.ai/docs/embeddings)
- [Audio](https://unigateway.ai/docs/audio)
- [Model Lookup](https://unigateway.ai/docs/models)
- [Error Handling and Retries](https://unigateway.ai/docs/error-handling-and-retries)
- [OpenAI SDK Integration](https://unigateway.ai/docs/openai-sdk)
- [Dify Integration](https://unigateway.ai/docs/dify)
- [OpenWebUI Integration](https://unigateway.ai/docs/openwebui)
- [Coding Tools and Multi-Step Workflows](https://unigateway.ai/docs/coding-tools-and-agents)
- [LobeChat Integration](https://unigateway.ai/docs/lobechat)
- [n8n Integration](https://unigateway.ai/docs/n8n)
- [LangChain Integration](https://unigateway.ai/docs/langchain)
- [Cherry Studio Integration](https://unigateway.ai/docs/cherry-studio)
- [Flowise Integration](https://unigateway.ai/docs/flowise)
- [Continue Integration](https://unigateway.ai/docs/continue)
- [Cline Integration](https://unigateway.ai/docs/cline)
- [Request Logs](https://unigateway.ai/docs/request-logs)
- [Cost Analytics](https://unigateway.ai/docs/cost-analytics)
- [Usage Analytics](https://unigateway.ai/docs/usage-analytics)
- [Model Pricing](https://unigateway.ai/docs/model-pricing)
- [Endpoint Compatibility Matrix](https://unigateway.ai/docs/endpoint-compatibility)
- [Error Codes](https://unigateway.ai/docs/error-codes-reference)

## LLM-friendly markdown endpoints

- Canonical docs URL: https://unigateway.ai/docs/{slug}
- Markdown mirror URL: https://unigateway.ai/docs/{slug}.md

## Agent maintenance protocol (JWT)

Use an Admin JWT for create/update/delete operations.

### 1) Login and get JWT

POST /api/auth/login
Content-Type: application/json

{ "email": "", "password": "" }

Read the token from the response field `token`, then send it in the header `Authorization: Bearer <token>`.

### 2) Read documentation

- GET /api/docs
- GET /api/docs/{slug}

### 3) Manage categories (Admin JWT required)

- GET /api/admin/doc-categories
- POST /api/admin/doc-categories
- PUT /api/admin/doc-categories/{id}
- DELETE /api/admin/doc-categories/{id}

### 4) Manage docs (Admin JWT required)

- POST /api/docs
- PUT /api/docs/{slug}
- DELETE /api/docs/{slug}

### 5) Update this llms.txt through API (Admin JWT required)

- GET /api/admin/llms-txt
- PUT /api/admin/llms-txt

PUT payload:

{ "content": "# UniGateway LLM + Agent Guide ..." }

## Authoring conventions for agents

1) Slug: lowercase letters, digits, and hyphens only (`^[a-z0-9-]+$`).
2) Keep both EN/ZH content synchronized.
3) Keep code examples runnable and OpenAI-compatible.
4) Prefer additive updates; avoid deleting existing docs unless explicitly requested. 5) Keep the first 160 chars concise for metadata extraction. --- # Full Documentation Content ## Getting Started # UniGateway Overview > Category: Getting Started | Last updated: 2026-05-15 Task-oriented entry point for UniGateway endpoints, model IDs, and first integration paths. # UniGateway Overview UniGateway is a unified AI gateway. API root: `https://api.unigateway.ai`. One integration point gives access to OpenAI, Anthropic, and Google model families. ## Start by Task | Goal | Use this endpoint | Start here | |---|---|---| | Chat or text generation | `POST /v1/chat/completions` | [Quickstart](./quickstart.md) | | Stateful agent workflow | `POST /v1/responses` | [OpenAI Responses API](./openai-responses-api.md) | | Claude-native request shape | `POST /v1/messages` | [Anthropic Messages API](./anthropic-messages-api.md) | | Gemini-native request shape | `POST /v1beta/models/{model}:generateContent` | [Gemini Text Chat](./gemini-generate-content.md) | | Image generation or editing | `POST /v1/images/generations` or Gemini `generateContent` | [Image Generation and Editing](./image-generation.md) | | Audio transcription or translation | `POST /v1/audio/transcriptions` or `/v1/audio/translations` | [Audio](./audio.md) | | Video generation | `https://video.unigateway.ai` or `/v1/videos` | [Video Generation](./video-generation.md) | | Third-party app integration | OpenAI-compatible base URL | [OpenAI SDK](./openai-sdk.md) | ## Model Families - OpenAI (GPT) - Anthropic (Claude) - Google (Gemini) ## Base URLs - API root: `https://api.unigateway.ai` - Compatible paths: `/v1`, `/v1beta` ## Core Endpoints - `GET /v1/models` — list available models - `POST /v1/chat/completions` — chat and text generation ## Supported Endpoints | Path | Purpose | |---|---| | `GET /v1/models` | list available models | | `POST /v1/chat/completions` | chat and text generation | | `POST /v1/responses` | 
enhanced conversation interface |
| `POST /v1/embeddings` | vector embeddings |
| `POST /v1/images/*` | image generation and editing |
| `POST /v1/audio/transcriptions` | audio transcription |
| `POST /v1/audio/translations` | audio translation |
| `POST /v1/messages` | Anthropic-compatible requests |
| `POST /v1beta/models/{model}:generateContent` | Gemini-compatible requests |

Parameter support varies by model family. Check capability against your target model.

## Model IDs

Use the exact model ID returned by `GET /v1/models`. Model library display names (such as Nano Banana Pro, Nano Banana 2) are product nicknames for readability, not request values. For example, Nano Banana names map to Gemini image model IDs such as `gemini-3-pro-image-preview`.

Example IDs (verify with a live query):

- `gpt-5.4`
- `claude-sonnet-4-6`
- `gemini-3-pro-preview`

See [Quickstart](./quickstart.md) for an integration walkthrough.

---

## Code examples

### curl

```curl
curl https://api.unigateway.ai/v1/chat/completions \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {"role": "user", "content": "Say hello from UniGateway."}
    ]
  }'
```

### python

```python
import os

from openai import OpenAI

# Read the key from the environment instead of hard-coding it.
client = OpenAI(
    api_key=os.environ["UNIGATEWAY_API_KEY"],
    base_url="https://api.unigateway.ai/v1",
)

resp = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Say hello from UniGateway."}],
)
print(resp.choices[0].message.content)
```

### typescript

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.UNIGATEWAY_API_KEY,
  baseURL: "https://api.unigateway.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Say hello from UniGateway."
}], }); console.log(resp.choices[0]?.message?.content); ``` # Quickstart > Category: Getting Started | Last updated: 2026-05-15 Create an API key, choose the right endpoint by task, and make the first successful request. # Quickstart From zero to your first working API response in four steps. ## Step 1 — Sign Up 1. Go to the [UniGateway login page](https://unigateway.ai/login) 2. Sign in with your email address After signing in you land on the dashboard where you can manage keys, view usage, and configure routing. ## Step 2 — Top Up Your Balance Go to **Settings → Billing** to top up your balance. See [Usage Plans & Pricing](./usage-plans-and-pricing.md) for details. ## Step 3 — Get an API Key 1. Open **Settings → API Keys** in the UniGateway console 2. Click **Create Key** 3. Copy the key immediately — it is shown only once ```bash export UNIGATEWAY_API_KEY="" ``` > Store API keys in environment variables or `.env` files. Do not commit them to version control. For key rotation and multi-key strategies, see [Account & API Keys](./account-and-api-keys.md). ## Step 4 — Make Your First Request UniGateway supports three API protocols. Pick the one you are most familiar with. If you already know what you want to build, use this shortcut table: | Goal | Endpoint | Recommended first model | |---|---|---| | Chat or text generation | `/v1/chat/completions` | `gpt-5.4` | | Claude-native messages | `/v1/messages` | `claude-sonnet-4-6` | | Gemini-native text | `/v1beta/models/gemini-3-pro-preview:generateContent` | `gemini-3-pro-preview` | | Image generation | `/v1/images/generations` | `gpt-image-2` | | Gemini image generation | `/v1beta/models/gemini-3-pro-image-preview:generateContent` | `gemini-3-pro-image-preview` | | Audio transcription | `/v1/audio/transcriptions` | `whisper-1` | | Audio translation | `/v1/audio/translations` | `whisper-1` | Always confirm the model ID with `GET /v1/models` before putting it in production. 
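
The advice to confirm model IDs before production can be sketched as a small startup check. This is a minimal Python sketch, not an official client: it assumes `GET /v1/models` returns the OpenAI-style list shape (`{"data": [{"id": ...}]}`), and the helper names `extract_model_ids`, `fetch_model_ids`, and `require_model` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.unigateway.ai/v1"


def extract_model_ids(payload: dict) -> set:
    """Collect `id` values from a model list.

    Assumption: the response uses the OpenAI-style shape
    {"object": "list", "data": [{"id": "..."}, ...]}.
    """
    return {item["id"] for item in payload.get("data", []) if "id" in item}


def fetch_model_ids(api_key: str) -> set:
    """Call GET /v1/models and return the set of model IDs."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return extract_model_ids(json.load(resp))


def require_model(model_id: str, available: set) -> str:
    """Return model_id unchanged, or raise if it is not in the live list."""
    if model_id not in available:
        raise ValueError(f"model {model_id!r} not returned by GET /v1/models")
    return model_id
```

At service startup you might call `require_model("gpt-5.4", fetch_model_ids(os.environ["UNIGATEWAY_API_KEY"]))` so that a display name or a retired ID fails fast instead of surfacing later as a `404`.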
### Protocol 1: OpenAI Chat Completions Base URL: `https://api.unigateway.ai/v1` ```bash curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "messages": [ {"role": "user", "content": "What is the meaning of life?"} ] }' ``` **Python (OpenAI SDK)** ```bash pip install openai ``` ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) completion = client.chat.completions.create( model="gpt-5.4", messages=[{"role": "user", "content": "What is the meaning of life?"}], ) print(completion.choices[0].message.content) ``` **TypeScript (OpenAI SDK)** ```bash npm install openai ``` ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const completion = await client.chat.completions.create({ model: "gpt-5.4", messages: [{ role: "user", content: "What is the meaning of life?" 
}], }); console.log(completion.choices[0].message.content); ``` ### Protocol 2: Anthropic Messages Base URL: `https://api.unigateway.ai/v1` ```bash curl https://api.unigateway.ai/v1/messages \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -H "Anthropic-Version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-6", "max_tokens": 1024, "messages": [ {"role": "user", "content": "What is the meaning of life?"} ] }' ``` **Python (Anthropic SDK)** ```bash pip install anthropic ``` ```python import anthropic client = anthropic.Anthropic( api_key="", base_url="https://api.unigateway.ai/v1", ) message = client.messages.create( model="claude-sonnet-4-6", max_tokens=1024, messages=[{"role": "user", "content": "What is the meaning of life?"}], ) print(message.content[0].text) ``` **TypeScript (Anthropic SDK)** ```bash npm install @anthropic-ai/sdk ``` ```typescript import Anthropic from "@anthropic-ai/sdk"; const client = new Anthropic({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const message = await client.messages.create({ model: "claude-sonnet-4-6", max_tokens: 1024, messages: [{ role: "user", content: "What is the meaning of life?" 
}],
});
console.log(message.content[0].text);
```

### Protocol 3: Google Gemini

Base URL: `https://api.unigateway.ai/v1beta`

```bash
curl https://api.unigateway.ai/v1beta/models/gemini-3-pro-preview:generateContent \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"parts": [{"text": "What is the meaning of life?"}]}
    ]
  }'
```

**Python (plain requests)**

```bash
pip install requests
```

```python
import os

import requests

# Read the key from the environment instead of hard-coding it.
api_key = os.environ["UNIGATEWAY_API_KEY"]

resp = requests.post(
    "https://api.unigateway.ai/v1beta/models/gemini-3-pro-preview:generateContent",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "contents": [
            {"parts": [{"text": "What is the meaning of life?"}]}
        ]
    },
)
resp.raise_for_status()
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```

**TypeScript (fetch)**

```typescript
const resp = await fetch(
  "https://api.unigateway.ai/v1beta/models/gemini-3-pro-preview:generateContent",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.UNIGATEWAY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: "What is the meaning of life?" }] }],
    }),
  },
);

if (!resp.ok) {
  throw new Error(await resp.text());
}

const data = await resp.json();
console.log(data.candidates[0].content.parts[0].text);
```

## Find Model IDs

Every model on UniGateway has a unique ID. You can browse available models in the console or query the API:

```bash
curl https://api.unigateway.ai/v1/models \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

Pick a model ID from the response. Example IDs (verify with a live query):

| Family | Example ID |
|---|---|
| OpenAI | `gpt-5.4` |
| Anthropic | `claude-sonnet-4-6` |
| Google | `gemini-3-pro-preview` |

Use the exact model ID returned by `GET /v1/models`. Model library display names are not always requestable model IDs.
Use the `id` field from the API response in your requests. ## Next Steps - [Authentication](./authentication.md) — header format, key management, troubleshooting - [Model Selection and Fallback](./model-selection-and-fallback.md) — build production fallback chains - [Streaming](./streaming.md) — enable incremental output with SSE - [Audio](./audio.md) — speech transcription and translation - [Error Codes](./error-codes.md) — complete error reference with resolution steps --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "messages": [ {"role": "system", "content": "You are a concise assistant."}, {"role": "user", "content": "Write a 1-line product tagline for UniGateway."} ], "temperature": 0.3 }' ``` ### python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) resp = client.chat.completions.create( model="gpt-5.4", messages=[ {"role": "system", "content": "You are a concise assistant."}, {"role": "user", "content": "Write a 1-line product tagline for UniGateway."}, ], temperature=0.3, ) print(resp.choices[0].message.content) ``` ### typescript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const resp = await client.chat.completions.create({ model: "gpt-5.4", messages: [ { role: "system", content: "You are a concise assistant." }, { role: "user", content: "Write a 1-line product tagline for UniGateway." } ], temperature: 0.3, }); console.log(resp.choices[0]?.message?.content); ``` # Account & API Keys > Category: Getting Started | Last updated: 2026-05-15 Sign-in methods, key creation, key types, security practices, rotation, and multi-key strategy. # Account & API Keys Manage your UniGateway account, create API keys, and keep them secure. 
## Sign In UniGateway supports email-based sign-in. Go to the [UniGateway login page](https://unigateway.ai/login) to get started. ## Create an API Key You need at least one API key to call any UniGateway endpoint. 1. Sign in to the [UniGateway console](https://unigateway.ai/dashboard) 2. Navigate to **Settings → API Keys** 3. Click **Create Key** 4. Enter a descriptive name (e.g., `production-chat`, `staging-rag`) 5. Select the key type: - **Standard key** — for API calls (chat, embeddings, images, etc.) - **Management key** — for Platform API calls (balance, usage, statistics) 6. Copy the key immediately — the full value is shown only once > If you lose a key, delete it and create a new one. There is no way to recover a key after the initial display. ## Verify a Key ```bash curl https://api.unigateway.ai/v1/models \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` A successful model list response means the key is valid and active. ## Key Types | Type | Purpose | Header | |---|---|---| | Standard | Call AI model endpoints (chat, embeddings, images, audio, video) | `Authorization: Bearer ` | | Management | Query account balance and usage statistics | `Authorization: Bearer ` | Management keys have separate permissions and should be stored with stricter access controls. ## Key Security ### Do - Store keys in server-side environment variables or a secrets manager - Use different keys for each environment (staging, production) - Use different keys for each application or service - Rotate keys on a regular schedule - Delete compromised keys immediately ### Do Not - Commit keys to source control (Git, Mercurial, etc.) - Embed keys in client-side JavaScript or mobile apps - Share keys between team members — create one per person - Post keys in chat, email, or issue trackers - Reuse a key after suspected exposure ## Key Rotation When you need to rotate a key without downtime: 1. Create a new key in the console 2. Deploy the new key to your application 3. 
Verify that requests succeed with the new key 4. Delete the old key The old key remains active until you delete it, so there is no service interruption during the switch. ## Multi-Key Strategy | Strategy | When to Use | |---|---| | One key per environment | Separate staging and production traffic and billing | | One key per service | Isolate usage and errors per microservice | | One key per team member | Audit individual API usage | | Management key for ops only | Keep billing and stats access separate from model calls | ## Account Settings ### Email and Password Update your email or password in **Settings → Profile**. ### Two-Factor Authentication Enable 2FA in **Settings → Security** for additional account protection. ### Notification Preferences Configure billing alerts and usage thresholds in **Settings → Notifications**. ## Common Issues | Symptom | Cause | Resolution | |---|---|---| | `401 Unauthorized` on every request | Key missing, invalid, or revoked | Verify the key in the console; recreate if necessary | | `403 Forbidden` with a valid key | Key lacks permission for the endpoint | Check key type — management endpoints require a management key | | Key works locally but not in CI | Environment variable not set in CI | Add the key to CI secrets, not to the build script | | Requests hit the wrong provider | `base_url` not set correctly | Confirm `base_url` / `baseURL` is `https://api.unigateway.ai/v1` | --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "messages": [{"role": "user", "content": "Hello"}] }' ``` ### python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) resp = client.chat.completions.create( model="gpt-5.4", messages=[{"role": "user", "content": "Hello"}], ) ``` ### typescript ```typescript import OpenAI from "openai"; const client = 
new OpenAI({
  apiKey: process.env.UNIGATEWAY_API_KEY,
  baseURL: "https://api.unigateway.ai/v1",
});
```

# Usage Plans & Pricing

> Category: Getting Started | Last updated: 2026-05-15

Pay As You Go vs Subscription, plan comparison, billing, quotas, top-up, invoices, and FAQ.

# Usage Plans & Pricing

Understand how billing works for the Pay As You Go model.

## Pay As You Go

### How It Works

1. Top up your balance in **Settings → Billing**
2. API calls consume tokens and deduct from your balance in real time
3. Each model has a per-token price; see the [Models page](https://unigateway.ai/models) for current rates
4. When your balance reaches zero, API calls return `402`

### Advantages

- No rate limits: handle any concurrency level
- Per-token billing: pay only for what you use
- Access to all models on the platform
- Suitable for production and commercial workloads

### Top Up

- Supported payment methods: credit card (Stripe), Alipay
- Minimum top-up amount: $5
- Balance does not expire

### Monitor Balance

Check your balance via the console or the Platform API. This endpoint requires a management key, not a standard key:

```bash
# UNIGATEWAY_API_KEY must hold a management key for this call.
curl https://api.unigateway.ai/v1/management/balance \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

## Billing

### Token Pricing

Each model has a per-million-token rate, priced separately for input and output tokens. Visit the [Models page](https://unigateway.ai/models) for live pricing.

### Invoice Download

Go to **Settings → Billing → Invoices** to download past invoices.

### Billing Alerts

Configure low-balance alerts in **Settings → Notifications**. Recommended thresholds:

| Alert | Threshold |
|---|---|
| Low balance warning | 20% of last top-up |
| Critical balance warning | 5% of last top-up |

## FAQ

**Q: What happens when my Pay As You Go balance hits zero?**
A: API calls return `402 insufficient_credit`. Top up to resume service.

**Q: Are there free models?**
A: Some models offer a free tier with rate limits. Check the [Models page](https://unigateway.ai/models) for details.
**Q: How do I check my current usage?**
A: Use the [Observability](./observability/usage-analytics.md) section in the console, or query the Platform API.

# Authentication

> Category: Getting Started | Last updated: 2026-05-15

How to send API keys, set headers, and avoid common authentication mistakes.

# Authentication

All API requests require an API key in the `Authorization` header.

## Auth Header

```http
Authorization: Bearer $UNIGATEWAY_API_KEY
```

## Verify

```bash
curl https://api.unigateway.ai/v1/models \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

A model list response means authentication is working.

> Keep API keys in server-side environment variables. Do not expose them in front-end code or public repositories.

## JSON POST Requests

POST requests also require the `Content-Type` header:

```bash
curl https://api.unigateway.ai/v1/chat/completions \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

## Common Errors

| Status | Cause | Resolution |
|---|---|---|
| `401` | Key missing, invalid, or expired | Check `Authorization` header format |
| `403` | Key exists but lacks permission | Check account permissions |
| `429` | Rate limit or quota exceeded | Add backoff and retry logic |

## Troubleshooting

1. The `Authorization` header value must be `Bearer`, a single space, then your API key
2. Base URL must be `https://api.unigateway.ai/v1`
3. 
`model` value must come from `GET /v1/models` ## Key Management - Use separate keys for test and production environments - Rotate keys if leakage is suspected --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/models \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` ### python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) ``` ### typescript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); ``` ## Text Generation # Chat Completions > Category: Text Generation | Last updated: 2026-05-15 Send text prompts through UniGateway's unified chat completions API across mainstream model families. # Chat Completions Text generation and multi-turn conversation through the OpenAI-compatible chat completions API. ## Endpoint | Item | Value | |---|---| | Method | `POST` | | Path | `/v1/chat/completions` | | Base URL | `https://api.unigateway.ai/v1` | | Auth | `Authorization: Bearer $UNIGATEWAY_API_KEY` | | Content-Type | `application/json` | ## Minimal Request ```bash curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "messages": [ {"role": "user", "content": "Explain AI gateway benefits in 3 bullets."} ] }' ``` ## Python (OpenAI SDK) ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) resp = client.chat.completions.create( model="gpt-5.4", messages=[{"role": "user", "content": "Summarize in 3 bullets."}], ) print(resp.choices[0].message.content) ``` ## TypeScript (OpenAI SDK) ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const resp = await client.chat.completions.create({ model: "gpt-5.4", messages: [{ role: "user", 
content: "Summarize in 3 bullets." }], }); console.log(resp.choices[0].message.content); ``` ## Parameters | Field | Type | Required | Description | |---|---|---|---| | `model` | string | Yes | Exact model ID from `GET /v1/models` | | `messages` | array | Yes | Chat message array; each element has `role` and `content` | | `temperature` | number | No | Creativity control; range varies by model | | `max_tokens` | number | No | Upper bound of generated tokens | | `stream` | boolean | No | Set `true` to enable SSE streaming | | `top_p` | number | No | Nucleus sampling parameter | | `frequency_penalty` | number | No | Reduce word repetition | | `presence_penalty` | number | No | Encourage topic diversity | | `stop` | string/array | No | Stop sequence(s) | | `user` | string | No | End-user identifier for abuse monitoring | ## Multi-Model Examples ### GPT ```bash curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "messages": [ {"role": "system", "content": "You are a concise assistant."}, {"role": "user", "content": "Write a migration note from single-provider to UniGateway."} ], "temperature": 0.3 }' ``` ### Claude (via OpenAI-compatible) ```bash curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "claude-sonnet-4-6", "messages": [ {"role": "user", "content": "Explain quantum computing in one sentence."} ], "max_tokens": 200 }' ``` ### Gemini (via OpenAI-compatible) ```bash curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gemini-3-pro-preview", "messages": [ {"role": "user", "content": "Explain AI gateway routing in 3 bullets."} ] }' ``` > For Claude-native features like `thinking` blocks and `cache_control`, use `POST /v1/messages`. 
For Gemini-native features like `responseModalities` and `imageConfig`, use `POST /v1beta/models/{model}:generateContent`. ## Response ```json { "id": "chatcmpl-xxx", "object": "chat.completion", "created": 1760000000, "model": "gpt-5.4", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "An AI gateway provides unified model access, simplifies billing, and enables intelligent routing." }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 24, "completion_tokens": 18, "total_tokens": 42 } } ``` | Response field | Description | |---|---| | `choices[].message.content` | Generated text | | `choices[].finish_reason` | `stop`, `length`, etc. | | `model` | Actual model that handled the request | | `usage` | Token consumption | ## Errors | Status | Cause | Resolution | |---|---|---| | `400` | Invalid request body | Check JSON structure and parameter types | | `401` | Invalid or missing API key | Verify `Authorization` header | | `404` | Model ID not found or not in plan | Re-check `GET /v1/models` | | `429` | Rate limit exceeded | Add backoff and retry | | `5xx` | Server or upstream error | Retry with exponential backoff; switch model if persistent | --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "messages": [ {"role": "user", "content": "Summarize the benefits of a unified AI gateway in 3 bullets."} ], "temperature": 0.2 }' ``` ### python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) resp = client.chat.completions.create( model="gpt-5.4", messages=[{"role": "user", "content": "Summarize the benefits of a unified AI gateway in 3 bullets."}], temperature=0.2, ) print(resp.choices[0].message.content) ``` ### typescript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, 
baseURL: "https://api.unigateway.ai/v1", }); const resp = await client.chat.completions.create({ model: "gpt-5.4", messages: [ { role: "user", content: "Summarize the benefits of a unified AI gateway in 3 bullets." } ], temperature: 0.2, }); console.log(resp.choices[0]?.message?.content); ``` # Model Selection and Fallback > Category: Text Generation | Last updated: 2026-05-15 How to pick model IDs and define fallback chains across OpenAI, Claude, and Gemini. # Model Selection and Fallback Run stable production traffic across model families. ## Discover First, Then Pin Read live model IDs from `GET /v1/models`, then pin IDs by use case. Model availability is account-specific; external references may differ. ## Build a Fallback Chain Example text-generation chain: 1. `gpt-5.4` 2. `claude-sonnet-4-6` 3. `gemini-3-pro-preview` Your actual chain should come from the live model list and your latency/cost targets. ## Keep Request Shape Conservative Start with the common request shape: ```json { "model": "gpt-5.4", "messages": [ { "role": "user", "content": "Summarize this in 3 bullets." } ], "temperature": 0.2 } ``` Avoid provider-specific optional fields unless you have endpoint-level validation for each fallback target. ## Routing Policy - Retry same model on transient failures (`429`, `5xx`) with exponential backoff - Switch to next model in chain after retry budget is exhausted - Log `model`, `request_id`, latency, and token usage per attempt ## Model Lifecycle Model states: `AVAILABLE`, `PREVIEW`, `DEPRECATED`, `SUNSET`, `UNAVAILABLE`. - Avoid adding new traffic to `DEPRECATED` or `SUNSET` models - Keep replacement mappings in configuration, not in application code ## Backoff Policy | Parameter | Value | |---|---| | Initial delay | `300ms` | | Multiplier | `2x` | | Max delay | `8s` | | Max attempts per model | `3` | After max attempts, route to the next fallback model. 
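
The routing and backoff rules above can be sketched as a small retry loop. This is a minimal Python illustration under stated assumptions, not UniGateway's own implementation: `TransientError`, `backoff_delays`, and `call_with_fallback` are hypothetical names, and `send` is a caller-supplied function that performs the request and raises `TransientError` on `429` or `5xx`. The defaults mirror the table (300 ms initial delay, 2x multiplier, 8 s cap, 3 attempts per model).

```python
import time
from typing import Callable, List, Sequence, Tuple


class TransientError(Exception):
    """Hypothetical wrapper for retryable failures (HTTP 429 or 5xx)."""

    def __init__(self, status: int):
        super().__init__(f"transient upstream error: {status}")
        self.status = status


def backoff_delays(initial: float = 0.3, multiplier: float = 2.0,
                   max_delay: float = 8.0, attempts: int = 3) -> List[float]:
    """Delays in seconds to wait after each failed attempt except the last."""
    return [min(initial * multiplier ** i, max_delay) for i in range(attempts - 1)]


def call_with_fallback(models: Sequence[str],
                       send: Callable[[str], str],
                       sleep: Callable[[float], None] = time.sleep,
                       attempts: int = 3) -> Tuple[str, str]:
    """Retry each model up to `attempts` times, then move down the chain.

    Returns (model_used, result). Raises RuntimeError once every model
    in the chain has exhausted its retry budget.
    """
    delays = backoff_delays(attempts=attempts)
    last_error = None
    for model in models:
        for attempt in range(attempts):
            try:
                return model, send(model)
            except TransientError as err:
                last_error = err
                # Wait before retrying the same model; skip the wait
                # when the retry budget for this model is spent.
                if attempt < attempts - 1:
                    sleep(delays[attempt])
    raise RuntimeError("all fallback models exhausted") from last_error
```

Injecting `send` and `sleep` keeps the loop independent of any SDK and makes it testable without network calls; in practice `send` would wrap a `chat.completions.create` call and log `model`, `request_id`, latency, and token usage per attempt. Non-transient errors such as `400` or `401` are deliberately not caught, since retrying them does not help.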
--- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/chat/completions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{"model":"gpt-5.4","messages":[{"role":"user","content":"hello"}]}' ``` ### python ```python from openai import OpenAI client = OpenAI(api_key="", base_url="https://api.unigateway.ai/v1") resp = client.chat.completions.create(model="claude-sonnet-4-6", messages=[{"role":"user","content":"hello"}]) print(resp.choices[0].message.content) ``` ### typescript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1" }); const resp = await client.chat.completions.create({ model: "gemini-3-pro-preview", messages: [{ role: "user", content: "hello" }] }); console.log(resp.choices[0]?.message?.content); ``` # OpenAI Responses API > Category: Text Generation | Last updated: 2026-05-15 Use the OpenAI Responses endpoint through UniGateway for stateful agent interactions with built-in tools. # OpenAI Responses API via UniGateway Use OpenAI's stateful Responses API through UniGateway for agent-oriented workflows, built-in tools, and server-side conversation state. ## Overview The **Responses API** is OpenAI's next-generation interface for agent-oriented interaction with built-in tools, structured output, and server-side state management. 
Key capabilities include: - **Stateful conversations** via `previous_response_id` — server retains history - **Built-in tools** — web search, code interpreter, computer use, file search - **Reasoning tokens** — model outputs reasoning before final response - **Structured output** — guaranteed schema-compliant JSON UniGateway exposes this at: - `POST /v1/responses` — create a response - `GET /v1/responses/{id}` — retrieve a response - `POST /v1/responses/{id}/input_items` — append to a conversation ## Authentication All requests require your UniGateway API key: ```http Authorization: Bearer ``` ## Supported Models Available OpenAI models on UniGateway that support the Responses API: | Model ID | Description | Input / 1M | Output / 1M | |----------|-------------|------------|-------------| | `gpt-5.4` | General purpose, balanced | $2.50 | $15.00 | | `gpt-5.4-pro` | Frontier reasoning & agentic | $30.00 | $180.00 | | `gpt-5.3-codex` | Code & reasoning specialist | $1.75 | $14.00 | Query the live model list: ```bash curl https://api.unigateway.ai/v1/models \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` ## Create a Response ### Basic Request ```bash curl https://api.unigateway.ai/v1/responses \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "input": "Explain quantum computing in one paragraph." 
}' ``` ### With Conversation History ```bash curl https://api.unigateway.ai/v1/responses \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "input": [ {"role": "user", "content": [{"type": "text", "text": "What is 2+2?"}]}, {"role": "assistant", "content": [{"type": "text", "text": "4"}]}, {"role": "user", "content": [{"type": "text", "text": "Multiply that by 10."}]} ] }' ``` ### With State (Stateful Conversations) ```bash curl https://api.unigateway.ai/v1/responses \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "input": "Multiply that by 10.", "previous_response_id": "resp_abc123" }' ``` When `previous_response_id` is set, the server manages the conversation history automatically. If omitted, state is client-managed. ## Request Parameters | Parameter | Type | Required | Description | |---|---|---|---| | `model` | string | Yes | Model ID from `GET /v1/models` | | `input` | string / array | Yes | User message text or array of message items | | `previous_response_id` | string | No | Server-managed conversation state | | `store` | boolean | No | Save response server-side; default `true` | | `tools` | array | No | Built-in tools to enable | | `tool_choice` | string/object | No | `"auto"`, `"required"`, `"none"`, or specific tool | | `instructions` | string | No | System-level instruction | | `temperature` | number | No | Sampling temperature, e.g. 
`0.2` |
| `max_tokens` | number | No | Upper bound for output tokens |
| `top_p` | number | No | Nucleus sampling |
| `response_format` | object | No | Structured output schema |
| `reasoning` | object | No | `reasoning_effort` control |
| `parallel_tool_calls` | boolean | No | Allow multiple tool calls per turn |
| `metadata` | object | No | Custom key-value pairs |

### Input Array Format

Each item in the `input` array has a `role` (`user`, `assistant`, `system`, or `developer`) and a `content` array:

```json
{
  "role": "user",
  "content": [
    {"type": "text", "text": "..."},
    {"type": "image_url", "image_url": {"url": "..."}}
  ]
}
```

## Response Format

```json
{
  "id": "resp_abc123",
  "object": "response",
  "status": "completed",
  "model": "gpt-5.4",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {"type": "text", "text": "Quantum computing harnesses..."}
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 89,
    "total_tokens": 101
  }
}
```

### Response Fields

| Field | Description |
|---|---|
| `id` | Response ID; use as `previous_response_id` for the next turn |
| `status` | `"completed"`, `"in_progress"`, `"failed"`, `"cancelled"` |
| `output` | Array of output items (text, tool calls, reasoning) |
| `output[].type` | `"message"`, `"reasoning"`, `"tool_call"` |

### Output Item Types

| Type | Description |
|---|---|
| `message` | Assistant message with `role` and `content` array |
| `reasoning` | Model reasoning tokens (not shown to end user) |
| `tool_call` | Tool call request with `call_id`, `type`, `arguments` |

## Built-in Tools

Enable a tool by adding it to the `tools` array.

### Web Search

```json
{
  "tools": [
    {"type": "web_search_preview"}
  ]
}
```

### Code Interpreter

```json
{
  "tools": [
    {"type": "code_interpreter"}
  ]
}
```

### Computer Use

```json
{
  "tools": [
    {
      "type": "computer_use_preview",
      "environment": "browser"
    }
  ]
}
```

### File Search

```json
{
  "tools": [
    {"type": "file_search"}
  ]
}
```

> **Note:** Built-in tool availability varies by model. Verify with a test request.
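With built-in tools enabled, `output` can contain `tool_call` and `reasoning` items alongside `message` items, so reading `output[0]` directly is not always safe. The sketch below assumes the response has already been parsed into a plain `dict` with the shape shown under Response Format. `extract_text` is a hypothetical helper name, not part of any SDK, and accepting the `output_text` part type in addition to this guide's `text` is an extra precaution of the sketch.

```python
from typing import Any, Dict, List


def extract_text(response: Dict[str, Any]) -> str:
    """Join the text parts of every `message` item in `output`."""
    chunks: List[str] = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning and tool_call items
        for part in item.get("content", []):
            # This guide shows "text"; "output_text" is tolerated as well.
            if part.get("type") in ("text", "output_text"):
                chunks.append(part.get("text", ""))
    return "".join(chunks)
```

A caller would pass the decoded JSON body of `POST /v1/responses` and receive the concatenated assistant text, or an empty string when the turn produced only tool calls.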
## Tool Calling Flow When a tool is called, the response contains `output` items with `type: "tool_call"`: ```json { "output": [ { "type": "tool_call", "call_id": "call_xyz", "type": "web_search_preview", "arguments": "{\"query\":\"latest OpenAI model release 2026\"}" } ] } ``` Send the tool result back: ```bash curl https://api.unigateway.ai/v1/responses \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "previous_response_id": "resp_abc123", "input": [ { "type": "tool_result", "call_id": "call_xyz", "output": "OpenAI announced GPT-5.4 Pro on April 15, 2026..." } ] }' ``` ## Streaming Set `stream: true` for Server-Sent Events (SSE). ```bash curl https://api.unigateway.ai/v1/responses \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -N \ -d '{ "model": "gpt-5.4", "stream": true, "input": "Write a haiku about AI." }' ``` SSE events: - `response.created` — response object created - `response.in_progress` — generation started - `response.output_item.added` — new output item - `response.output_item.delta` — incremental content - `response.completed` — full response ready ## Structured Output Guaranteed schema-compliant JSON: ```bash curl https://api.unigateway.ai/v1/responses \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.4", "input": "Give me details about a fictional AI startup.", "response_format": { "type": "json_schema", "json_schema": { "name": "startup", "schema": { "type": "object", "properties": { "name": {"type": "string"}, "founded": {"type": "integer"}, "valuation": {"type": "number"} }, "required": ["name", "founded", "valuation"] } } } }' ``` ## Reasoning Control Control reasoning depth (available on reasoning-capable models): ```json { "model": "gpt-5.4-pro", "input": "Solve x² + 5x + 6 = 0", "reasoning": { "reasoning_effort": "high" } } ``` | `reasoning_effort` | Behavior | |---|---| | 
`low` | Minimal reasoning, faster | | `medium` | Balanced | | `high` | Deep reasoning, more tokens | ## Error Handling | Status | Meaning | Action | |---|---|---| | 400 | Invalid request body | Fix JSON schema | | 401 | Authentication failed | Check API key | | 404 | Model not found | Verify from `GET /v1/models` | | 429 | Rate limited | Exponential backoff, then fallback | | 5xx | Server error | Retry with backoff, then switch model | ## SDK Usage ### Python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) # Stateless resp = client.responses.create( model="gpt-5.4", input="What is the capital of France?" ) print(resp.output[0].content[0].text) # Stateful follow-up follow_up = client.responses.create( model="gpt-5.4", input="What is its population?", previous_response_id=resp.id ) print(follow_up.output[0].content[0].text) ``` ### TypeScript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const resp = await client.responses.create({ model: "gpt-5.4", input: "What is the capital of France?", }); console.log(resp.output[0]?.content[0]?.text); // Stateful const followUp = await client.responses.create({ model: "gpt-5.4", input: "What is its population?", previous_response_id: resp.id, }); console.log(followUp.output[0]?.content[0]?.text); ``` ## Migration from Chat Completions | Chat Completions | Responses API | |---|---| | `messages` array | `input` string or array | | `system` role | `instructions` top-level field | | No server state | `previous_response_id` for stateful chains | | `function_call` in choices | `tool_call` in `output` array | | `choices[0].message.content` | `output[0].content[0].text` | > **Recommendation:** Use Chat Completions for simple, stateless text generation. 
Use Responses API for agentic workflows, built-in tools, and stateful multi-turn conversations that benefit from server-side history management. --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/responses \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-5.2", "input": "What is 15 * 27?", "tools": [{"type": "code_interpreter"}] }' ``` ### python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) resp = client.responses.create( model="gpt-5.2", input="What is 15 * 27?", tools=[{"type": "code_interpreter"}], ) print(resp.output_text) ``` ### typescript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const resp = await client.responses.create({ model: "gpt-5.2", input: "What is 15 * 27?", tools: [{ type: "code_interpreter" }], }); console.log(resp.output_text); ``` # Anthropic Messages API > Category: Text Generation | Last updated: 2026-05-15 Use the Anthropic Messages endpoint through UniGateway for Claude-specific features like extended thinking and prompt caching. # Anthropic Messages API Access Claude's native extended capabilities through UniGateway at the Anthropic Messages API endpoint. ## Prerequisites - A UniGateway API key exported as `UNIGATEWAY_API_KEY` - Target Claude model ID confirmed available via `GET /v1/models` ## Overview - Endpoint: `POST /v1/messages` - Base URL: `https://api.unigateway.ai/v1` - Headers: - `Authorization: Bearer ` - `Content-Type: application/json` - `anthropic-version: 2023-06-01` UniGateway proxies the Anthropic Messages protocol. 
Claude-specific capabilities exposed through this endpoint include: - `thinking` extended reasoning blocks - `cache_control` prompt caching - Granular `stop_reason` values - PDF, image, citation, and tool use support ## Request ### Basic request ```bash curl https://api.unigateway.ai/v1/messages \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-6", "max_tokens": 1024, "messages": [ {"role": "user", "content": "Explain quantum computing in one paragraph."} ] }' ``` ### Parameters | Field | Type | Required | Description | |---|---|---|---| | `model` | string | Yes | Exact model ID from `GET /v1/models` | | `messages` | array | Yes | Message objects with `role` and `content` | | `max_tokens` | number | Yes | Maximum tokens to generate | | `system` | string / array | No | System prompt; array form supports `cache_control` | | `tools` | array | No | Available tool definitions | | `tool_choice` | object / string | No | `auto`, `any`, `none`, or a specific tool | | `thinking` | object | No | Extended reasoning configuration | | `temperature` | number | No | Sampling temperature, 0–1 | | `top_p` | number | No | Nucleus sampling | | `top_k` | number | No | Top-K sampling | | `stop_sequences` | array | No | Array of stop strings | | `stream` | boolean | No | Enable SSE streaming | | `metadata` | object | No | Custom metadata | ### Message formats Single text message: ```json { "role": "user", "content": "Explain quantum computing." } ``` Multimodal content as an array: ```json { "role": "user", "content": [ {"type": "text", "text": "Describe this image."}, { "type": "image", "source": { "type": "base64", "media_type": "image/jpeg", "data": "/9j/4AAQSkZJRg..." 
} } ] } ``` Supported source types for images and documents: | Type | Description | |---|---| | `base64` | Base64-encoded data with `media_type` | | `url` | Publicly accessible image URL | PDF input uses the `document` type: ```json { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": "JVBERi0xLjQK..." } } ``` ### Extended reasoning (thinking) Enable Claude's extended reasoning: ```json { "model": "claude-sonnet-4-6", "max_tokens": 4096, "thinking": { "type": "enabled", "budget_tokens": 2048 }, "messages": [ {"role": "user", "content": "Prove that sqrt(2) is irrational."} ] } ``` | Field | Type | Description | |---|---|---| | `type` | string | `enabled` or `adaptive` | | `budget_tokens` | number | Token budget allocated to reasoning | When `thinking` is enabled, the response includes a `thinking` content block. `max_tokens` must be greater than `budget_tokens`. ### Prompt caching Add `cache_control` to messages or tool definitions to reduce repeated input costs: ```json { "role": "user", "content": [ { "type": "text", "text": "Long document or system prompt content...", "cache_control": {"type": "ephemeral"} } ] } ``` `cache_control.type` currently supports only `ephemeral`. Cache hits are reported in response `usage` as `cache_read_input_tokens` and `cache_creation_input_tokens`. 
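The two constraints described above (a `thinking` budget that must stay below `max_tokens`, and `cache_control` blocks of type `ephemeral`) can be enforced when the request body is built. This is a minimal sketch: `build_thinking_request` and `cached_text_block` are hypothetical helper names, and the functions only construct dictionaries without sending anything.

```python
from typing import Any, Dict


def build_thinking_request(
    model: str, prompt: str, max_tokens: int, budget_tokens: int
) -> Dict[str, Any]:
    """Build a Messages request body with extended reasoning enabled."""
    if max_tokens <= budget_tokens:
        # Mirrors the rule stated above: max_tokens must exceed budget_tokens.
        raise ValueError("max_tokens must be greater than budget_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


def cached_text_block(text: str) -> Dict[str, Any]:
    """Wrap text in a content block marked for ephemeral prompt caching."""
    return {"type": "text", "text": text, "cache_control": {"type": "ephemeral"}}
```

Validating the budget locally turns what would otherwise be a `400` from the endpoint into an immediate, descriptive error.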
## Tool use Define tools in the `tools` array: ```json { "tools": [ { "name": "get_weather", "description": "Get current weather for a location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "City name" } }, "required": ["location"] } } ] } ``` Tool choice strategies: ```json { "tool_choice": {"type": "auto"} } ``` | Strategy | Behavior | |---|---| | `auto` | Model decides whether to call a tool | | `any` | Model must call at least one tool | | `none` | Do not call tools | | `tool` | Require a specific tool by name | When returning tool results, append a `tool_result` block with `role: user`: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01T1x1fJ34qAmk2tNTrN7Up6", "content": "22°C, sunny" } ] } ``` ## Response ```json { "id": "msg_01XgYfV9UTi2PJN", "type": "message", "role": "assistant", "model": "claude-sonnet-4-6", "content": [ { "type": "text", "text": "Quantum computing leverages superposition..." 
} ], "stop_reason": "end_turn", "usage": { "input_tokens": 14, "output_tokens": 128 } } ``` ### Response fields | Field | Description | |---|---| | `id` | Message ID | | `type` | Always `message` | | `role` | Always `assistant` | | `content` | Array of content blocks (`text`, `thinking`, `tool_use`) | | `stop_reason` | `end_turn`, `max_tokens`, `stop_sequence`, `tool_use` | | `usage.input_tokens` | Tokens consumed by the prompt | | `usage.output_tokens` | Tokens generated by the model | ## Streaming Enable SSE by setting `stream: true`: ```bash curl https://api.unigateway.ai/v1/messages \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-6", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello"}], "stream": true }' ``` SSE event types include `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, and `message_stop`. ## Errors | Status | Cause | Resolution | |---|---|---| | `400` | Invalid request body | Check JSON structure and parameter types | | `401` | Invalid or missing API key | Verify `Authorization` header | | `404` | Model ID not found | Re-check `GET /v1/models` | | `429` | Rate limit exceeded | Add backoff and retry | | `529` | Overloaded | Retry with exponential backoff | --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/messages \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -H "Anthropic-Version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-6", "max_tokens": 1024, "messages": [{"role": "user", "content": "Explain quantum computing in one sentence."}] }' ``` ### python ```python import requests api_key = "" resp = requests.post( "https://api.unigateway.ai/v1/messages", headers={ "Authorization": f"Bearer {api_key}", "Content-Type": "application/json", "Anthropic-Version": "2023-06-01", }, json={ "model": "claude-sonnet-4-6", 
"max_tokens": 1024, "messages": [{"role": "user", "content": "Explain quantum computing in one sentence."}], }, ) print(resp.json()["content"][0]["text"]) ``` ### typescript ```typescript const resp = await fetch("https://api.unigateway.ai/v1/messages", { method: "POST", headers: { Authorization: `Bearer ${process.env.UNIGATEWAY_API_KEY}`, "Content-Type": "application/json", "Anthropic-Version": "2023-06-01", }, body: JSON.stringify({ model: "claude-sonnet-4-6", max_tokens: 1024, messages: [{ role: "user", content: "Explain quantum computing in one sentence." }], }), }); const data = await resp.json(); console.log(data.content[0].text); ``` # Gemini generateContent API > Category: Text Generation | Last updated: 2026-05-15 Call Gemini native generateContent through UniGateway for text generation, chat, structured output, and tool calling. # Gemini Text Chat (generateContent) Call Gemini's native `generateContent` interface through UniGateway for text generation, multi-turn chat, structured output, and tool calling. ## Prerequisites - A UniGateway API key stored in `UNIGATEWAY_API_KEY` - Confirm the target Gemini model is available via `GET /v1/models` ## Endpoint - Method: `POST` - Path: `/v1beta/models/{model}:generateContent` - Streaming path: `/v1beta/models/{model}:streamGenerateContent?alt=sse` - Base URL: `https://api.unigateway.ai` - Headers: - `Authorization: Bearer ` - `Content-Type: application/json` UniGateway uses Bearer Token authentication. Do not send `x-goog-api-key` or a `key=` query parameter. > This is the native Gemini format. Field names use Gemini-style `contents`, `parts`, `generationConfig`, and `systemInstruction`. If you use the OpenAI SDK, prefer `/v1/chat/completions`. For SDK integrations, first verify the endpoint with cURL or plain HTTP. Some Gemini SDK versions handle custom base URLs differently. 
## Basic request ```bash curl "https://api.unigateway.ai/v1beta/models/gemini-3-pro-preview:generateContent" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "contents": [ { "role": "user", "parts": [ {"text": "Explain the value of an AI Gateway in one paragraph."} ] } ] }' ``` ## Request body ### Minimal structure ```json { "contents": [ { "role": "user", "parts": [{"text": "Hello"}] } ] } ``` ### Parameters | Field | Type | Required | Description | |---|---|---|---| | `contents` | array | Yes | Conversation content. Send one user content for single-turn calls, or full history for multi-turn calls. | | `contents[].role` | string | No | `user` or `model`. | | `contents[].parts` | array | Yes | Message parts. Text uses `{ "text": "..." }`. | | `systemInstruction` | object | No | System instruction, usually `parts: [{"text":"..."}]`. | | `generationConfig` | object | No | Generation parameters such as temperature, topP, maxOutputTokens, responseMimeType. | | `tools` | array | No | Tool definitions such as function calling, Google Search, and code execution. | | `toolConfig` | object | No | Tool calling policy. | | `safetySettings` | array | No | Safety settings. | ## Multi-turn chat Native Gemini `generateContent` is stateless. 
For multi-turn chat, send the conversation history from the client: ```json { "contents": [ { "role": "user", "parts": [{"text": "What is the capital of France?"}] }, { "role": "model", "parts": [{"text": "Paris."}] }, { "role": "user", "parts": [{"text": "What is its approximate population?"}] } ] } ``` ## System instruction ```bash curl "https://api.unigateway.ai/v1beta/models/gemini-3-pro-preview:generateContent" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "systemInstruction": { "parts": [{"text": "You are a concise technical assistant for production integrations."}] }, "contents": [ {"parts": [{"text": "Give a Redis cache penetration mitigation plan."}]} ] }' ``` ## Generation config ```json { "generationConfig": { "temperature": 0.7, "topP": 0.95, "topK": 40, "maxOutputTokens": 2048, "stopSequences": ["\n\n"] }, "contents": [ {"parts": [{"text": "Write an API gateway integration checklist."}]} ] } ``` | Field | Description | |---|---| | `temperature` | Sampling randomness. For Gemini 3 models, prefer the model default first. | | `topP` | Nucleus sampling. | | `topK` | Top-K sampling. | | `maxOutputTokens` | Maximum output tokens. | | `stopSequences` | Stop sequence array. | | `thinkingConfig` | Reasoning configuration, such as `thinkingLevel`. Support depends on the model. 
| ## Structured output For JSON output, set `responseMimeType` and `responseJsonSchema` in `generationConfig`: ```bash curl "https://api.unigateway.ai/v1beta/models/gemini-3-pro-preview:generateContent" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "contents": [ {"parts": [{"text": "Extract city and temperature from: Shanghai is 18 degrees Celsius today."}]} ], "generationConfig": { "responseMimeType": "application/json", "responseJsonSchema": { "type": "object", "properties": { "city": {"type": "string"}, "temperature_celsius": {"type": "number"} }, "required": ["city", "temperature_celsius"] } } }' ``` The text is usually returned at `candidates[0].content.parts[0].text` as a JSON string matching the schema. ## Function calling ```json { "contents": [ { "parts": [{"text": "Check today's weather in Tokyo."}] } ], "tools": [ { "functionDeclarations": [ { "name": "get_weather", "description": "Get current weather for a city", "parameters": { "type": "object", "properties": { "location": {"type": "string", "description": "City name"} }, "required": ["location"] } } ] } ] } ``` The model may return a `functionCall` part. Your application should execute the function and send a `functionResponse` with the conversation history in the next request. ## Built-in tools Native Gemini format supports some built-in tools. Availability depends on model and account permissions. ```json { "tools": [ {"googleSearch": {}}, {"codeExecution": {}} ], "contents": [ {"parts": [{"text": "Search and summarize recent AI Gateway trends."}]} ] } ``` Common tools: | Tool | Description | |---|---| | `googleSearch` | Search grounding. | | `codeExecution` | Code execution. | | `urlContext` | URL context retrieval. | | `functionDeclarations` | Client-side function calling. 
| ## Streaming Use the `:streamGenerateContent?alt=sse` path for SSE streaming: ```bash curl "https://api.unigateway.ai/v1beta/models/gemini-3-pro-preview:streamGenerateContent?alt=sse" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "contents": [ {"parts": [{"text": "Explain SSE in three sentences."}]} ] }' ``` Parse `candidates[].content.parts[].text` incrementally from SSE events. ## Response format ```json { "candidates": [ { "content": { "role": "model", "parts": [ {"text": "An AI Gateway unifies model access, routing, and billing..."} ] }, "finishReason": "STOP", "index": 0 } ], "usageMetadata": { "promptTokenCount": 12, "candidatesTokenCount": 64, "totalTokenCount": 76 } } ``` ### Response fields | Field | Description | |---|---| | `candidates[]` | Candidate outputs. | | `candidates[].content.parts[]` | Output content parts. Text output is in the `text` field. | | `candidates[].finishReason` | Finish reason, such as `STOP`. | | `usageMetadata` | Token usage metadata. | ## Errors | Status | Cause | Resolution | |---|---|---| | `400` | Request body does not match native Gemini format | Check `contents[].parts[]`, `generationConfig`, and related fields. | | `401` | Invalid or missing API key | Verify `Authorization: Bearer ...`. | | `404` | Model does not exist or is unavailable for the account | Re-check `GET /v1/models`. | | `429` | Rate limit exceeded | Add backoff and retry. | | `5xx` | Server error | Retry with exponential backoff; switch model if needed. 
| --- ## Code examples ### curl ```curl curl "https://api.unigateway.ai/v1beta/models/gemini-3-pro-preview:generateContent" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "contents": [ {"parts": [{"text": "Summarize the benefits of an AI gateway."}]} ] }' ``` ### python ```python import requests api_key = "" model = "gemini-3-pro-preview" resp = requests.post( f"https://api.unigateway.ai/v1beta/models/{model}:generateContent", headers={ "Authorization": f"Bearer {api_key}", "Content-Type": "application/json", }, json={ "contents": [ {"parts": [{"text": "Summarize the benefits of an AI gateway."}]} ] }, ) print(resp.json()["candidates"][0]["content"]["parts"][0]["text"]) ``` ### typescript ```typescript const model = "gemini-3-pro-preview"; const resp = await fetch(`https://api.unigateway.ai/v1beta/models/${model}:generateContent`, { method: "POST", headers: { Authorization: `Bearer ${process.env.UNIGATEWAY_API_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ contents: [ { parts: [{ text: "Summarize the benefits of an AI gateway." }] }, ], }), }); const data = await resp.json(); console.log(data.candidates[0].content.parts[0].text); ``` ## Images # Images Overview > Category: Images | Last updated: 2026-05-15 Image generation overview comparing OpenAI Images API and Gemini Images API protocols. # Images Generate images with UniGateway through OpenAI-compatible or Gemini native protocols. 
| Protocol | Model | Endpoint | |---|---|---| | OpenAI Images API | `gpt-image-2` | `POST /v1/images/generations`, `/v1/images/edits` | | Gemini Images API | `gemini-3-pro-image-preview`, `gemini-3.1-flash-image-preview` | `POST /v1beta/models/{model}:generateContent` | - **[OpenAI Images API](./openai-images-api.md)** — text-to-image, editing, inpainting, streaming via `gpt-image-2` - **[Gemini Images API](./gemini-images-api.md)** — text-to-image and image-to-image with Nano Banana Pro and Nano Banana 2 Confirm model availability with `GET /v1/models` before production use. # Image Generation and Editing > Category: Images | Last updated: 2026-05-15 Generate and edit images with gpt-image-2 and Gemini image models through UniGateway. # Image Generation and Editing Use UniGateway to generate and edit images: `gpt-image-2` via OpenAI-compatible Images API, Nano Banana models via Gemini `generateContent`. ## Quick Reference | Task | Model ID | Endpoint | Content-Type | |---|---|---|---| | Text-to-image | `gpt-image-2` | `POST /v1/images/generations` | `application/json` | | Multi-image (n) | `gpt-image-2` | `POST /v1/images/generations` | `application/json` | | Streaming | `gpt-image-2` | `POST /v1/images/generations` + `stream:true` | `application/json` | | Edit / composite / inpaint | `gpt-image-2` | `POST /v1/images/edits` | `multipart/form-data` | | Gemini text-to-image | `gemini-3-pro-image-preview` or `gemini-3.1-flash-image-preview` | `POST /v1beta/models/{model}:generateContent` | `application/json` | | Gemini image-to-image | same as above, add `inline_data` in `parts[]` | same | `application/json` | **Base URL**: `https://api.unigateway.ai/v1` for Images API; `https://api.unigateway.ai` for Gemini. > Display names like Nano Banana Pro / Nano Banana 2 are product nicknames. The [model library](https://unigateway.ai/models) shows requestable model IDs. Always use the exact `id` from `GET /v1/models`. 
Authentication: `Authorization: Bearer $UNIGATEWAY_API_KEY` ## Prerequisites - A UniGateway API key stored in `UNIGATEWAY_API_KEY` - Confirm the target model is available via `GET /v1/models` ## gpt-image-2 ### Text-to-Image ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/generations" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-image-2", "prompt": "A clean product hero image for an AI gateway dashboard, dark background, soft blue lighting." }' > response.json ``` **Python** ```python from openai import OpenAI; import base64 client = OpenAI(api_key="", base_url="https://api.unigateway.ai/v1") result = client.images.generate(model="gpt-image-2", prompt="A clean product hero image.") with open("out.png", "wb") as f: f.write(base64.b64decode(result.data[0].b64_json)) ``` **TypeScript** ```typescript import OpenAI from "openai"; import fs from "fs"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1" }); const r = await client.images.generate({ model: "gpt-image-2", prompt: "A clean product hero image." }); fs.writeFileSync("out.png", Buffer.from(r.data[0].b64_json, "base64")); ``` ### Multiple Images ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/generations" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-image-2", "prompt": "Four icons: chat, image, video, search.", "n": 4, "size": "1024x1024", "quality": "medium", "output_format": "png" }' > batch.json ``` ### Parameters | Parameter | Required | Values | Notes | |---|---|---|---| | `model` | Yes | `gpt-image-2` | | | `prompt` | Yes | text | | | `size` | No | `1024x1024`, `1536x1024`, `2048x2048`, `auto`, etc. 
| W×H, multiples of 16, max 3840, ratio ≤ 3:1 | | `quality` | No | `low` / `medium` / `high` / `auto` | default `auto` | | `n` | No | 1–10 | | | `output_format` | No | `png` / `jpeg` / `webp` | default `png` | | `output_compression` | No | 0–100 | jpeg/webp only | | `background` | No | `opaque` / `auto` | | | `moderation` | No | `auto` / `low` | | | `stream` | No | `true` | SSE streaming | | `partial_images` | No | 0–3 | streaming intermediates | | `user` | No | string | end-user identifier | ### Streaming ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/generations" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-image-2", "prompt": "A winter landscape.", "stream": true, "partial_images": 2 }' ``` ### Save Response ```bash jq -r '.data[0].b64_json' response.json | base64 -D > output.png # macOS jq -r '.data[0].b64_json' response.json | base64 --decode > output.png # Linux ``` ### Edit / Composite / Inpaint All use `POST /v1/images/edits` with `multipart/form-data`. **Single image edit** — pass the original in `image[]`: ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/edits" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -F "model=gpt-image-2" -F "image[]=@room.png" \ -F "prompt=Change the sofa to cream white, keep everything else." \ -F "quality=high" -F "size=1024x1024" -F "output_format=png" > edit.json ``` **Multi-reference composition** — pass multiple `image[]`: ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/edits" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -F "model=gpt-image-2" \ -F "image[]=@item1.png" -F "image[]=@item2.png" -F "image[]=@item3.png" \ -F "prompt=Combine all items into a single product photo on white background." 
\ -F "quality=high" -F "output_format=png" > composite.json ``` **Masked inpainting** — pass `mask` alongside `image[]`: ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/edits" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -F "model=gpt-image-2" -F "mask=@mask.png" -F "image[]=@src.png" \ -F "prompt=Fill the masked area with a pink flamingo pool float." > inpaint.json ``` Mask must match source dimensions and format, ≤ 50 MB, include alpha channel. ### Response ```json { "created": 1710000000, "data": [{ "b64_json": "..." }] } ``` --- ## Nano Banana (Gemini Image Models) Both text-to-image and image-to-image use `POST /v1beta/models/{model}:generateContent`. Images are passed as `inline_data` inside `parts[]`. | Model | API Model ID | Best for | |---|---|---| | Nano Banana Pro | `gemini-3-pro-image-preview` | Highest quality, complex instructions, text rendering, 4K | | Nano Banana 2 | `gemini-3.1-flash-image-preview` | Speed, high volume, general-purpose | ### Text-to-Image ```bash curl -sS -X POST "https://api.unigateway.ai/v1beta/models/gemini-3-pro-image-preview:generateContent" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "contents": [{ "parts": [{ "text": "Your prompt here." }] }], "generationConfig": { "responseModalities": ["TEXT", "IMAGE"], "imageConfig": { "aspectRatio": "16:9", "imageSize": "4K" } } }' ``` Replace the model with `gemini-3.1-flash-image-preview` for Nano Banana 2. 
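The text-to-image request above can also be assembled programmatically. The sketch below is a minimal Python illustration that only builds the URL and JSON body and sends nothing. `NANO_BANANA_MODELS` and `image_request` are hypothetical names, and the nickname-to-ID mapping is copied from the table above rather than read from `GET /v1/models`.

```python
from typing import Any, Dict

# Nicknames and model IDs from the table above; confirm IDs via GET /v1/models.
NANO_BANANA_MODELS = {
    "Nano Banana Pro": "gemini-3-pro-image-preview",
    "Nano Banana 2": "gemini-3.1-flash-image-preview",
}


def image_request(
    nickname: str,
    prompt: str,
    aspect_ratio: str = "16:9",
    image_size: str = "4K",
) -> Dict[str, Any]:
    """Return the URL and JSON body for a Gemini text-to-image call."""
    model = NANO_BANANA_MODELS[nickname]  # KeyError on an unknown nickname
    return {
        "url": f"https://api.unigateway.ai/v1beta/models/{model}:generateContent",
        "body": {
            "contents": [{"parts": [{"text": prompt}]}],
            "generationConfig": {
                "responseModalities": ["TEXT", "IMAGE"],
                "imageConfig": {"aspectRatio": aspect_ratio, "imageSize": image_size},
            },
        },
    }
```

The returned `body` can be passed as the JSON payload of any HTTP client, together with the `Authorization: Bearer $UNIGATEWAY_API_KEY` header.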
### Image-to-Image (Edit / Transform)

```bash
# Encode without line wraps so the value is valid inside a JSON string.
# Works with both macOS and GNU base64.
B64=$(base64 < input.png | tr -d '\n')
curl -sS -X POST "https://api.unigateway.ai/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"contents\": [{ \"parts\": [
      {\"text\": \"Your editing instruction.\"},
      {\"inline_data\": {\"mime_type\": \"image/png\", \"data\": \"${B64}\"}}
    ]}],
    \"generationConfig\": {
      \"responseModalities\": [\"TEXT\", \"IMAGE\"],
      \"imageConfig\": { \"aspectRatio\": \"16:9\", \"imageSize\": \"4K\" }
    }
  }"
```

For multiple input images, add more `inline_data` blocks before the text instruction.

### Image Size Control

| Field | Values | Notes |
|---|---|---|
| `imageConfig.aspectRatio` | `1:1`, `4:3`, `3:4`, `16:9`, `9:16` | Nano Banana Pro also supports `2:3`, `3:2`, `4:5`, `5:4`, `21:9` |
| `imageConfig.imageSize` | `1K`, `2K`, `4K` | Both Pro and Flash support up to 4K |

### Save Response

```bash
jq -r 'first(..|objects|select(.inlineData?.data)|.inlineData.data)' result.json | base64 -D > out.png        # macOS
jq -r 'first(..|objects|select(.inlineData?.data)|.inlineData.data)' result.json | base64 --decode > out.png  # Linux
```

### Response

```json
{
  "candidates": [{
    "content": {
      "parts": [
        { "text": "Here is the generated image." },
        { "inlineData": { "mimeType": "image/png", "data": "..." } }
      ]
    }
  }]
}
```

---

## Tips

- Use `gpt-image-2` for the Images API, SDK integration, and `multipart/form-data` editing workflows.
- Use Nano Banana Pro for highest quality Gemini output; Nano Banana 2 for speed and volume.
- Gemini models handle both text-to-image and image-to-image through the same endpoint.
- Upload decoded images to your own object storage for production.
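The Save Response step above can also be done without `jq`. The following is a minimal sketch that walks the response shape shown in the Response example and decodes the first `inlineData` part. The function name `extract_first_image` is hypothetical, and the sample response carries placeholder bytes rather than a real image.

```python
import base64


def extract_first_image(response):
    """Return (mime_type, raw_bytes) for the first inlineData part, or None."""
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline and inline.get("data"):
                return inline.get("mimeType"), base64.b64decode(inline["data"])
    return None


# Stand-in response: same structure as the documented example,
# with a few placeholder bytes in place of real image data.
sample = {"candidates": [{"content": {"parts": [
    {"text": "Here is the generated image."},
    {"inlineData": {"mimeType": "image/png",
                    "data": base64.b64encode(b"fake-bytes").decode("ascii")}},
]}}]}

mime_type, image_bytes = extract_first_image(sample)
print(mime_type, len(image_bytes))
```

From here, `image_bytes` can be written to disk or uploaded to your own object storage.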
## Errors | Status | Cause | Resolution | |---|---|---| | `400` | Invalid parameters or format | Check `prompt`, `size`, `mask` | | `401` | Invalid API key | Verify `Authorization` header | | `404` | Model not available | Confirm via `GET /v1/models` | | `429` | Rate limit | Back off and retry | | `5xx` | Server error | Exponential backoff | # OpenAI Images API > Category: Images | Last updated: 2026-05-15 Generate and edit images with gpt-image-2 through the OpenAI-compatible Images API. # OpenAI Images API Generate and edit images with `gpt-image-2` through the OpenAI-compatible Images API. Base URL: `https://api.unigateway.ai/v1` ## Text-to-Image ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/generations" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "gpt-image-2", "prompt": "A clean product hero image for an AI gateway dashboard." }' > response.json ``` **Python** ```python from openai import OpenAI; import base64 client = OpenAI(api_key="", base_url="https://api.unigateway.ai/v1") r = client.images.generate(model="gpt-image-2", prompt="A hero image.") with open("out.png","wb") as f: f.write(base64.b64decode(r.data[0].b64_json)) ``` **TypeScript** ```typescript import OpenAI from "openai"; import fs from "fs"; const c = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1" }); const r = await c.images.generate({ model: "gpt-image-2", prompt: "A hero image." 
}); fs.writeFileSync("out.png", Buffer.from(r.data[0].b64_json, "base64")); ``` ## Multiple Images ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/generations" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{"model":"gpt-image-2","prompt":"Four icons: chat, image, video, search.","n":4,"size":"1024x1024","quality":"medium","output_format":"png"}' > batch.json ``` ## Parameters | Parameter | Required | Values | Notes | |---|---|---|---| | `model` | Yes | `gpt-image-2` | | | `prompt` | Yes | text | | | `size` | No | `1024x1024`, `1536x1024`, `2048x2048`, `auto`, etc. | W×H ×16, max 3840, ratio ≤ 3:1 | | `quality` | No | `low` / `medium` / `high` / `auto` | default `auto` | | `n` | No | 1–10 | | | `output_format` | No | `png` / `jpeg` / `webp` | default `png` | | `output_compression` | No | 0–100 | jpeg/webp only | | `background` | No | `opaque` / `auto` | | | `moderation` | No | `auto` / `low` | | | `stream` | No | `true` | SSE streaming | | `partial_images` | No | 0–3 | streaming intermediates | | `user` | No | string | end-user identifier | ## Streaming ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/generations" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{"model":"gpt-image-2","prompt":"A winter landscape.","stream":true,"partial_images":2}' ``` ## Save Response ```bash jq -r '.data[0].b64_json' response.json | base64 -D > output.png # macOS jq -r '.data[0].b64_json' response.json | base64 --decode > output.png # Linux ``` ## Edit / Composite / Inpaint All use `POST /v1/images/edits` with `multipart/form-data`. **Single image edit:** ```bash curl -sS -X POST "https://api.unigateway.ai/v1/images/edits" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -F "model=gpt-image-2" -F "image[]=@room.png" \ -F "prompt=Change the sofa to cream white, keep everything else." 
\ -F "quality=high" -F "size=1024x1024" -F "output_format=png" > edit.json
```

**Multi-reference composition:**

```bash
curl -sS -X POST "https://api.unigateway.ai/v1/images/edits" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -F "model=gpt-image-2" \
  -F "image[]=@item1.png" -F "image[]=@item2.png" -F "image[]=@item3.png" \
  -F "prompt=Combine into a single product photo." \
  -F "quality=high" -F "output_format=png" > composite.json
```

**Masked inpainting:**

```bash
curl -sS -X POST "https://api.unigateway.ai/v1/images/edits" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -F "model=gpt-image-2" -F "mask=@mask.png" -F "image[]=@src.png" \
  -F "prompt=Fill the masked area with a pink flamingo." > inpaint.json
```

Mask requirements: same dimensions as the source image, ≤ 50 MB, and an alpha channel.

## Response

```json
{ "created": 1710000000, "data": [{ "b64_json": "..." }] }
```

## Errors

| Status | Cause | Resolution |
|---|---|---|
| `400` | Invalid parameters | Check `prompt`, `size`, `mask` |
| `401` | Invalid API key | Verify `Authorization` |
| `404` | Model not available | Confirm via `GET /v1/models` |
| `429` | Rate limit | Back off and retry |
| `5xx` | Server error | Exponential backoff |

---

## Code examples

### curl

```curl
curl -sS -X POST "https://api.unigateway.ai/v1/images/generations" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-image-2","prompt":"A clean product hero image.","size":"1024x1024","quality":"high","output_format":"png"}' > image.json
```

### python

```python
import base64
from openai import OpenAI

client = OpenAI(api_key="", base_url="https://api.unigateway.ai/v1")
r = client.images.generate(model="gpt-image-2", prompt="A hero image.")
# The image is returned as base64; decode it before writing to disk.
with open("out.png", "wb") as f:
    f.write(base64.b64decode(r.data[0].b64_json))
```

### typescript

```typescript
import OpenAI from "openai";
import fs from "fs";

const c = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1" });
const r = await c.images.generate({ model: "gpt-image-2", prompt: "A hero image." });
fs.writeFileSync("out.png", Buffer.from(r.data[0].b64_json, "base64"));
```

# Gemini Images API

> Category: Images | Last updated: 2026-05-15

Generate and edit images with Nano Banana models through Gemini generateContent.

# Gemini Images API

Generate and edit images with Nano Banana models through Gemini `generateContent`.

Base URL: `https://api.unigateway.ai`

| Display name | API model ID | Best for |
|---|---|---|
| Nano Banana Pro | `gemini-3-pro-image-preview` | Highest quality, complex instructions, text rendering, 4K |
| Nano Banana 2 | `gemini-3.1-flash-image-preview` | Speed, high volume, general-purpose |

## Text-to-Image

```bash
curl -sS -X POST "https://api.unigateway.ai/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{ "parts": [{ "text": "Your prompt here." }] }],
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": { "aspectRatio": "16:9", "imageSize": "4K" }
    }
  }'
```

Replace the model with `gemini-3.1-flash-image-preview` for Nano Banana 2.

## Image-to-Image

Pass images as `inline_data` in `parts[]` alongside a text instruction:

```bash
# Encode without line wraps so the value is valid inside a JSON string.
# Works with both macOS and GNU base64.
B64=$(base64 < input.png | tr -d '\n')
curl -sS -X POST "https://api.unigateway.ai/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"contents\": [{ \"parts\": [
      {\"text\": \"Your editing instruction.\"},
      {\"inline_data\": {\"mime_type\": \"image/png\", \"data\": \"${B64}\"}}
    ]}],
    \"generationConfig\": {
      \"responseModalities\": [\"TEXT\", \"IMAGE\"],
      \"imageConfig\": { \"aspectRatio\": \"16:9\", \"imageSize\": \"4K\" }
    }
  }"
```

For multiple input images, add more `inline_data` blocks before the text instruction.
## Image Size Control | Field | Values | Notes | |---|---|---| | `imageConfig.aspectRatio` | `1:1`, `4:3`, `3:4`, `16:9`, `9:16` | Pro also: `2:3`, `3:2`, `4:5`, `5:4`, `21:9` | | `imageConfig.imageSize` | `1K`, `2K`, `4K` | | ## Save Response ```bash jq -r 'first(..|objects|select(.inlineData?.data)|.inlineData.data)' result.json | base64 -D > out.png # macOS jq -r 'first(..|objects|select(.inlineData?.data)|.inlineData.data)' result.json | base64 --decode > out.png # Linux ``` ## Response ```json { "candidates": [{ "content": { "parts": [ { "text": "Here is the generated image." }, { "inlineData": { "mimeType": "image/png", "data": "..." } } ] } }] } ``` ## Errors | Status | Cause | Resolution | |---|---|---| | `400` | Invalid parameters | Check `imageConfig` | | `401` | Invalid API key | Verify `Authorization` | | `404` | Model not available | Confirm via `GET /v1/models` | | `429` | Rate limit | Back off and retry | | `5xx` | Server error | Exponential backoff | --- ## Code examples ### curl ```curl curl -sS -X POST "https://api.unigateway.ai/v1beta/models/gemini-3-pro-image-preview:generateContent" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{"contents":[{"parts":[{"text":"Your prompt."}]}],"generationConfig":{"responseModalities":["TEXT","IMAGE"],"imageConfig":{"aspectRatio":"16:9","imageSize":"4K"}}}' > image.json ``` ### python ```python import requests key = "" resp = requests.post("https://api.unigateway.ai/v1beta/models/gemini-3-pro-image-preview:generateContent", headers={"Authorization":f"Bearer {key}","Content-Type":"application/json"}, json={"contents":[{"parts":[{"text":"Your prompt."}]}],"generationConfig":{"responseModalities":["TEXT","IMAGE"],"imageConfig":{"aspectRatio":"16:9","imageSize":"4K"}}}) print(resp.json()) ``` ### typescript ```typescript const resp = await fetch("https://api.unigateway.ai/v1beta/models/gemini-3-pro-image-preview:generateContent", { method:"POST", headers:{ 
Authorization:`Bearer ${process.env.UNIGATEWAY_API_KEY}`, "Content-Type":"application/json" }, body:JSON.stringify({ contents:[{ parts:[{ text:"Your prompt." }] }], generationConfig:{ responseModalities:["TEXT","IMAGE"], imageConfig:{ aspectRatio:"16:9", imageSize:"4K" } } }) }); console.log(await resp.json()); ``` ## Video # Video Generation > Category: Video | Last updated: 2026-05-15 All video generation providers: Seedance and Sora, with protocol differences. # Video Generation UniGateway separates video models by provider and protocol. Each video family uses its own Base URL and request format. ## Providers and Protocols | Provider | Model family | Base URL | Protocol | Content-Type | Auth | |---|---|---|---|---|---| | ByteDance | Seedance | `https://video.unigateway.ai` | `/api/v3/contents/generations/tasks` | `application/json` | `Authorization: Bearer $UNIGATEWAY_API_KEY` | | OpenAI | Sora | `https://api.unigateway.ai/v1` | `/v1/videos` | `multipart/form-data` | `Authorization: Bearer $UNIGATEWAY_API_KEY` | ## Key Differences | Aspect | Seedance | Sora | |---|---|---| | Base URL | `https://video.unigateway.ai` | `https://api.unigateway.ai/v1` | | Content-Type | `application/json` | `multipart/form-data` | | Create task | `POST /api/v3/contents/generations/tasks` | `POST /v1/videos` | | Query status | `GET /api/v3/contents/generations/tasks/{id}` | `GET /v1/videos/{id}` | | List tasks | `GET /api/v3/contents/generations/tasks` | Not available | | Delete task | `DELETE /api/v3/contents/generations/tasks/{id}` | Not available | | Asset libraries | `/api/v3/asset-groups` and `/api/v3/assets` | Not available | | Model discovery | Not through `GET /v1/models` | `GET /v1/models` with `supported_endpoint_types: ["openai-video"]` | | Recommended model | `doubao-seedance-2.0-fast` | `sora-2` | Do not send Sora model IDs to Seedance endpoints or vice versa. The request formats and content types are incompatible. ## Common Workflow 1. Choose a provider 2. 
Submit a video generation request 3. Receive a task/job ID 4. Poll the status endpoint until the video is complete 5. Save the video URL - [ByteDance (Seedance)](./seedance-overview.md) - [OpenAI (Sora)](./sora-overview.md) # Sora Video Generation > Category: Video | Last updated: 2026-05-15 Generate videos from text prompts with Sora through the OpenAI-compatible video protocol. # Sora Video Generation Use Sora through UniGateway to generate videos from text prompts. Requests use `multipart/form-data`. ## Prerequisites - A UniGateway API key stored in `UNIGATEWAY_API_KEY` - Confirm Sora models are available via `GET /v1/models` ## Endpoints | Action | Method | Path | Content-Type | |---|---|---|---| | Create video | `POST` | `/v1/videos` | `multipart/form-data` | | Query status | `GET` | `/v1/videos/{id}` | — | | Download video | `GET` | `/v1/videos/{id}/content` | — | Base URL: `https://api.unigateway.ai/v1` ## Supported Models | Model ID | Description | |---|---| | `sora-2` | General-purpose video generation | | `sora-2-pro` | Higher-quality video generation | ## Create a Video ```bash curl -sS -X POST "https://api.unigateway.ai/v1/videos" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -F 'model=sora-2' \ -F 'prompt=A cinematic aerial shot of a coastline at golden hour.' 
\ -F 'size=1280x720' \ -F 'seconds=5' ``` Response: ```json { "id": "task_4kX5fFNCbHgMJLx", "object": "video", "status": "queued", "model": "sora-2", "progress": 0, "size": "1280x720", "seconds": "5" } ``` ## Query Video Status ```bash curl -sS "https://api.unigateway.ai/v1/videos/{id}" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` Response (in progress): ```json { "id": "task_4kX5fFNCbHgMJLx", "object": "video", "status": "in_progress", "model": "sora-2", "progress": 0, "size": "1280x720", "seconds": "5" } ``` Response (completed): ```json { "id": "task_4kX5fFNCbHgMJLx", "object": "video", "status": "completed", "model": "sora-2", "progress": 100, "size": "1280x720", "seconds": "5", "video_url": "https:///videos/output.mp4" } ``` ## Download Video ```bash curl -sS "https://api.unigateway.ai/v1/videos/{id}/content" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -o output.mp4 ``` ## Parameters (Create) | Parameter | Required | Description | |---|---|---| | `model` | Yes | Model ID: `sora-2` or `sora-2-pro` | | `prompt` | Yes | Text description of the desired video | | `size` | No | Output resolution, e.g. `1280x720` | | `seconds` | No | Target duration in seconds, e.g. 
`4`, `5` |

## Polling Strategy

- First query: 5-10 seconds after creation
- Increase interval as wait time grows
- Set a total timeout on the caller side

## Sora vs Other Video Models

| Video family | Base URL | Protocol | Content-Type | Model example |
|---|---|---|---|---|
| Sora | `https://api.unigateway.ai/v1` | OpenAI-compatible `/v1/videos` | `multipart/form-data` | `sora-2` |
| Seedance | `https://video.unigateway.ai` | `/api/v3/contents/generations/tasks` | `application/json` | `doubao-seedance-2.0-fast` |

## Errors

| Status | Cause | Resolution |
|---|---|---|
| `400` | Invalid parameters or unsupported size | Check `size` and `seconds` |
| `401` | Invalid or missing API key | Verify `Authorization` header |
| `404` | Model not found | Confirm via `GET /v1/models` |
| `429` | Rate limit exceeded | Add backoff and retry |
| `500` | Server or upstream error | Retry with exponential backoff |

---

## Code examples

### curl

```curl
curl -sS -X POST "https://api.unigateway.ai/v1/videos" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -F 'model=sora-2' \
  -F 'prompt=A cinematic aerial shot of a coastline at golden hour.' \
  -F 'size=1280x720' \
  -F 'seconds=5'
```

### python

```python
import requests

key = ""
fields = {
    "model": "sora-2",
    "prompt": "A cinematic aerial shot of a coastline.",
    "size": "1280x720",
    "seconds": "5",
}
# Sora requests use multipart/form-data. Passing (None, value) tuples via
# `files` makes requests encode plain form fields as multipart and set the
# Content-Type header (including the boundary) itself.
resp = requests.post(
    "https://api.unigateway.ai/v1/videos",
    headers={"Authorization": f"Bearer {key}"},
    files={name: (None, value) for name, value in fields.items()},
)
print(resp.json())
```

### typescript

```typescript
// Sora requests use multipart/form-data. fetch sets the Content-Type header
// (including the boundary) automatically for FormData bodies.
const form = new FormData();
form.append("model", "sora-2");
form.append("prompt", "A cinematic aerial shot of a coastline.");
form.append("size", "1280x720");
form.append("seconds", "5");

const resp = await fetch("https://api.unigateway.ai/v1/videos", {
  method: "POST",
  headers: { Authorization: `Bearer ${process.env.UNIGATEWAY_API_KEY}` },
  body: form,
});
console.log(await resp.json());
```

# Seedance

> Category: Video | Last updated: 2026-05-15
Overview of the Seedance workflow, interface surfaces, and child endpoint docs.

# ByteDance Seedance / Overview

Seedance is ByteDance's video generation model, called through an async task interface on the video domain.

## Prerequisites

- A UniGateway API key stored in `UNIGATEWAY_API_KEY`

```bash
UNIGATEWAY_API_KEY=
```

## Authentication

All Seedance endpoints use Bearer token authentication:

```http
Authorization: Bearer $UNIGATEWAY_API_KEY
```

## Endpoints

### Video Generation

| Method | Path | Purpose |
|---|---|---|
| `POST` | `/api/v3/contents/generations/tasks` | Create video task |
| `GET` | `/api/v3/contents/generations/tasks` | List tasks |
| `GET` | `/api/v3/contents/generations/tasks/{id}` | Query single task |
| `DELETE` | `/api/v3/contents/generations/tasks/{id}` | Cancel/delete task |

### Asset Libraries

| Method | Path | Purpose |
|---|---|---|
| `POST` | `/api/v3/asset-groups` | Create asset group |
| `GET` | `/api/v3/asset-groups` | List asset groups |
| `GET` | `/api/v3/asset-groups/{groupId}` | Get asset group |
| `PATCH` | `/api/v3/asset-groups/{groupId}` | Update asset group |
| `POST` | `/api/v3/assets` | Create asset |
| `GET` | `/api/v3/assets` | List assets |
| `GET` | `/api/v3/assets/{assetId}` | Get asset |
| `PATCH` | `/api/v3/assets/{assetId}` | Update asset |
| `DELETE` | `/api/v3/assets/{assetId}` | Delete asset |

## Find Available Models

Seedance models are listed on the video domain, not through `/v1/models`:

```bash
curl https://video.unigateway.ai/api/models \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

Example models (verify with a live query):

| Model ID | Description |
|---|---|
| `doubao-seedance-2.0-fast` | Fast Seedance video generation |
| `doubao-seedance-2.0` | Standard Seedance video generation |

## Workflow

### Create Task

```bash
curl -sS -X POST "https://video.unigateway.ai/api/v3/contents/generations/tasks" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "doubao-seedance-2.0-fast",
    "content": [
      { "type": "text", "text": "A blue sky with one white cloud." }
    ],
    "ratio": "16:9",
    "duration": 5,
    "generate_audio": false
  }'
```

Response:

```json
{ "id": "cgt-20260514135903-68khw" }
```

### Query Status

```bash
curl -sS "https://video.unigateway.ai/api/v3/contents/generations/tasks/{id}" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

Response (in progress):

```json
{
  "id": "cgt-20260514135903-68khw",
  "model": "doubao-seedance-2.0-fast",
  "status": "running",
  "content": {},
  "usage": null
}
```

Response (completed):

```json
{
  "id": "cgt-20260514135903-68khw",
  "model": "doubao-seedance-2.0-fast",
  "status": "succeeded",
  "content": { "video_url": "https:///media/output/video.mp4" },
  "usage": { "billing_mode": "credits", "credits": 50 }
}
```

### List Tasks

```bash
curl -sS "https://video.unigateway.ai/api/v3/contents/generations/tasks?page_num=1&page_size=20" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

Returns tasks from the last 7 days. Optional filters: `filter.status`, `filter.task_ids`, `filter.model`.

### Cancel a Task

```bash
curl -sS -X DELETE "https://video.unigateway.ai/api/v3/contents/generations/tasks/{id}" \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

Queued tasks are cancelled. Running tasks cannot be deleted. Per the [Query Task](./seedance-task-query.md) reference, deleting a completed task returns `{}`.
## Task States | Status | Meaning | Action | |---|---|---| | `queued` | Accepted, not yet processing | Continue polling | | `running` | Generation in progress | Continue polling | | `succeeded` | Result ready | Save `content.video_url` | | `failed` | Processing failed | Check `error`; decide whether to resubmit | | `expired` | Task or result expired | Create a new task | | `cancelled` | Task terminated | Treat as terminal | ## Parameters (Create) | Field | Type | Required | Description | |---|---|---|---| | `model` | string | Yes | Seedance model ID from video domain | | `content` | array | Yes | At least one input item; `text` type is supported | | `ratio` | string | No | Aspect ratio, e.g. `16:9` | | `duration` | number | No | Target duration in seconds | | `generate_audio` | boolean | No | Whether to generate audio | Not all model variants support every optional field. Test each individually. ## Polling Strategy - First query: 2-3 seconds after creation - Increase interval as wait time grows (3s -> 5s -> 10s) - Set a total timeout on the caller side - Avoid high-frequency concurrent queries for the same task ID ## Common Failures | Status | Cause | Action | |---|---|---| | `400` | Invalid field or unsupported option | Remove optional fields; retry with minimal payload | | `401` / `403` | Invalid key or permissions | Check credentials | | `404` | Model not found | Verify model ID from video domain model list | | `429` | Rate limit | Back off and retry | | `5xx` | Gateway or upstream error | Retry with capped backoff | For detailed endpoint reference: [Create Task](./seedance-create-task.md), [Query Task](./seedance-task-query.md), [Asset Libraries](./seedance-asset-libraries.md). # Create Task > Category: Video | Last updated: 2026-05-15 Create asynchronous video generation tasks with the Seedance interface. # ByteDance Seedance / Create Task Create an asynchronous video generation task. 
## Endpoint | Item | Value | |---|---| | Method | `POST` | | Path | `/api/v3/contents/generations/tasks` | | Base URL | `https://video.unigateway.ai` | | Auth | `Authorization: Bearer $UNIGATEWAY_API_KEY` | ## Request ```bash curl https://video.unigateway.ai/api/v3/contents/generations/tasks \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "doubao-seedance-2.0-fast", "content": [ { "type": "text", "text": "A cinematic tracking shot of a sports car driving through neon streets at night." } ], "ratio": "16:9", "duration": 5, "generate_audio": false }' ``` Response: ```json { "id": "cgt-20260514135903-68khw" } ``` A successful call returns a task `id`. Use this ID with [Query Task](./seedance-task-query.md) to get the result. ## Parameters ### Required | Field | Type | Description | |---|---|---| | `model` | string | Seedance model ID, e.g. `doubao-seedance-2.0-fast` | | `content` | array | At least one supported input item; `text` type is supported | ### Optional | Field | Type | Example | Description | |---|---|---|---| | `ratio` | string | `"16:9"` | Aspect ratio | | `duration` | number | `5` | Target output duration in seconds | | `generate_audio` | boolean | `false` | Whether to generate audio | | `service_tier` | string | `"standard"` | Service tier | Not all model variants support every optional field. Test each field individually. ## Validation | Field | Rule | |---|---| | `model` | Must be an exact Seedance-capable model ID from your account | | `content` | Must include at least one supported input item | | `ratio` / `duration` | Model-dependent; test each value individually | | `generate_audio` | Not all model variants support this | On failure, remove all optional fields and retry with the minimal payload. 
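The fallback described above (drop the optional fields, retry with the minimal payload) might look like this in Python. `minimal_payload` is a hypothetical helper. It treats `model` and `content` as the only required fields, as listed in the Required table, and does not send anything itself.

```python
REQUIRED_FIELDS = ("model", "content")


def minimal_payload(payload):
    """Return a copy of `payload` that keeps only the required fields."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    return {name: payload[name] for name in REQUIRED_FIELDS}


full = {
    "model": "doubao-seedance-2.0-fast",
    "content": [{"type": "text", "text": "A cinematic city night drive."}],
    "ratio": "16:9",
    "duration": 5,
    "generate_audio": False,
}

# After a 400 response, retry with this reduced body, then add optional
# fields back one at a time to find the unsupported one.
print(minimal_payload(full))
```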
## Common Failures | Status | Cause | Action | |---|---|---| | `400` | Invalid field shape or unsupported option | Remove optional fields; retry with minimal payload | | `401` / `403` | Invalid key or insufficient permissions | Check credentials and account access | | `429` | Rate limit or review limit | Back off and queue requests | | `5xx` | Gateway or upstream error | Retry with capped backoff | --- ## Code examples ### curl ```curl curl https://video.unigateway.ai/api/v3/contents/generations/tasks \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "doubao-seedance-2.0-fast", "content": [ {"type": "text", "text": "Cyberpunk downtown at night, fast tracking shot."} ], "ratio": "16:9", "duration": 5, "generate_audio": false }' ``` ### python ```python import requests api_key = "" base_url = "https://video.unigateway.ai" headers = { "Authorization": "Bearer " + api_key, "Content-Type": "application/json", } resp = requests.post( base_url + "/api/v3/contents/generations/tasks", headers=headers, json={ "model": "doubao-seedance-2.0-fast", "content": [{"type": "text", "text": "A cinematic city night drive."}], "ratio": "16:9", "duration": 5, }, ) resp.raise_for_status() print(resp.json()) ``` ### typescript ```typescript const baseURL = "https://video.unigateway.ai"; const resp = await fetch(`${baseURL}/api/v3/contents/generations/tasks`, { method: "POST", headers: { Authorization: `Bearer ${process.env.UNIGATEWAY_API_KEY}`, "Content-Type": "application/json", }, body: JSON.stringify({ model: "doubao-seedance-2.0-fast", content: [{ type: "text", text: "A cinematic city night drive." }], ratio: "16:9", duration: 5, generate_audio: false, }), }); console.log(await resp.json()); ``` # Query Task > Category: Video | Last updated: 2026-05-15 Query asynchronous video generation task results with the Seedance interface. # ByteDance Seedance / Query Task Query a previously created video generation task. 
## Endpoint | Item | Value | |---|---| | Method | `GET` | | Path | `/api/v3/contents/generations/tasks/{id}` | | Base URL | `https://video.unigateway.ai` | | Auth | `Authorization: Bearer $UNIGATEWAY_API_KEY` | ## Query Replace `{id}` with the task ID from the create call: ```bash curl "https://video.unigateway.ai/api/v3/contents/generations/tasks/cgt-20260514135903-68khw" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` ## Response (Success) ```json { "id": "cgt-20260514135903-68khw", "model": "doubao-seedance-2.0-fast", "status": "succeeded", "content": { "video_url": "https:///media/output/cgt-20260514135903-68khw.mp4" }, "usage": { "billing_mode": "credits", "credits": 50 } } ``` ## Response (Failure) ```json { "id": "cgt-20260514135903-68khw", "model": "doubao-seedance-2.0-fast", "status": "failed", "content": {}, "usage": null, "error": { "code": "UPSTREAM_ERROR", "message": "Upstream task failed" } } ``` ## Task States | Status | Meaning | Action | |---|---|---| | `queued` | Accepted, not yet processing | Continue polling | | `running` | Generation in progress | Continue polling | | `succeeded` | Result is ready | Save `content.video_url` | | `failed` | Processing failed | Check `error`; decide whether to resubmit | | `expired` | Task or result expired | Create a new task if needed | | `cancelled` | Task was terminated | Treat as terminal | | `approved_asset_required` | Asset review required | Handle per business workflow | | `content_adjustment_required` | Content must be adjusted | Handle per business workflow | `queued` and `running` are in-progress states. Before a terminal state, `content.video_url` and `usage` may be null. 
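One way to act on these states is a small polling loop that keeps querying while the task is `queued` or `running` and hands every other status back to the caller. This is a sketch, not a definitive client: `poll_task` and the `fetch_task` callable are hypothetical, the default delays and the 600-second timeout are illustrative values, and the demo uses canned responses instead of real HTTP calls.

```python
import time

IN_PROGRESS = {"queued", "running"}


def poll_task(fetch_task, task_id, delays=(3, 5, 10, 15), timeout=600,
              sleep=time.sleep, clock=time.monotonic):
    """Poll until the task leaves the in-progress states or `timeout` elapses.

    `fetch_task(task_id)` is a caller-supplied function that performs the
    GET request and returns the decoded JSON body as a dict.
    """
    deadline = clock() + timeout
    attempt = 0
    while True:
        task = fetch_task(task_id)
        if task.get("status") not in IN_PROGRESS:
            return task  # succeeded, failed, expired, cancelled, or a review state
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {task.get('status')} after {timeout}s")
        # Grow the wait between queries, then stay at the last delay.
        sleep(delays[min(attempt, len(delays) - 1)])
        attempt += 1


# Demo with canned responses and a no-op sleep, so nothing is sent or delayed.
canned = iter([
    {"status": "queued"},
    {"status": "running"},
    {"status": "succeeded", "content": {"video_url": "https://example.invalid/video.mp4"}},
])
result = poll_task(lambda _id: next(canned), "cgt-20260514135903-68khw", sleep=lambda s: None)
print(result["status"])
```

Injecting `sleep` and `clock` keeps the loop testable; in production the defaults apply and `fetch_task` would wrap the query request shown above.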
## Polling Strategy - First query: wait 2-3 seconds after creation - Increase polling interval as wait time grows (3s -> 5s -> 10s -> 15s) - Set a total timeout on the caller side to prevent infinite polling - Avoid high-frequency concurrent queries for the same task ID ## Task List ```bash curl "https://video.unigateway.ai/api/v3/contents/generations/tasks?page_num=1&page_size=20" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` - Returns tasks from the last 7 days - `page_num` defaults to `1`, `page_size` defaults to `20` - Max values: both `500` - Optional filters: `filter.status`, `filter.task_ids` (repeated), `filter.model`, `filter.service_tier` - Supported status values: `queued`, `running`, `cancelled`, `succeeded`, `failed`, `expired`, `approved_asset_required`, `content_adjustment_required` Response: ```json { "total": 1, "items": [ { "id": "cgt-20260514135903-68khw", "status": "succeeded" } ] } ``` ## Delete Task ```bash curl -X DELETE "https://video.unigateway.ai/api/v3/contents/generations/tasks/cgt-20260514135903-68khw" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` - Queued tasks are cancelled and return `{}` - Running tasks cannot be deleted - Already-cancelled tasks cannot be deleted again - Completed tasks return `{}` on delete ## Common Failures | Status | Cause | Action | |---|---|---| | `404` | Task ID incorrect or wrong Base URL | Verify task ID and endpoint | | `429` | Polling too aggressively | Increase backoff; reduce concurrent queries | | `5xx` | Gateway or upstream error | Retry with capped backoff | # Asset Libraries > Category: Video | Last updated: 2026-05-15 Manage Seedance-compatible asset groups and assets for reusable video workflows. # ByteDance Seedance / Asset Libraries Manage reusable reference images and videos for Seedance workflows. Use asset libraries when you need material review, whitelisting, or character/scene reference management. 
For ordinary text-to-video or image-to-video scenarios without review requirements, the video generation API alone is sufficient. ## Endpoint | Item | Value | |---|---| | Base URL | `https://video.unigateway.ai` | | Auth | `Authorization: Bearer $UNIGATEWAY_API_KEY` | ## Available Endpoints | Operation | Method | Path | |---|---|---| | Create asset group | `POST` | `/api/v3/asset-groups` | | List asset groups | `GET` | `/api/v3/asset-groups` | | Get asset group | `GET` | `/api/v3/asset-groups/{groupId}` | | Update asset group | `PATCH` | `/api/v3/asset-groups/{groupId}` | | Create asset | `POST` | `/api/v3/assets` | | List assets | `GET` | `/api/v3/assets` | | Get asset | `GET` | `/api/v3/assets/{assetId}` | | Update asset | `PATCH` | `/api/v3/assets/{assetId}` | | Delete asset | `DELETE` | `/api/v3/assets/{assetId}` | Field names use PascalCase. ## Create Asset Group ```bash curl -X POST "https://video.unigateway.ai/api/v3/asset-groups" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "Name": "character-references", "Description": "Approved Seedance reusable references", "GroupType": "AIGC" }' ``` | Field | Type | Required | Description | |---|---|---|---| | `Name` | string | Yes | Group name | | `Description` | string | No | Group description | | `GroupType` | string | No | Defaults to `AIGC` | Response: ```json { "Id": "group-123" } ``` ## List Asset Groups ```bash curl "https://video.unigateway.ai/api/v3/asset-groups?page_num=1&page_size=50" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` - `page_num` defaults to `1`, `page_size` defaults to `50`, max `200` - Optional filters: `filter.name`, `filter.group_ids`, `filter.group_type` - `filter.group_type` currently only accepts `AIGC` - Optional `project_name` parameter - Response includes `Items`, `TotalCount`, `PageNumber`, `PageSize` ## Create Asset ```bash curl -X POST "https://video.unigateway.ai/api/v3/assets" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" 
\ -H "Content-Type: application/json" \ -d '{ "Name": "hero-reference-01", "AssetType": "Image", "URL": "https:///assets/hero-reference-01.jpg", "GroupId": "group-123" }' ``` | Field | Type | Required | Description | |---|---|---|---| | `Name` | string | Yes | Asset display name | | `URL` | string | Yes | Source file URL | | `GroupId` | string | Yes | Asset group ID | | `AssetType` | string | Recommended | `Image`, `Video`, or `Audio` | Response: ```json { "Id": "asset-001" } ``` ## List Assets ```bash curl "https://video.unigateway.ai/api/v3/assets?page_num=1&page_size=20&filter.group_ids=group-123&filter.group_type=AIGC" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` - `filter.group_ids` required - `filter.group_type` required, currently must be `AIGC` - Optional: `filter.statuses`, `filter.name`, `project_name` - `sort_by`: `created_at`, `updated_at`, `group_id` - `sort_order`: `asc`, `desc` ## Update Asset ```bash curl -X PATCH "https://video.unigateway.ai/api/v3/assets/asset-001" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "Name": "hero-reference-01-updated" }' ``` Currently only updates `Name`. ## Delete Asset ```bash curl -X DELETE "https://video.unigateway.ai/api/v3/assets/asset-001" \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` Response: ```json { "Id": "asset-001" } ``` ## Notes - Asset groups and assets are isolated per account - Use the `Id` returned by create or list endpoints for subsequent operations - Optional `project_name` parameter supported on asset group and asset operations ## Embeddings # Embeddings > Category: Embeddings | Last updated: 2026-05-15 Generate vector embeddings for text inputs with supported embedding models. # Embeddings Generate vector embeddings for text inputs through UniGateway. 
## Endpoint | Item | Value | |---|---| | Method | `POST` | | Path | `/v1/embeddings` | | Base URL | `https://api.unigateway.ai/v1` | | Auth | `Authorization: Bearer $UNIGATEWAY_API_KEY` | | Content-Type | `application/json` | ## Minimal Request ```bash curl https://api.unigateway.ai/v1/embeddings \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "text-embedding-3-small", "input": "An AI gateway unifies access to multiple model providers." }' ``` ## Python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) resp = client.embeddings.create( model="text-embedding-3-small", input=["An AI gateway unifies access to multiple model providers."], ) print(f"Dimensions: {len(resp.data[0].embedding)}") ``` ## TypeScript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const resp = await client.embeddings.create({ model: "text-embedding-3-small", input: ["An AI gateway unifies access to multiple model providers."], }); console.log(`Dimensions: ${resp.data[0].embedding.length}`); ``` ## Parameters | Field | Type | Required | Description | |---|---|---|---| | `model` | string | Yes | Embedding model ID from `GET /v1/models` | | `input` | string/array | Yes | Text to embed; string for single, array for batch | | `dimensions` | number | No | Requested output dimensions (model-dependent) | | `encoding_format` | string | No | `float` (default) or `base64` | ## Response ```json { "object": "list", "model": "text-embedding-3-small", "data": [ { "object": "embedding", "index": 0, "embedding": [0.0123, -0.0456, ...] 
} ], "usage": { "prompt_tokens": 12, "total_tokens": 12 } } ``` | Response field | Description | |---|---| | `data[].embedding` | Float array of embedding values | | `data[].index` | Position in the input batch | | `usage.prompt_tokens` | Input token count | ## Finding Embedding Models Use `GET /v1/models` and filter by `supported_endpoint_types` containing embeddings-related hints. Example model: `text-embedding-3-small`. Always verify the model exists and returns embeddings with a real request before production use. ## LangChain Integration ```python from langchain_openai import OpenAIEmbeddings embeddings = OpenAIEmbeddings( model="text-embedding-3-small", base_url="https://api.unigateway.ai/v1", api_key="", ) result = embeddings.embed_query("Test text") print(f"Dimensions: {len(result)}") ``` ## Common Failures | Status | Cause | Resolution | |---|---|---| | `400` | Invalid model or input | Check `model` ID and `input` type | | `401` | Invalid API key | Verify `Authorization` header | | `404` | Model not in plan | Confirm model via `GET /v1/models` | | `413` | Input too large | Reduce text length or batch size | | `429` | Rate limit exceeded | Add backoff and retry | --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/embeddings \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "text-embedding-3-small", "input": "An AI gateway unifies access to multiple model providers." 
}' ``` ### python ```python from openai import OpenAI client = OpenAI(api_key="", base_url="https://api.unigateway.ai/v1") resp = client.embeddings.create(model="text-embedding-3-small", input=["Test"]) print(len(resp.data[0].embedding)) ``` ### typescript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1" }); const resp = await client.embeddings.create({ model: "text-embedding-3-small", input: ["Test"] }); console.log(resp.data[0].embedding.length); ``` ## Audio # Audio > Category: Audio | Last updated: 2026-05-15 Transcribe audio and translate audio to English with verified OpenAI-compatible audio endpoints. # Audio Speech transcription and translation through OpenAI-compatible audio endpoints. This page covers the verified audio endpoints currently available in the production model catalog. ## Available Endpoints | Task | Method | Path | Model example | |---|---|---|---| | Transcribe audio to text | `POST` | `/v1/audio/transcriptions` | `whisper-1` | | Translate audio to English | `POST` | `/v1/audio/translations` | `whisper-1` | Confirm model availability with `GET /v1/models` before production use. ## Transcription Transcribe audio to text. | Item | Value | |---|---| | Method | `POST` | | Path | `/v1/audio/transcriptions` | | URL | `https://api.unigateway.ai/v1/audio/transcriptions` | | Auth | `Authorization: Bearer $UNIGATEWAY_API_KEY` | | Content-Type | `multipart/form-data` | ### Request ```bash curl https://api.unigateway.ai/v1/audio/transcriptions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -F model="whisper-1" \ -F file=@/path/to/audio.mp3 ``` ### Response ```json { "text": "Hello, this is a test of the audio transcription service." } ``` ### Parameters | Field | Type | Required | Description | |---|---|---|---| | `model` | string | Yes | Model ID, e.g. 
`whisper-1` | | `file` | file | Yes | Audio file to transcribe | | `language` | string | No | Language code, e.g. `en`, `zh` | | `response_format` | string | No | Output format: `text`, `json`, `verbose_json`, `srt`, `vtt`, or `tsv` | | `temperature` | number | No | Sampling temperature, range `0` to `1` | ## Translation Translate audio directly to English. | Item | Value | |---|---| | Method | `POST` | | Path | `/v1/audio/translations` | | URL | `https://api.unigateway.ai/v1/audio/translations` | | Auth | `Authorization: Bearer $UNIGATEWAY_API_KEY` | | Content-Type | `multipart/form-data` | ### Request ```bash curl https://api.unigateway.ai/v1/audio/translations \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -F model="whisper-1" \ -F file=@/path/to/audio.mp3 ``` ### Response ```json { "text": "Hello, this is a test of the audio transcription service." } ``` ### Parameters | Field | Type | Required | Description | |---|---|---|---| | `model` | string | Yes | Model ID, e.g. `whisper-1` | | `file` | file | Yes | Audio file to translate | | `response_format` | string | No | Output format: `text`, `json`, `verbose_json`, `srt`, `vtt`, or `tsv` | | `temperature` | number | No | Sampling temperature, range `0` to `1` | ## Python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) with open("audio.mp3", "rb") as f: transcription = client.audio.transcriptions.create( model="whisper-1", file=f, ) print(transcription.text) with open("audio.mp3", "rb") as f: translation = client.audio.translations.create( model="whisper-1", file=f, ) print(translation.text) ``` ## TypeScript ```typescript import OpenAI from "openai"; import fs from "fs"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const transcription = await client.audio.transcriptions.create({ model: "whisper-1", file: fs.createReadStream("audio.mp3"), }); console.log(transcription.text); const 
translation = await client.audio.translations.create({ model: "whisper-1", file: fs.createReadStream("audio.mp3"), }); console.log(translation.text); ``` ## Common Failures | Status | Cause | Resolution | |---|---|---| | `400` | Invalid file, unsupported format, or invalid parameter | Retry with a short MP3, WAV, M4A, or WebM file | | `401` | Invalid or missing API key | Verify the `Authorization` header | | `404` | Model not available | Confirm `whisper-1` via `GET /v1/models` | | `413` | File too large | Compress or split the audio file | | `429` | Rate limit exceeded | Back off and retry | --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/audio/transcriptions \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \ -F model="whisper-1" \ -F file=@/path/to/audio.mp3 ``` ### python ```python from openai import OpenAI client = OpenAI(api_key="", base_url="https://api.unigateway.ai/v1") with open("audio.mp3", "rb") as f: resp = client.audio.transcriptions.create(model="whisper-1", file=f) print(resp.text) ``` ### typescript ```typescript import OpenAI from "openai"; import fs from "fs"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1" }); const resp = await client.audio.transcriptions.create({ model: "whisper-1", file: fs.createReadStream("audio.mp3") }); console.log(resp.text); ``` ## Models # Models > Category: Models | Last updated: 2026-05-15 List available models, inspect endpoint hints, and copy requestable API model IDs. # Models Query available models. This endpoint is the single source of truth for model IDs. 
- Method: `GET` - Path: `/v1/models` - URL: `https://api.unigateway.ai/v1/models` ## Query ```bash curl https://api.unigateway.ai/v1/models \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` ## Response ```json { "success": true, "object": "list", "data": [ { "id": "gpt-5.4", "object": "model", "created": 1760000000, "owned_by": "custom", "supported_endpoint_types": ["chat_completions"] }, { "id": "gemini-3-pro-image-preview", "object": "model", "created": 1760000000, "owned_by": "vertex-ai", "supported_endpoint_types": ["gemini"] } ] } ``` The `id` field is the model ID used in requests. Important fields: | Field | Description | |---|---| | `id` | Exact request model ID. Copy this value into API requests. | | `owned_by` | Provider or upstream family. | | `supported_endpoint_types` | Endpoint family hints, such as `chat_completions`, `gemini`, `images`, `embeddings`, or `rerank`. | Model availability is account-specific. Use the live response, not external screenshots or stale examples. ## Scenario-Based Selection | Scenario | Recommended family | Note | |---|---|---| | General conversation | GPT / Claude | Prefer stable, non-preview IDs | | Production traffic | GPT / Claude / Gemini mid-tier | Keep request shape conservative | | Low-latency | Faster variants in your account | Validate quality before full traffic | | Multilingual | GPT / Claude / Gemini | Re-test prompts after switching families | | Embeddings / rerank | Endpoint-specific models | Confirm endpoint support first | | Video generation | Separate video surfaces | Do not share models with chat | ## Fallback Chain Fetch the model list at startup or on a short cache interval. Pin IDs by use case and configure cross-family fallback. Example chain: 1. `gpt-5.4` 2. `claude-sonnet-4-6` 3. `gemini-3-pro-preview` Model availability changes over time. Verify capabilities with a real request, not just the model name. 
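The startup step described above can be sketched as follows. This is a minimal illustration rather than a UniGateway feature: the helper names `ids_for_endpoint` and `resolve_chain` are hypothetical, and the sample data only mirrors the shape of the `data` array in the response example. In a real integration the list would come from a live `GET /v1/models` call.

```python
from typing import Iterable, Sequence

# Preferred order, copied from the example chain in this section.
FALLBACK_CHAIN = ["gpt-5.4", "claude-sonnet-4-6", "gemini-3-pro-preview"]


def ids_for_endpoint(models: Iterable[dict], endpoint_type: str) -> list[str]:
    """Return model IDs whose supported_endpoint_types hint includes endpoint_type."""
    return [
        m["id"]
        for m in models
        if endpoint_type in (m.get("supported_endpoint_types") or [])
    ]


def resolve_chain(preferred: Sequence[str], available: Iterable[str]) -> list[str]:
    """Keep only the preferred IDs present in the live catalog, preserving order."""
    live = set(available)
    return [model_id for model_id in preferred if model_id in live]


# Sample payload shaped like the `data` array returned by GET /v1/models (assumed values).
sample_models = [
    {"id": "gpt-5.4", "supported_endpoint_types": ["chat_completions"]},
    {"id": "gemini-3-pro-image-preview", "supported_endpoint_types": ["gemini"]},
]

chat_ids = ids_for_endpoint(sample_models, "chat_completions")
print(resolve_chain(FALLBACK_CHAIN, chat_ids))
```

Filtering the pinned chain against the live list keeps the configured priority order while dropping IDs the account cannot currently request. An empty result is a signal to surface a degraded state instead of sending requests that will return `404`.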
## Display Names vs API IDs The model library can show product names or nicknames for readability. Always use the API `id` field for requests. Example: Nano Banana Pro is the product/display name, while `gemini-3-pro-image-preview` is the requestable API model ID. --- ## Code examples ### curl ```curl curl https://api.unigateway.ai/v1/models \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` ### python ```python from openai import OpenAI client = OpenAI( api_key="", base_url="https://api.unigateway.ai/v1", ) models = client.models.list() for item in models.data: print(item.id) ``` ### typescript ```typescript import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", }); const models = await client.models.list(); for (const item of models.data) { console.log(item.id); } ``` ## Integrations # Error Handling and Retries > Category: Integrations | Last updated: 2026-05-15 Practical retry and idempotency guidance for production integrations. # Error Handling and Retries Practical retry guidance for reliable integrations. ## Principles 1. Separate validation errors from transient errors 2. Retry only when the failure is temporary 3. Cap retries per model, per task, per user action 4. 
Log every attempt for auditability

## HTTP Status Handling

| Status | Meaning | Action |
|---|---|---|
| 400 | Invalid request | Fix payload, do not retry |
| 401 | Invalid or missing auth | Fix key or header |
| 402 | Insufficient balance | Trigger billing workflow, pause retries |
| 403 | Forbidden | Check account permissions |
| 404 | Resource missing | Verify endpoint path and model ID |
| 429 | Rate limited | Retry with exponential backoff and jitter |
| 5xx | Server error | Retry with capped backoff, then switch to fallback model |

## Backoff Policy

| Parameter | Value |
|---|---|
| Initial delay | `300ms` |
| Multiplier | `2x` |
| Max delay | `8s` |
| Max attempts per model | `3` |

## Endpoint-Specific Playbooks

### Chat / Responses

- Retry `429` and `5xx` on the same model with capped backoff
- Switch to the fallback model after retries are exhausted
- Treat streaming and non-streaming requests separately

### Streaming

- After a partial stream has been consumed, a retry is a new request with a new request trace
- If a stream breaks mid-flight, start a new request instead of reusing the old one
- Allow one reconnect, and only if no tokens arrived

### Model Discovery

- Do not retry `404` blindly; refresh `GET /v1/models` first
- Cache model lists briefly and invalidate them on catalog changes

### Async Tasks

- Retry create calls conservatively to avoid duplicate jobs
- Poll existing task IDs before re-submitting
- Use caller-side correlation IDs for idempotency

## Retry Budgets

| Endpoint | Retry budget | After exhausted |
|---|---|---|
| `chat.completions` / `responses` | Up to 3 per model | Switch to next fallback |
| Streaming chat | 1 reconnect (if no tokens arrived) | New request trace |
| `GET /v1/models` | Short burst | Surface degraded state |
| Async task create | 1–2 max | Query before creating another |
| Async task poll | Many polls, low frequency | Stop at your timeout |

## Idempotency

| Scenario | Strategy |
|---|---|
| Non-stream text | Safe to retry if the previous attempt failed before a full response |
| Streaming | Partial stream = consumed output; a retry creates a new trace |
| Async / stateful | Use caller-side idempotency keys |

## Escalation

- `401`, `402`, or `403` rates rising: retries will not fix account state
- One family fails while another is healthy: likely a routing or upstream-specific issue
- Fallback usage spikes unexpectedly: the system may appear healthy while cost drifts

## Observability

Log per attempt:

- Timestamp, model ID, endpoint path, status code
- Retry count, latency, upstream error message
- Fallback position, streaming flag, correlation ID, final outcome

---

## Code examples

### curl

```curl
curl https://api.unigateway.ai/v1/chat/completions \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-5.4","messages":[{"role":"user","content":"retry demo"}]}'
```

### python

```python
from openai import OpenAI

# The SDK retries 429/5xx with exponential backoff; max_retries=2 allows 3 attempts per model.
client = OpenAI(api_key="", base_url="https://api.unigateway.ai/v1", max_retries=2)
```

### typescript
```typescript
import OpenAI from "openai";

// The SDK retries 429/5xx with exponential backoff; maxRetries: 2 allows 3 attempts per model.
const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1", maxRetries: 2 });
```

# OpenAI SDK

> Category: Integrations | Last updated: 2026-05-15

Connect OpenAI-compatible SDKs to UniGateway with a custom base URL and live model discovery.

# OpenAI SDK

Use UniGateway with the official OpenAI SDK by overriding the base URL and API key.

## Prerequisites

- OpenAI SDK installed
- UniGateway API key

## Install

```bash
# Python
pip install openai

# TypeScript
npm install openai
```

For chat completions examples with the OpenAI SDK, see [Quickstart](./quickstart.md). The key change is overriding `base_url` / `baseURL` to `https://api.unigateway.ai/v1`.

Fetch `GET /v1/models` before choosing a model ID. Start with `chat.completions`, and enable streaming or tools only after basic non-streaming requests work.

---

## Code examples

### python

```python
from openai import OpenAI

client = OpenAI(api_key="", base_url="https://api.unigateway.ai/v1")
print(client.models.list())
```

### typescript

```typescript
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.UNIGATEWAY_API_KEY, baseURL: "https://api.unigateway.ai/v1" });
console.log(await client.models.list());
```

# Dify

> Category: Integrations | Last updated: 2026-05-15

Use UniGateway as an OpenAI-compatible upstream in Dify.

# Dify

Add UniGateway as an OpenAI-compatible model provider in Dify.

## Prerequisites

- Dify instance running
- UniGateway API key

## Configuration

In the Dify admin panel, go to **Settings > Model Providers** and add an OpenAI-compatible provider:

| Field | Value |
|---|---|
| Base URL | `https://api.unigateway.ai/v1` |
| API Key | Your UniGateway key |
| Model Name | Exact ID from `GET /v1/models` |

> Do not append a second `/v1` to the base URL. Requests that hit `/v1/v1/...` will fail.

## Verify

Send a short test prompt to confirm the connection works.
```bash curl https://api.unigateway.ai/v1/models \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` ## Capability Mapping | Workload | Recommendation | |---|---| | Chat apps | Standard chat-completions models | | Automation workflows | Models verified for tool use | | Embeddings | `text-embedding-3-small` or equivalent | | Image / audio | Validate separately | ## Common Failures | Problem | Resolution | |---|---| | Provider connects but inference fails | Check exact model ID from `GET /v1/models` | | Requests hit `/v1/v1/...` | Remove duplicate `/v1` from base URL | | Features work inconsistently | Treat each endpoint family as a separate rollout | # OpenWebUI > Category: Integrations | Last updated: 2026-05-15 Use UniGateway as an OpenAI-compatible backend in OpenWebUI. # OpenWebUI Add UniGateway as an OpenAI-compatible connection in OpenWebUI. ## Prerequisites - OpenWebUI instance running - UniGateway API key ## Configuration Create one OpenAI-compatible connection: | Field | Value | |---|---| | Endpoint Base | `https://api.unigateway.ai/v1` | | API Key | Your UniGateway bearer token | | Model IDs | From `GET /v1/models` | ## Verify Send a short prompt to confirm models and chat work. ## Notes | Area | Recommendation | |---|---| | Model refresh | Re-sync after catalog changes | | Shared deployments | Use separate test and production keys | | Advanced features | Verify tools, files, multimodal routes one by one | ## Common Failures | Problem | Resolution | |---|---| | Connection saves but no models appear | Check API key scope and confirm `/v1/models` works | | Chat works, streaming inconsistent | Test streaming as a separate compatibility gate | | Workspaces behave differently | Check for cached config or duplicate connections | # Coding Tools and Agents > Category: Integrations | Last updated: 2026-05-15 Use UniGateway with coding assistants and agent tools that support OpenAI-compatible base URLs. 
# Coding Tools and Agents Coding assistants, CLI tools, and automation workflows can use UniGateway if they support an OpenAI-compatible base URL override. ## Requirements The tool must let you configure: - API key - Base URL - Model ID If a tool only supports vendor-native endpoints with no base URL override, UniGateway cannot be integrated directly. ## Configuration ```bash export OPENAI_API_KEY="$UNIGATEWAY_API_KEY" export OPENAI_API_BASE="https://api.unigateway.ai/v1" ``` Choose a model ID from `GET /v1/models`. ## Validation Flow 1. Send one non-stream request to confirm basic connectivity 2. Test tool calling and multi-step workflows after plain completion works 3. Check how the tool handles retries, streaming, and partial failures 4. Keep a fallback model ready for interactive coding traffic ## Good Fit vs Poor Fit | Good fit | Poor fit | |---|---| | Tools that support OpenAI-compatible chat completions | Tools that hardcode one vendor endpoint | | Systems with configurable environment variables | Systems requiring vendor-native auth only | | Applications with explicit model routing config | Desktop apps with no custom base URL support | ## Notes - Interactive coding traffic is latency-sensitive - Re-test tool use and structured outputs when switching model families - Keep test and production credentials separate # LobeChat > Category: Integrations | Last updated: 2026-05-15 Use UniGateway in LobeChat through the OpenAI-compatible provider path. # LobeChat Configure UniGateway as an OpenAI-compatible provider in LobeChat. ## Prerequisites - LobeChat instance running - UniGateway API key - Model ID from `GET /v1/models` ## Configuration In LobeChat model provider settings: - **Provider**: OpenAI-compatible path - **API key**: your UniGateway key - **Base URL**: `https://api.unigateway.ai/v1` - **Model IDs**: exact values from `GET /v1/models` Confirm whether your deployment expects the `/v1` suffix in the API endpoint URL. 
After normal chat works, enable tools, files, or multimodal features separately. ## Common Failures | Problem | Resolution | |---|---| | Chat loads but empty output | Re-check proxy/base URL, especially `/v1` suffix | | Models do not appear | Verify API key and confirm `GET /v1/models` works | | One model works, another does not | Model availability is account-specific | | Feature parity inconsistent | Validate each endpoint family separately | # n8n > Category: Integrations | Last updated: 2026-05-15 Use UniGateway in n8n through OpenAI-compatible nodes or the HTTP Request node. # n8n Use UniGateway through OpenAI-compatible nodes or the HTTP Request node. ## Prerequisites - n8n instance running - UniGateway API key - Model ID from `GET /v1/models` ## Integration Options | Path | When | |---|---| | OpenAI / Chat OpenAI nodes | Your n8n version supports the settings you need | | HTTP Request node | You need full control over URL, headers, and payload | ## Configuration Start with one non-streaming chat request to confirm the connection. After the first flow works, add embeddings, responses, or image/audio routes. > Keep test and production environment keys separate in n8n credentials. ## Notes | Area | Guidance | |---|---| | Credentials | Separate keys for test and production | | OpenAI node versions | Re-check behavior after n8n upgrades | | HTTP Request fallback | Use when built-in node lacks endpoint behavior | | Error handling | Route `429` and `5xx` to workflow-level backoff | ## Common Failures | Problem | Resolution | |---|---| | Node authenticates but requests fail | Verify endpoint URL, model ID, request family | | Built-in node lacks capability | Use HTTP Request node for that endpoint | | Works in one workflow, not another | Check cached credentials or hardcoded model IDs | # LangChain > Category: Integrations | Last updated: 2026-05-15 Use UniGateway with LangChain through the OpenAI-compatible langchain-openai package. 
# LangChain Use UniGateway through `langchain-openai`. ## Prerequisites - Package: `langchain-openai` - UniGateway API key - Model ID from `GET /v1/models` ## Install ```bash pip install langchain-openai ``` ## Configure ```python from langchain_openai import ChatOpenAI llm = ChatOpenAI( model="gpt-5.4", api_key="", base_url="https://api.unigateway.ai/v1", temperature=0, ) print(llm.invoke("Give me a short deployment checklist.").content) ``` Or via environment variables: ```bash export OPENAI_API_KEY="$UNIGATEWAY_API_KEY" export OPENAI_API_BASE="https://api.unigateway.ai/v1" ``` ## Common Failures | Problem | Resolution | |---|---| | Requests hit wrong host | Re-check `base_url` / `OPENAI_API_BASE` | | One prompt works, another unstable | Test with simpler request shape first | | Multi-step workflows amplify costs | Add per-run budgets and fallback limits | | Structured outputs differ across models | Validate schema-sensitive flows per model | # Cherry Studio > Category: Integrations | Last updated: 2026-05-15 Use UniGateway in Cherry Studio through a custom OpenAI-compatible provider configuration. # Cherry Studio Configure UniGateway through a custom OpenAI-type provider. ## Prerequisites - Cherry Studio installed - UniGateway API key ## Configuration In **Settings > Model Services**: | Step | Action | |---|---| | 1 | Add a new provider, type **OpenAI** | | 2 | **API Key**: your UniGateway key | | 3 | **API Address**: `https://api.unigateway.ai/v1` | | 4 | Add model IDs from `GET /v1/models` | | 5 | Test one normal chat | > Cherry Studio's built-in key check is only a first pass. Run one real request. 
## Verify ```bash curl https://api.unigateway.ai/v1/models \ -H "Authorization: Bearer $UNIGATEWAY_API_KEY" ``` ## Common Failures | Problem | Resolution | |---|---| | Provider saves but requests fail | Check API address includes `/v1` | | Added model does not work | Verify exact model ID from `GET /v1/models` | | UI shows model but runtime fails | UI list shows configured state, not live availability | # Flowise > Category: Integrations | Last updated: 2026-05-15 Use UniGateway in Flowise through ChatOpenAI with a custom base URL. # Flowise Configure UniGateway through the ChatOpenAI node with a custom base URL. ## Prerequisites - Flowise instance running - UniGateway API key - Model ID from `GET /v1/models` ## Configuration 1. Drag a `ChatOpenAI` node into your flow 2. Create a credential with your UniGateway key 3. Open **Additional Parameters** and set base path to `https://api.unigateway.ai/v1` 4. Set model name to an exact ID from `GET /v1/models` 5. Test one non-streaming chat before adding images, tools, or custom models If the standard ChatOpenAI node does not expose your model ID, use `ChatOpenAI Custom`. ## Common Failures | Problem | Resolution | |---|---| | Credential works but inference fails | Re-check base path and model ID | | Standard node cannot select model | Switch to `ChatOpenAI Custom` | | Text works, image upload fails | Validate multimodal support separately | # Continue > Category: Integrations | Last updated: 2026-05-15 Use UniGateway in Continue through the OpenAI provider with a custom apiBase. # Continue Configure UniGateway through the `openai` provider with a custom `apiBase`. ## Prerequisites - Continue extension installed - UniGateway API key - Model ID from `GET /v1/models` ## Configuration ```yaml name: UniGateway version: 0.0.1 schema: v1 models: - name: UniGateway Chat provider: openai model: gpt-5.2 apiBase: https://api.unigateway.ai/v1 apiKey: ``` Keep the `/v1` suffix in `apiBase`. 
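Several integrations in this guide fail when the `/v1` suffix is missing or doubled. A small helper can guard against both cases before the value is written into a tool's config. This is only a sketch: `normalize_base_url` is a hypothetical name, not part of Continue, the OpenAI SDK, or UniGateway.

```python
def normalize_base_url(url: str, suffix: str = "/v1") -> str:
    """Return url with exactly one trailing suffix and no trailing slash."""
    base = url.strip().rstrip("/")
    # Remove every repeated suffix, e.g. ".../v1/v1" becomes "...".
    while base.endswith(suffix):
        base = base[: -len(suffix)].rstrip("/")
    return base + suffix


for candidate in (
    "https://api.unigateway.ai",
    "https://api.unigateway.ai/v1/",
    "https://api.unigateway.ai/v1/v1",
):
    print(normalize_base_url(candidate))
```

All three inputs normalize to `https://api.unigateway.ai/v1`. Some tools append `/v1` themselves, so check the tool's own documentation before applying a helper like this.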
## Validate Send a simple message to confirm the connection. After chat works, enable edit, autocomplete, or other roles as needed. ## Common Failures | Problem | Resolution | |---|---| | Config loads but model fails | Re-check `apiBase`, `apiKey`, and model ID | | Autocomplete worse than chat | Treat each role as a separate rollout | | Streaming or reasoning differs | Verify model behavior with Continue's request mode | # Cline > Category: Integrations | Last updated: 2026-05-15 Use UniGateway in Cline through the OpenAI-compatible provider path. # Cline Configure UniGateway as an OpenAI-compatible provider in Cline. ## Prerequisites - Cline extension installed - UniGateway API key ## Configuration In Cline settings, set: | Field | Value | |---|---| | Base URL | `https://api.unigateway.ai/v1` | | API Key | Your UniGateway key | | Model ID | Exact value from `GET /v1/models` | > Base URL must point to UniGateway, not the official OpenAI endpoint. ## Verify Send a basic coding question to confirm the connection works. ## Notes | Area | Guidance | |---|---| | Base URL | Use UniGateway `/v1` path | | Model IDs | Re-check after catalog changes | | Tool-heavy tasks | Validate tool use separately from plain chat | ## Common Failures | Problem | Resolution | |---|---| | Provider saves but no model works | Confirm base URL, API key, and model ID with `/v1/models` | | One task succeeds, another unstable | Validate tool use and long-context separately | | Latency spikes in multi-step workflows | Prepare a faster fallback model | ## Observability & Billing # Request Logs > Category: Observability & Billing | Last updated: 2026-05-15 Monitor and analyze all API call records — token usage, cost, performance, routing decisions, and request tracing. # Request Logs UniGateway provides a comprehensive logging system to help you monitor and analyze all API call records in real time. 
With request logs, you can review detailed information for each request — including token usage, cost, performance metrics, and more — so you can better optimize your application and control costs. ## Viewing Logs ### Console Logs Page Visit **Logs** in the UniGateway console to view detailed records for all API calls. **Filters:** | Filter | Description | |---|---| | Time range | Select a specific date range to view historical records | | API key | Filter logs by different API keys for multi-project management | | Request ID | Enter a request ID to quickly locate a specific request | | Provider | Filter by provider (e.g., Anthropic, OpenAI, Google) | | Model | Filter by model to find call records for a specific model | | Finish reason | Filter by completion status (e.g., `stop`, `end_turn`, `max_tokens`) | ### Log List Fields | Field | Description | |---|---| | Timestamp | When the request was initiated | | Model | Model name used (e.g., `gpt-5.4`, `claude-sonnet-4-6`) | | Input Tokens | Number of input tokens; click to view detailed token breakdown | | Output Tokens | Number of output tokens | | Cost | Cost of this call (USD) | | Latency | Request latency (ms) | | Throughput | Tokens generated per second (tokens/s) | | Finish | Completion status (e.g., `end_turn`, `tool_use`, `stop`, `max_tokens`, `length`) | ### Token Details Click the number in the Input Tokens column to view a detailed token breakdown: | Token Type | Description | |---|---| | `prompt` | Base input tokens | | `input_cache_read` | Tokens read from cache | | `input_cache_write` | Tokens written to cache | | `input_cache_write_5_min` | 5-minute cache write tokens | | `input_cache_write_1_h` | 1-hour cache write tokens | Use this breakdown to understand cache utilization and optimize your caching strategy. 
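As a rough illustration of reading this breakdown, the sketch below computes the share of input tokens served from cache. It rests on assumptions the console does not confirm: that the counters are available as a plain dict, that `prompt` excludes cached tokens, and that the 5-minute and 1-hour counters are sub-splits of `input_cache_write`. The helper name `cache_read_share` and the sample numbers are hypothetical.

```python
def cache_read_share(tokens: dict) -> float:
    """Fraction of input tokens read from cache (assumes `prompt` excludes cached tokens)."""
    prompt = tokens.get("prompt", 0)
    read = tokens.get("input_cache_read", 0)
    write = tokens.get("input_cache_write", 0)
    total = prompt + read + write
    # Guard against empty rows so the ratio never divides by zero.
    return read / total if total else 0.0


# Made-up numbers for one request; not taken from a real log entry.
sample = {"prompt": 200, "input_cache_read": 600, "input_cache_write": 200}
print(f"Cache read share: {cache_read_share(sample):.0%}")
```

If your account reports `prompt` inclusive of cached tokens, use `prompt` alone as the denominator.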
### Billing Details Hover over the Cost column to view billing details for a specific call: **Pay As You Go:** | Field | Description | |---|---| | Purchased Credits | Funds topped up by the user; used after Reward Credits are exhausted | | Reward Credits | Credits such as top-up bonuses; deducted first | | Status | Settlement status (e.g., `Settled`) | ## Request Details Click **Details** on any log entry to view the full information for that call. The details page is split into two sections. ### Conversation Content (Left Panel) The left panel displays the complete request and response content: | Section | Includes | |---|---| | User message | The input sent by the user | | System message | System prompt (if any) | | Assistant message | The response generated by the model | | Tool calls | Tool inputs and outputs (if tool calling is used) | **Display modes:** | Mode | Best For | |---|---| | Pretty | Reviewing conversation quality and interaction flow | | JSON | Debugging API integration or troubleshooting technical issues | In JSON mode, switch between data sources to inspect request/response details at different stages: | Source | Description | |---|---| | User → UniGateway | The original request sent by the user | | UniGateway → Origin | The request UniGateway forwarded to the upstream provider | | Origin → UniGateway | The original response from the upstream provider | | UniGateway → User | The response returned by UniGateway to the user | ### Metrics and Metadata (Right Panel) The right panel shows detailed technical metrics and metadata. 
**Model Information:**

| Field | Description |
|---|---|
| Model | Model name used |
| Provider | Model provider |

**Performance Metrics:**

| Metric | Description |
|---|---|
| First Token Latency (ms) | Time from sending the request to receiving the first token |
| Generation Time (ms) | Time to generate the full response |
| Throughput (tps) | Token generation rate (tokens per second) |

**Raw Metadata:** View the full request metadata in JSON format with one-click copy support.

## Using the `X-UniGateway-RequestId`

Every API response includes an `X-UniGateway-RequestId` header. Use this ID to:

1. **Search logs** — Enter the request ID in the Logs page filter to find a specific call
2. **Debug errors** — Provide this ID to support for request tracing
3. **Correlate with your own logs** — Store this ID alongside your application logs for end-to-end tracing

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["UNIGATEWAY_API_KEY"],
    base_url="https://api.unigateway.ai/v1",
)

# with_raw_response exposes the HTTP headers alongside the parsed body.
raw = client.chat.completions.with_raw_response.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Hello"}],
)
request_id = raw.headers.get("X-UniGateway-RequestId")
response = raw.parse()  # the usual ChatCompletion object
print(f"Request ID: {request_id}")
```

## Log Retention

| Plan | Retention Period |
|---|---|
| Pay As You Go | 30 days |

> Logs older than the retention period are automatically deleted. Export important logs before they expire.

## Best Practices

1. **Monitor model routing** — Check if the `model` field in logs differs from your requested model
2. **Identify cost outliers** — Sort by cost to find unexpectedly expensive calls
3. **Analyze latency patterns** — Filter by high latency to identify performance bottlenecks
4. **Audit API key usage** — Filter by API key to ensure keys are used only for their intended purpose

## FAQ

**Q: How long does it take for logs to appear?**
A: Logs are typically available within a few seconds of request completion.
During high-traffic periods, there may be a short delay.

**Q: Can I export logs?**
A: Yes. Use the export button on the Logs page to download logs as CSV for the selected time range.

**Q: Can I access logs programmatically?**
A: You can retrieve generation details via the Platform Management API using the `X-UniGateway-RequestId`.

**Q: Why do some logs show a different model than the one I requested?**
A: This happens when the platform selected a different model than the one you requested. The `model` field in the log shows the model that actually served the call.

# Cost Analytics

> Category: Observability & Billing | Last updated: 2026-05-15

Monitor and analyze API spending by model, API key, provider, and billing item type with optimization strategies.

# Cost Analytics

UniGateway provides cost analysis capabilities to help you monitor and analyze API usage expenses in real time. On the cost analytics page, you can view spending broken down by model, API key, service provider, and more, so you can optimize your usage strategy and control expenses.

## Cost Overview

At the top of the cost analytics page, you'll find key cost metric cards that provide a quick overview of overall spending for the current time period.
**Filter options:**

| Filter | Description |
|---|---|
| Time granularity | View data by Month, Day, or Hour |
| Time range | Defaults to the most recent period in UTC; manually select specific dates or ranges |
| API keys | Filter costs generated by specific API keys |
| Model | Analyze all models or specify a particular model (e.g., `gpt-5.4`) |

### Key Metrics

| Metric | Description |
|---|---|
| Total Cost | Total spending amount, covering all input, output, and other related costs |
| Input Cost | Costs generated by input tokens |
| Output Cost | Costs generated by output tokens |
| Other Cost | Additional non-token-related costs (e.g., latency compensation, system resource consumption) |
| Average Cost Per Request | Average cost per request |
| Average Cost Per Million Tokens | Average cost per million tokens |

## Multi-Dimensional Analysis

### Cost by Model

View consumption distribution across models in chart form, helping identify which models are the primary cost drivers.

> Tip: Review this section regularly to adjust usage frequency for high-cost, low-efficiency models.

### Cost Breakdown by Type

View cost allocation across different types:

| Type | Description |
|---|---|
| Input tokens | Cost from processing input prompts |
| Output tokens | Cost from model-generated output |
| Other tokens | Additional costs (e.g., web search, image processing) |

### Cost by API Key

View the consumption amount for each API key, facilitating cost attribution in team scenarios with multiple accounts or projects.

### Cost by Provider

Switch to the **Provider** tab to view detailed cost breakdowns for each service provider:

| View | Description |
|---|---|
| Cost by Provider | Total spending for different providers (e.g., OpenAI, Anthropic, Google) |
| Provider Details | Call details for all models under that provider: model name, call count, input/output token counts, cost per call, and total cost |

## Cost Optimization Strategies

### 1. Use Prompt Caching

For repeated prompts with consistent prefixes, prompt caching can significantly reduce input token costs:

| Strategy | Savings |
|---|---|
| Use consistent system prompts | Cache reads are cheaper than full prompt processing |
| Keep conversation prefixes stable | Cache persists across related requests |
| Choose the right cache TTL | 5-min for short sessions, 1-hour for long-running workflows |

### 2. Set Budget Alerts

Configure spending alerts in the console to receive notifications when costs exceed thresholds.

### 3. Monitor High-Cost Patterns

Regularly review the cost by model and cost by API key views to identify unexpected spending patterns early.

## FAQ

**Q: How often is cost data updated?**
A: Cost data is updated in near real-time. There may be a brief delay during high-traffic periods.

**Q: Can I set spending limits?**
A: Yes. Configure budget alerts in the console to receive notifications when spending reaches specified thresholds.

**Q: How do I understand per-call billing details?**
A: View the billing details popover in [Request Logs](./request-logs.md) by hovering over the Cost column, or review the detailed rate breakdown for any individual call.

# Usage Analytics

> Category: Observability & Billing | Last updated: 2026-05-15

Track token consumption, request volume, provider performance, and model efficiency for optimization.

# Usage Analytics

UniGateway provides comprehensive usage analytics to help you monitor and analyze API calls, service provider performance, and model efficiency in real time. Usage analytics gives you insight into key metrics such as token consumption, API request volume, and response times, so you can optimize application performance and control costs.

## Usage Overview

In the **Usage** tab, you can view overall resource consumption, including token usage and API request counts.
**Filter options:**

| Filter | Description |
|---|---|
| Time range | Filter data by granularity: Month, Week, or Day |
| API keys | Filter by specific API keys (All Keys or individual keys) |
| Model | Select All Models or specify a particular model for analysis |

### Key Metrics

| Metric | Description |
|---|---|
| Total Token Usage | Total token usage across all models (input + output) |
| Input Token Usage | Total number of input tokens across all requests |
| Output Token Usage | Total number of output tokens across all responses |
| Total API Requests | Total number of API calls within the specified time period |

## Multi-Dimensional Analysis

### Usage by Model

View token usage distribution across different models in charts or tables, helping identify high-consumption models.

Use this view to:

- Identify which models consume the most tokens
- Compare input vs. output token ratios per model
- Evaluate whether high-consumption models are delivering proportional value

### Usage by Token Type

Separately track input and output token usage, making it easier to evaluate the cost structure of requests and responses.

| Token Type | Description |
|---|---|
| Input tokens | Total tokens sent in prompts (including system, user, and assistant messages) |
| Output tokens | Total tokens in model-generated responses |
| Cache read tokens | Tokens served from cache (lower cost) |
| Cache write tokens | Tokens written to cache for future reuse |

### Usage by API Key

View token and request usage across different API keys, suitable for usage isolation and auditing in multi-user or multi-project scenarios.

### Web Search Usage

View token consumption and call counts for requests with web search enabled, helping assess the frequency and cost of enhanced retrieval features.

## Provider Analytics

Switch to the **Provider** tab to view the performance of different AI service providers.
### Key Metrics

| Metric | Description |
|---|---|
| Primary Provider | The primary AI service provider currently in use |
| Provider Count | Total number of service providers used |
| Average Success Rate | Average success rate across all requests, reflecting service reliability |
| Fastest Response Provider | The service provider with the shortest response time |

### Analysis Dimensions

| Dimension | Description |
|---|---|
| Token Distribution by Provider | Token usage distribution across providers, evaluating resource allocation efficiency |
| Request Distribution by Provider | API request count distribution by provider, reflecting the call load for each provider |

## Performance Analytics

In the **Performance** tab, view performance metrics for API calls to evaluate model response efficiency and service quality.

### Key Metrics

| Metric | Description |
|---|---|
| Average Latency | Average response latency in milliseconds; lower values indicate faster responses |
| Average Throughput | Average throughput in tokens per second, reflecting processing capacity per unit time |
| Fastest / Slowest Model | Fastest and slowest model response records, helping identify performance bottlenecks |
| Highest / Lowest Throughput | Models with the highest and lowest throughput, assisting with load balancing optimization |

### First Token Latency by Model

View the latency for generating the first token across different models.

> First token latency is a critical user experience metric; lower values indicate more responsive performance. This is especially important for streaming applications.

### Throughput by Model

View throughput (tokens per second) across different models. Higher values indicate better performance.
Use this view to:

- Compare processing efficiency across models within a given timeframe
- Select high-throughput models to improve overall system responsiveness
- Identify models with unexpectedly low throughput for investigation

## Using Analytics for Optimization

### Reduce Latency

1. Identify high-latency models in the **Performance** tab
2. Consider switching to faster model variants (e.g., `gpt-5.4-nano` instead of `gpt-5.4`)
3. Enable streaming for interactive applications; see [Streaming](../streaming.md)

### Improve Throughput

1. Identify low-throughput models in the Performance tab
2. Reduce prompt length where possible to decrease processing time

### Balance Cost and Performance

1. Use the Usage tab to identify high-consumption models
2. Cross-reference with the Cost tab to understand spending efficiency

## FAQ

**Q: How often is usage data updated?**
A: Usage data is updated in near real-time. There may be a brief delay during high-traffic periods.

**Q: Can I export usage data?**
A: Yes. Use the export button on the Usage Analytics page to download data as CSV for the selected time range.

**Q: How do I track per-API-key usage for team billing?**
A: Create separate API keys for each team or project, then filter by API key in the usage analytics page. See [Account & API Keys](../account-and-api-keys.md) for key management best practices.

# Model Pricing

> Category: Observability & Billing | Last updated: 2026-05-15

Transparent billing system: pricing per model and provider, billing items, cost optimization strategies, and FAQ.

# Model Pricing

UniGateway uses a transparent billing system to ensure every call is precisely metered and billed. Pricing differs across models, and the same model may be priced differently across providers.

## Viewing Prices

### Model Detail Pages

View pricing for each provider on the model detail page in the UniGateway console.
Every provider lists its billing rates in detail, including costs for input tokens, output tokens, and special features. For models with tiered pricing, rates are displayed by usage tier to help you understand costs at different consumption levels.

### Models API

Retrieve pricing information programmatically via the Models API:

```bash
curl https://api.unigateway.ai/v1/models \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["UNIGATEWAY_API_KEY"],
    base_url="https://api.unigateway.ai/v1",
)

# List every model visible to this API key.
models = client.models.list()
for model in models.data:
    print(f"{model.id}")
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.UNIGATEWAY_API_KEY,
  baseURL: "https://api.unigateway.ai/v1",
});

const models = await client.models.list();
for (const model of models.data) {
  console.log(model.id);
}
```

## Billing Items

UniGateway bills across the following item types:

| Billing Item | Code | Description |
|---|---|---|
| Input tokens | `prompt` | Cost for processing input prompts |
| Output tokens | `completion` | Cost for model-generated output |
| Image processing | `image` | Cost for image processing or generation |
| Base request fee | `request` | Base fee for API requests |
| Web search | `web_search` | Cost for invoking web search functionality |
| Cache read | `input_cache_read` | Cost for cache read operations |
| Cache write | `input_cache_write` | Cost for cache write operations |
| Cache write (5 min) | `input_cache_write_5_min` | Cost for 5-minute cache write operations |
| Cache write (1 hour) | `input_cache_write_1_h` | Cost for 1-hour cache write operations |
| Internal reasoning | `internal_reasoning` | Cost for internal reasoning computations |

> Every call is metered and billed accurately. View per-call cost details and a detailed rate breakdown in the [Request Logs](./request-logs.md).
## Pricing Factors

Several factors influence the final cost of an API call:

| Factor | Impact |
|---|---|
| Model | Different models have different per-token rates |
| Provider | The same model may have different rates across providers |
| Token type | Input tokens and output tokens are priced separately |
| Cache usage | Cache reads are significantly cheaper than full prompt processing |
| Special features | Web search, image processing, and reasoning carry additional costs |
| Tiered pricing | Some models offer volume discounts at higher usage levels |

## Cost Optimization

### Prompt Caching

Cache-aware pricing can significantly reduce costs for repeated prompt patterns:

| Item | Typical Savings vs. Prompt |
|---|---|
| `input_cache_read` | ~90% cheaper than full `prompt` processing |
| `input_cache_write` | Slightly more expensive than prompt (write cost), but saves on subsequent reads |
| `input_cache_write_5_min` | Lower write cost, shorter cache TTL |
| `input_cache_write_1_h` | Higher write cost, longer cache TTL |

### Model Selection

| Strategy | How |
|---|---|
| Use smaller models for simple tasks | Choose `gpt-5.4-nano` over `gpt-5.4` for straightforward prompts |

### Token Management

| Strategy | Impact |
|---|---|
| Reduce prompt length | Fewer input tokens = lower cost |
| Limit `max_tokens` | Prevent unexpectedly long (and expensive) outputs |
| Summarize conversation history | Reduce context window usage in multi-turn conversations |
| Use system prompts efficiently | Keep system prompts concise and cache-friendly |

## Understanding Your Bill

### Pay As You Go

Charges are deducted from your account balance per token, per call. The cost for each call is calculated as:

```
Total Cost = (Input Tokens × Input Rate) + (Output Tokens × Output Rate) + Special Features Cost
```

View real-time spending in [Cost Analytics](./cost-analytics.md) and per-call details in [Request Logs](./request-logs.md).
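To make the formula above concrete, here is a small worked sketch. The rates are invented for illustration only and are not real UniGateway prices; the function name `estimate_call_cost` is hypothetical, and quoting rates per million tokens is an assumption made to keep the numbers readable. Check the model detail page for actual rates and units.

```python
def estimate_call_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: float,
    output_rate_per_million: float,
    special_features_cost: float = 0.0,
) -> float:
    """Apply: (input x input rate) + (output x output rate) + special features."""
    input_cost = input_tokens / 1_000_000 * input_rate_per_million
    output_cost = output_tokens / 1_000_000 * output_rate_per_million
    return input_cost + output_cost + special_features_cost


# Hypothetical rates: $2.00 per million input tokens, $8.00 per million
# output tokens, plus a made-up $0.01 charge for one web search.
cost = estimate_call_cost(
    input_tokens=12_000,
    output_tokens=1_500,
    input_rate_per_million=2.00,
    output_rate_per_million=8.00,
    special_features_cost=0.01,
)
print(f"Estimated cost: ${cost:.4f}")
```

With these made-up numbers the input side contributes $0.024, the output side $0.012, and the special feature $0.01. Cache reads and writes would add further terms at their own rates, which this sketch leaves out.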
## FAQ

**Q: Why does the same model cost different amounts across providers?**
A: Different providers may offer different pricing for the same model. Compare rates in the model pricing page.

**Q: How can I estimate costs before making requests?**
A: Check the model detail page in the console for per-token rates. For a rough estimate, multiply your expected input/output token counts by the respective rates.

**Q: Are cache reads really cheaper?**
A: Yes. Cache reads (`input_cache_read`) are typically ~90% cheaper than full prompt processing. See the model detail page for exact rates.

**Q: Where can I see the exact cost of each call?**
A: View per-call billing details in [Request Logs](./request-logs.md) by hovering over the Cost column, or review detailed rate breakdowns in the request details page.

---

## Code examples

### curl

```bash
curl https://api.unigateway.ai/v1/models \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

### python

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["UNIGATEWAY_API_KEY"],
    base_url="https://api.unigateway.ai/v1",
)

models = client.models.list()
for model in models.data:
    print(model.id)
```

### typescript

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.UNIGATEWAY_API_KEY,
  baseURL: "https://api.unigateway.ai/v1",
});

const models = await client.models.list();
for (const model of models.data) {
  console.log(model.id);
}
```

## Reference

# Endpoint Compatibility Matrix

> Category: Reference | Last updated: 2026-05-15

Confirmed UniGateway supported and unavailable endpoints, with concrete Gemini paths and known gaps.

# Endpoint Compatibility Matrix

Main UniGateway endpoint families.

> Availability may vary by account plan, region, or feature rollout stage.
## Supported Endpoints

Base root: `https://api.unigateway.ai`

| Method | Path | Purpose |
|---|---|---|
| `GET` | `/v1/models` | list available models |
| `POST` | `/v1/chat/completions` | chat and text generation |
| `POST` | `/v1/completions` | legacy completions |
| `POST` | `/v1/responses` | enhanced conversation interface |
| `POST` | `/v1/responses/compact` | compact responses |
| `POST` | `/v1/embeddings` | vector embeddings |
| `POST` | `/v1/moderations` | content moderation |
| `POST` | `/v1/images/generations` | image generation |
| `POST` | `/v1/images/edits` | image editing |
| `POST` | `/v1/audio/transcriptions` | audio transcription |
| `POST` | `/v1/audio/translations` | audio translation |
| `GET` | `/v1/realtime` | realtime WebSocket |
| `POST` | `/v1/videos` | Sora video generation |
| `GET` | `/v1/videos/{id}` | Sora video status |
| `GET` | `/v1/videos/{id}/content` | Sora video download |
| `POST` | `/v1/messages` | Anthropic-compatible requests |
| `GET` | `/v1beta/models` | Gemini-style model discovery |
| `POST` | `/v1beta/models/{model}:generateContent` | Gemini text, multimodal, and image generation |
| `POST` | `/v1beta/models/{model}:streamGenerateContent?alt=sse` | Gemini SSE streaming |

Gemini-compatible paths use UniGateway Bearer Token authentication. Do not send `x-goog-api-key` or a `key=` query parameter.
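The authentication rule for Gemini-compatible paths can be sketched with the Python standard library as follows. The helper names `build_gemini_request` and `send` are hypothetical, and the request body follows the public Gemini `generateContent` shape (`contents` → `parts` → `text`) on the assumption that UniGateway accepts it unchanged; verify with a real request in your environment.

```python
import json
import urllib.request

BASE_ROOT = "https://api.unigateway.ai"


def build_gemini_request(model: str, text: str, api_key: str) -> urllib.request.Request:
    """Build a generateContent request that authenticates with a Bearer token.

    No x-goog-api-key header and no key= query parameter are added.
    """
    url = f"{BASE_ROOT}/v1beta/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": text}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)
```

To try it, build a request with a model such as `gemini-3-pro-preview` and the key from your `UNIGATEWAY_API_KEY` environment variable, then pass it to `send`.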
## Choose the Right Endpoint

| Goal | Endpoint | Model example |
|---|---|---|
| OpenAI-compatible chat | `/v1/chat/completions` | `gpt-5.4` |
| Claude-native messages | `/v1/messages` | `claude-sonnet-4-6` |
| Gemini-native generation | `/v1beta/models/{model}:generateContent` | `gemini-3-pro-preview` |
| OpenAI-compatible images | `/v1/images/generations` | `gpt-image-2` |
| Gemini image generation | `/v1beta/models/{model}:generateContent` | `gemini-3-pro-image-preview` |
| Sora video generation | `/v1/videos` | `sora-2` |
| Seedance video generation | `/api/v3/contents/generations/tasks` (video.unigateway.ai) | `doubao-seedance-2.0-fast` |

## Unavailable Routes

- `POST /v1/images/variations`
- `POST /v1/audio/speech`
- `GET/POST /v1/files` and related file operations
- `POST/GET /v1/fine-tunes` and related fine-tune operations
- `DELETE /v1/models/:model`

Start integration with `GET /v1/models` and `POST /v1/chat/completions`. Add other endpoints after confirming they are enabled in your environment.

Use `supported_endpoint_types` from `GET /v1/models` as a hint, then verify with a real request before production traffic.

---

## Code examples

### curl

```bash
curl https://api.unigateway.ai/v1/models \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY"
```

### python

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["UNIGATEWAY_API_KEY"],
    base_url="https://api.unigateway.ai/v1",
)

print(client.models.list())
```

### typescript

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.UNIGATEWAY_API_KEY,
  baseURL: "https://api.unigateway.ai/v1",
});

console.log(await client.models.list());
```

# Error Codes

> Category: Reference | Last updated: 2026-05-15

Troubleshooting manual organized by problem category: auth, billing, model, params, rate limits, upstream.

# Error Codes

Complete reference for all UniGateway API error responses, with causes and resolution steps.
## Error Response Format

All errors follow a unified JSON structure:

```json
{
  "error": {
    "code": "429",
    "type": "rate_limit",
    "message": "Rate limit exceeded"
  }
}
```

| Field | Type | Description |
|---|---|---|
| `code` | string | HTTP status code (as a string) |
| `type` | string | Error category; use this for programmatic logic |
| `message` | string | Human-readable description; for display only, may change |

## Error Quick Reference

| HTTP Status | Error Type | Description |
|---|---|---|
| 400 | `invalid_params` | Invalid request parameters |
| 401 | `auth_invalid` | Invalid or missing API key |
| 402 | `insufficient_credit` | Account overdue |
| 402 | `reject_no_credit` | Insufficient balance for this model |
| 403 | `access_denied` | Insufficient permissions |
| 403 | `safety_check_failed` | Upstream safety policy triggered |
| 404 | `model_not_available` | Model not in current plan |
| 404 | `invalid_model` | Model does not exist |
| 404 | `model_not_supported` | Model does not support this API |
| 413 | `prompt_too_long` | Request exceeds max context length |
| 422 | `provider_unprocessable` | Upstream cannot process request |
| 429 | `rate_limit` | Rate limit exceeded |
| 500 | `internal_server_error` | Platform internal error |
| 500 | `provider_api_error` | Upstream provider API error |
| 500 | `provider_error` | Upstream provider service error |
| 502 | `no_provider_available` | No upstream provider available |
| 503 | `service_unavailable` | Service temporarily unavailable |
| 524 | `provider_timeout` | Upstream provider timed out |

## Authentication Errors

### 401 — `auth_invalid`

```json
{
  "error": {
    "code": "401",
    "type": "auth_invalid",
    "message": "Invalid API key provided"
  }
}
```

**Cause:** The API key is missing, invalid, or expired.

**Resolution:**

1. Verify the `Authorization: Bearer` header is set correctly
2. Log in to the console to confirm the key has not been revoked
3. Ensure there are no extra spaces or line breaks in the key value
4. If using an environment variable, confirm it is set and exported

## Authorization Errors

### 403 — `access_denied`

**Cause:** The API key does not have permission to access the requested resource.

**Resolution:**

1. Check whether you are using a standard key for a management endpoint (management key required)
2. Verify your account has access to the requested model or feature

### 403 — `safety_check_failed`

**Cause:** The input content was blocked by an upstream safety policy.

**Resolution:**

1. Adjust the prompt, images, or tool input content
2. Remove descriptions that may trigger safety rules
3. Switch to a different model with different safety thresholds

## Billing Errors

### 402 — `insufficient_credit`

**Cause:** The account balance is negative (overdue).

**Resolution:**

1. Go to **Settings → Billing** to top up your balance

### 402 — `reject_no_credit`

**Cause:** The account balance is zero or too low, and the requested model requires a positive balance.

**Resolution:**

1. Top up your account balance
2. Switch to a model or plan that allows low-balance access

## Parameter Validation Errors

### 400 — `invalid_params`

Returned when request parameters fail validation.
Specific sub-cases:

| Error Message | Cause | Solution |
|---|---|---|
| `Parameter model is required` | Missing or empty `model` field | Add a valid `model` value |
| `Model {model} is not valid` | Model identifier does not exist | Check `GET /v1/models` for valid names |
| `Parameter messages is required` | Missing `messages` field | Include a `messages` array |
| `Parameter messages can not be empty` | `messages` is an empty array | Provide at least one message |
| `Parameter messages can not contain null elements` | `messages` contains null entries | Remove null items |
| `Parameter messages can not contain elements without content` | Message missing `content` field | Ensure every message has content |
| `Parameter stream_options is not supported while stream is false` | `stream_options` set without `stream: true` | Set `stream: true` or remove `stream_options` |
| `Parameter n greater than 1 is not supported` | `n` value exceeds 1 | Set `n` to 1 or omit it |
| `Parameter temperature is not valid` | Negative `temperature` value | Use a value >= 0 |

### 413 — `prompt_too_long`

**Cause:** The request body (including all messages) exceeds the model's maximum context length.

**Resolution:**

1. Reduce the number or length of messages
2. Use a model that supports a larger context window
3. Summarize earlier conversation turns instead of including full history

### 422 — `provider_unprocessable`

**Cause:** The request passed platform validation but the upstream provider could not process it. Typically occurs when the upstream has stricter requirements for field structure, tools, or multimodal inputs.

**Resolution:**

1. Review the specific description in the `message` field
2. Check that all parameters are compatible with the selected model
3. Remove optional/advanced parameters and retry with a minimal request

## Model Availability Errors

### 404 — `model_not_available`

**Cause:** The model exists but is not currently accessible for your account.

**Resolution:**

1. Upgrade to a plan that includes the target model

### 404 — `invalid_model`

**Cause:** The specified model identifier does not match any known model.

**Resolution:**

1. Check for typos in the model name
2. Refer to `GET /v1/models` for valid identifiers

### 404 — `model_not_supported`

**Cause:** The model exists but does not support the API endpoint being called.

**Resolution:**

1. Check the model's detail page to confirm which APIs it supports
2. Switch to a model that supports the current API

## Rate Limiting Errors

### 429 — `rate_limit`

**Cause:** The request rate has exceeded the allowed limit. This may originate from platform-level or upstream provider rate limiting.

**Resolution:**

1. Reduce request frequency
2. Add delays between requests or use exponential backoff
3. Spread peak traffic more evenly over time

## Server and Upstream Errors

### 500 — `internal_server_error`

**Cause:** An unexpected platform-side error occurred.

**Resolution:**

1. Retry with exponential backoff; most 500 errors are transient
2. If the issue persists, contact support (via the console or support email) with the `X-UniGateway-RequestId` from response headers

### 500 — `provider_error` / `provider_api_error`

**Cause:** The request reached the upstream provider, and the issue originated on the provider's side.

**Resolution:**

1. Retry with exponential backoff
2. If persistent, contact support with `X-UniGateway-RequestId`

### 502 — `no_provider_available`

**Cause:** No upstream provider is currently available to handle the request. Possible reasons:

- The specified provider does not exist
- All configured upstream providers for this model are experiencing failures
- All providers failed after retries

**Resolution:**

1. Relax or remove provider-specific routing rules
2. Wait briefly, then retry the request

### 503 — `service_unavailable`

**Cause:** The platform is temporarily unavailable, typically during maintenance or capacity events.

**Resolution:**

1. Retry after a brief delay
2. Check the status page for ongoing incidents

### 524 — `provider_timeout`

**Cause:** The upstream provider did not respond within the timeout window.

**Resolution:**

1. Retry with exponential backoff
2. Consider using a faster model variant
3. Reduce prompt length if the model is approaching its processing limit

## Streaming-Specific Error Handling

When using `stream: true`, errors manifest differently than in non-streaming requests.

### Errors During Streaming

If an error occurs after streaming has started, the stream will terminate or emit a failure event within the stream. Clients should handle:

- HTTP status code evaluation before stream consumption
- Event stream completeness checks (look for `data: [DONE]`)
- In-stream error event parsing
- Connection interruption handling

### SSE Keep-Alive

During long-running requests, you may receive SSE comments:

```
: UNIGATEWAY PROCESSING
```

This is **not an error**; it is a keep-alive signal sent periodically to prevent connection timeouts. Per the SSE specification, your client parser should ignore lines starting with `:`.

### Stream Disconnection

If the client disconnects voluntarily (e.g., cancels the request), the server cleans up the in-progress stream. Client-initiated disconnections do not produce error responses.

## Error Handling Best Practices

1. **Always handle errors by the `type` field** — Use `type` for branching logic; `message` is for display only and may change
2. **Implement retry logic for retryable errors:**

   | Category | Status Codes |
   |---|---|
   | Retryable | 429, 500, 502, 503, 524 |
   | Retryable after fix | 402 |
   | Non-retryable | 400, 401, 403, 404, 413, 422 |

3. **Save `X-UniGateway-RequestId`** — Include this ID when contacting support to expedite resolution
4. **Handle streaming responses gracefully** — Always handle incomplete responses and connection interruptions
5. **Use exponential backoff for retries** — Gradually increase the interval (e.g., 1s → 2s → 4s → 8s), with random jitter to avoid the "thundering herd" effect

### Example Retry Logic (Python)

```python
import os
import random
import time

import openai
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["UNIGATEWAY_API_KEY"],
    base_url="https://api.unigateway.ai/v1",
)

# Status codes listed as retryable in the table above.
RETRYABLE_STATUS = {429, 500, 502, 503, 524}
max_retries = 3

for attempt in range(max_retries):
    try:
        resp = client.chat.completions.create(
            model="gpt-5.4",
            messages=[{"role": "user", "content": "Hello"}],
        )
        print(resp.choices[0].message.content)
        break
    except (openai.APIStatusError, openai.APIConnectionError) as e:
        # Connection errors carry no status code and are treated as retryable.
        status = getattr(e, "status_code", None)
        retryable = status is None or status in RETRYABLE_STATUS
        # Re-raise non-retryable errors (400, 401, 403, ...) and the final failure.
        if not retryable or attempt == max_retries - 1:
            raise
        # Exponential backoff (1s, 2s, 4s, capped at 8s) plus random jitter.
        delay = min(1.0 * (2 ** attempt), 8.0) + random.uniform(0, 0.5)
        time.sleep(delay)
```

## FAQ

**Q: The same request works for some models but returns 404 for others.**
A: Different plans include different model lists. Upgrade your plan for full model access.

**Q: I occasionally see 500 errors. Should I be concerned?**
A: Occasional 500 errors are normal in distributed systems. Implement automatic retry with exponential backoff. Contact support if the error rate stays consistently high.

**Q: What does `UNIGATEWAY PROCESSING` mean during a streaming request?**
A: This is an SSE keep-alive comment (prefixed with `:`), sent periodically to indicate the request is still being processed. It is not an error; your SSE client should ignore comment lines per the specification.

**Q: How can I tell which upstream provider handled my request?**
A: Provide the `X-UniGateway-RequestId` from the response headers to the support team for request tracing.
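The advice to branch on the `type` field rather than on `message` can be sketched as a small classifier. The groupings below are derived by matching the error types in the quick reference table to the retry categories listed by status code; the function names and the choice to treat unknown types as non-retryable are assumptions for illustration, not documented UniGateway behavior.

```python
import json
import random

# Error types whose HTTP status is listed as retryable (429, 500, 502, 503, 524).
RETRYABLE_TYPES = {
    "rate_limit",
    "internal_server_error",
    "provider_api_error",
    "provider_error",
    "no_provider_available",
    "service_unavailable",
    "provider_timeout",
}
# Error types that can succeed once the account balance is fixed (402).
FIX_THEN_RETRY_TYPES = {"insufficient_credit", "reject_no_credit"}


def parse_error(body: str) -> dict:
    """Extract the `error` object from an error response body."""
    return json.loads(body)["error"]


def retry_policy(error_type: str) -> str:
    """Map an error type to 'retry', 'retry_after_fix', or 'do_not_retry'."""
    if error_type in RETRYABLE_TYPES:
        return "retry"
    if error_type in FIX_THEN_RETRY_TYPES:
        return "retry_after_fix"
    # Assumption: unknown or unlisted types are treated as non-retryable.
    return "do_not_retry"


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 8.0) -> float:
    """Exponential backoff (1s, 2s, 4s, 8s cap) plus up to 0.25s of jitter."""
    return min(base * 2 ** attempt, cap) + random.uniform(0, 0.25)


body = '{"error": {"code": "429", "type": "rate_limit", "message": "Rate limit exceeded"}}'
err = parse_error(body)
print(err["type"], retry_policy(err["type"]))
```

Keeping the mapping keyed on `type` means a change to the wording of `message` does not affect the retry decision.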
---

## Code examples

### curl

```bash
# -v prints the response headers, including X-UniGateway-RequestId.
curl https://api.unigateway.ai/v1/chat/completions \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-5.4","messages":[{"role":"user","content":"hello"}]}' \
  -v
```

### python

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["UNIGATEWAY_API_KEY"],
    base_url="https://api.unigateway.ai/v1",
)

resp = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "hello"}],
)
```

### typescript

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.UNIGATEWAY_API_KEY,
  baseURL: "https://api.unigateway.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "hello" }],
});
```
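### python (streaming checks)

The streaming guidance in this Error Codes reference (ignore comment lines that start with `:`, and check for `data: [DONE]` to confirm the stream completed) can be sketched as a minimal line filter. This is a simplified illustration, not a full SSE parser: it assumes one `data:` line per event, and the function name `iter_sse_data` and the choice of `ConnectionError` for truncated streams are hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield each `data:` payload from an SSE stream, one line per event.

    Comment lines (starting with ':') such as ': UNIGATEWAY PROCESSING'
    are skipped. The generator stops at `data: [DONE]` and raises
    ConnectionError if the input ends before that marker is seen.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith(":"):
            continue  # blank separator or keep-alive comment
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                return  # stream completed normally
            yield payload
    raise ConnectionError("stream ended before 'data: [DONE]'")


sample = [
    ": UNIGATEWAY PROCESSING",
    'data: {"delta": "Hel"}',
    "",
    'data: {"delta": "lo"}',
    "",
    "data: [DONE]",
]
for chunk in iter_sse_data(sample):
    print(chunk)
```

In-stream error events would arrive as ordinary `data:` payloads here, so a caller would still need to inspect each payload for an `error` object.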