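Speech

Transcribe and translate audio through the OpenAI-compatible speech endpoints.

This page covers only the speech endpoints that have been verified as available in the current production model catalog.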

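Available endpoints

Task	Method	Path	Example model
Transcribe audio to text	`POST`	`/v1/audio/transcriptions`	`whisper-1`
Translate audio into English	`POST`	`/v1/audio/translations`	`whisper-1`

Before going to production, call `GET /v1/models` to confirm that the model is available.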
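
The availability check can be sketched in Python as follows. This assumes `GET /v1/models` returns the OpenAI-style list shape (`{"data": [{"id": ...}]}`), which this page does not confirm; the helper names `model_is_listed` and `fetch_models` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.unigateway.ai/v1"


def model_is_listed(models_payload: dict, model_id: str) -> bool:
    """Return True if model_id appears in an OpenAI-style model list payload."""
    # Assumption: the payload looks like {"data": [{"id": "whisper-1"}, ...]}.
    return any(
        entry.get("id") == model_id for entry in models_payload.get("data", [])
    )


def fetch_models(api_key: str) -> dict:
    """Call GET /v1/models and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A deployment script could call `model_is_listed(fetch_models(key), "whisper-1")` and stop early when it returns `False`.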

Transcription

Convert audio to text.

Item	Value
Method	`POST`
Path	`/v1/audio/transcriptions`
URL	`https://api.unigateway.ai/v1/audio/transcriptions`
Authentication	`Authorization: Bearer $UNIGATEWAY_API_KEY`
Content-Type	`multipart/form-data`

Request

curl https://api.unigateway.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -F model="whisper-1" \
  -F file=@/path/to/audio.mp3

Response

{
  "text": "你好，这是语音转写服务的测试。"
}

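Parameters

Field	Type	Required	Description
`model`	string	Yes	Model ID, for example `whisper-1`
`file`	file	Yes	Audio file to transcribe
`language`	string	No	Language code, for example `en` or `zh`
`response_format`	string	No	Output format: `text`, `json`, `verbose_json`, `srt`, `vtt`, or `tsv`
`temperature`	number	No	Sampling temperature, from `0` to `1`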
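
To catch parameter mistakes before a request is sent, the optional fields can be validated client-side. The sketch below is illustrative only: `build_transcription_fields` is a hypothetical helper, and the checks simply mirror the allowed values listed in the table above.

```python
from typing import Optional

ALLOWED_FORMATS = {"text", "json", "verbose_json", "srt", "vtt", "tsv"}


def build_transcription_fields(
    model: str,
    language: Optional[str] = None,
    response_format: Optional[str] = None,
    temperature: Optional[float] = None,
) -> dict:
    """Build the non-file form fields, rejecting values the table does not allow."""
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    fields = {"model": model}
    # Only include optional fields that the caller actually set.
    if language is not None:
        fields["language"] = language
    if response_format is not None:
        fields["response_format"] = response_format
    if temperature is not None:
        fields["temperature"] = str(temperature)
    return fields
```

Each key in the returned dictionary corresponds to one extra `-F name=value` entry in the curl request; the `file` field is attached separately.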

Translation

Translate audio directly into English.

Item	Value
Method	`POST`
Path	`/v1/audio/translations`
URL	`https://api.unigateway.ai/v1/audio/translations`
Authentication	`Authorization: Bearer $UNIGATEWAY_API_KEY`
Content-Type	`multipart/form-data`

Request

curl https://api.unigateway.ai/v1/audio/translations \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -F model="whisper-1" \
  -F file=@/path/to/audio.mp3

Response

{
  "text": "Hello, this is a test of the audio transcription service."
}

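Parameters

Field	Type	Required	Description
`model`	string	Yes	Model ID, for example `whisper-1`
`file`	file	Yes	Audio file to translate
`response_format`	string	No	Output format: `text`, `json`, `verbose_json`, `srt`, `vtt`, or `tsv`
`temperature`	number	No	Sampling temperature, from `0` to `1`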
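
The sample responses on this page show only the JSON shape. A client that sets `response_format` may need to handle other body types. The sketch below assumes, following the OpenAI API's behaviour rather than anything stated here, that `json` and `verbose_json` return a JSON object with a `text` field while `text`, `srt`, `vtt`, and `tsv` return a plain-text body. `extract_text` is a hypothetical helper.

```python
import json

JSON_FORMATS = {"json", "verbose_json"}


def extract_text(body: str, response_format: str = "json") -> str:
    """Return the transcript or translation text from a raw response body."""
    if response_format in JSON_FORMATS:
        # JSON formats carry the result in the "text" field, as in the samples.
        return json.loads(body)["text"]
    # Assumption: every other format comes back as plain text, unchanged.
    return body
```

The same helper works for both endpoints, since the sample transcription and translation responses share the `text` field.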

Python

from openai import OpenAI

# Point the OpenAI SDK at the gateway base URL.
client = OpenAI(
    api_key="<YOUR_UNIGATEWAY_API_KEY>",
    base_url="https://api.unigateway.ai/v1",
)

# Transcribe the audio file to text.
with open("audio.mp3", "rb") as f:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=f,
    )
print(transcription.text)

# Translate the same audio file into English.
with open("audio.mp3", "rb") as f:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=f,
    )
print(translation.text)

TypeScript

import OpenAI from "openai";
import fs from "fs";

// Point the OpenAI SDK at the gateway base URL.
const client = new OpenAI({
  apiKey: process.env.UNIGATEWAY_API_KEY,
  baseURL: "https://api.unigateway.ai/v1",
});

// Transcribe the audio file to text.
const transcription = await client.audio.transcriptions.create({
  model: "whisper-1",
  file: fs.createReadStream("audio.mp3"),
});
console.log(transcription.text);

// Translate the same audio file into English.
const translation = await client.audio.translations.create({
  model: "whisper-1",
  file: fs.createReadStream("audio.mp3"),
});
console.log(translation.text);

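Common errors

Status code	Cause	Resolution
`400`	Invalid file, unsupported format, or bad parameter	Retry with a short MP3, WAV, M4A, or WebM file
`401`	API key invalid or missing	Check the `Authorization` request header
`404`	Model unavailable	Confirm `whisper-1` via `GET /v1/models`
`413`	File too large	Compress or split the audio file
`429`	Rate limit reached	Back off, then retry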
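
For the `429` case, one common way to back off is exponential delay with a little jitter. The sketch below is one possible approach, not a documented policy of this API: `call_with_backoff`, the attempt count, and the delays are all assumptions, and `send` stands for any function that performs the request and returns `(status_code, body)`.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE_STATUS = {429}


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Other statuses (400, 401, 404, 413) need a fix, not a retry.
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            return status, body
        # Exponential backoff: base_delay, 2x, 4x, ... plus up to 10% jitter.
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. The other status codes in the table are returned immediately because repeating the same request would not change the outcome.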
