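Audio

Transcribe and translate audio through the OpenAI-compatible audio endpoints.

This page covers only the audio endpoints that have been verified as available in the current production model catalog.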

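Available endpoints

| Task | Method | Path | Example model |
| --- | --- | --- | --- |
| Transcribe audio to text | POST | /v1/audio/transcriptions | whisper-1 |
| Translate audio into English | POST | /v1/audio/translations | whisper-1 |

Before going to production, confirm that the model is available with GET /v1/models.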
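One way to script that check is sketched below. It assumes the listing follows the OpenAI list shape (`{"data": [{"id": ...}]}`), which this page does not spell out. `model_ids_from_listing`, `fetch_model_listing`, and `is_model_available` are hypothetical helper names, and only the Python standard library is used.

```python
import json
import urllib.request

BASE_URL = "https://api.unigateway.ai/v1"


def model_ids_from_listing(payload: dict) -> set[str]:
    """Collect model IDs from an OpenAI-style listing.

    Assumed shape: {"data": [{"id": "..."}, ...]}.
    """
    return {item["id"] for item in payload.get("data", []) if "id" in item}


def fetch_model_listing(api_key: str) -> dict:
    """Call GET /v1/models and return the decoded JSON body (network call)."""
    request = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def is_model_available(model_id: str, api_key: str) -> bool:
    """Return True if model_id appears in the gateway's model listing."""
    return model_id in model_ids_from_listing(fetch_model_listing(api_key))
```

Calling `is_model_available("whisper-1", api_key)` with your own key at deploy time lets a pipeline fail early instead of surfacing a 404 on the first audio request.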

Transcription

Convert audio to text.

| Item | Value |
| --- | --- |
| Method | POST |
| Path | /v1/audio/transcriptions |
| URL | https://api.unigateway.ai/v1/audio/transcriptions |
| Auth | Authorization: Bearer $UNIGATEWAY_API_KEY |
| Content-Type | multipart/form-data |

Request

curl https://api.unigateway.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -F model="whisper-1" \
  -F file=@/path/to/audio.mp3

Response

{
  "text": "你好,这是语音转写服务的测试。"
}

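Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID, such as whisper-1 |
| file | file | Yes | Audio file to transcribe |
| language | string | No | Language code, such as en or zh |
| response_format | string | No | Output format: text, json, verbose_json, srt, vtt, or tsv |
| temperature | number | No | Sampling temperature, in the range 0 to 1 |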
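To show how the optional fields combine with `model` and `file`, here is a hedged sketch using the OpenAI Python SDK. `transcription_options` and `transcribe` are hypothetical helpers, not part of the SDK or the gateway. The allowed formats and the temperature range are copied from the table above, and the key is read from `UNIGATEWAY_API_KEY` as in the curl example.

```python
import os

# Values taken from the parameter table above.
RESPONSE_FORMATS = {"text", "json", "verbose_json", "srt", "vtt", "tsv"}


def transcription_options(language=None, response_format=None, temperature=None):
    """Validate the optional fields and return only the ones that were set."""
    if response_format is not None and response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    options = {
        "language": language,
        "response_format": response_format,
        "temperature": temperature,
    }
    return {name: value for name, value in options.items() if value is not None}


def transcribe(path, **options):
    """Send one audio file to /v1/audio/transcriptions (network call)."""
    # Imported lazily so transcription_options works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["UNIGATEWAY_API_KEY"],
        base_url="https://api.unigateway.ai/v1",
    )
    with open(path, "rb") as audio:
        return client.audio.transcriptions.create(
            model="whisper-1",
            file=audio,
            **transcription_options(**options),
        )
```

For example, `transcribe("audio.mp3", language="zh", response_format="verbose_json", temperature=0)` sends all three optional fields. With the non-JSON formats the SDK may hand back a plain string rather than an object with a `.text` attribute, so check the type before accessing it.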

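Translation

Translate audio directly into English.

| Item | Value |
| --- | --- |
| Method | POST |
| Path | /v1/audio/translations |
| URL | https://api.unigateway.ai/v1/audio/translations |
| Auth | Authorization: Bearer $UNIGATEWAY_API_KEY |
| Content-Type | multipart/form-data |

Request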

curl https://api.unigateway.ai/v1/audio/translations \
  -H "Authorization: Bearer $UNIGATEWAY_API_KEY" \
  -F model="whisper-1" \
  -F file=@/path/to/audio.mp3

Response

{
  "text": "Hello, this is a test of the audio transcription service."
}

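Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID, such as whisper-1 |
| file | file | Yes | Audio file to translate |
| response_format | string | No | Output format: text, json, verbose_json, srt, vtt, or tsv |
| temperature | number | No | Sampling temperature, in the range 0 to 1 |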

Python

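import os

from openai import OpenAI

# Point the OpenAI SDK at the gateway; the key is read from the environment,
# matching $UNIGATEWAY_API_KEY in the curl examples.
client = OpenAI(
    api_key=os.environ["UNIGATEWAY_API_KEY"],
    base_url="https://api.unigateway.ai/v1",
)

# Transcribe the audio to text.
with open("audio.mp3", "rb") as f:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=f,
    )
print(transcription.text)

# Translate the audio directly into English.
with open("audio.mp3", "rb") as f:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=f,
    )
print(translation.text)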

TypeScript

import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.UNIGATEWAY_API_KEY,
  baseURL: "https://api.unigateway.ai/v1",
});

const transcription = await client.audio.transcriptions.create({
  model: "whisper-1",
  file: fs.createReadStream("audio.mp3"),
});
console.log(transcription.text);

const translation = await client.audio.translations.create({
  model: "whisper-1",
  file: fs.createReadStream("audio.mp3"),
});
console.log(translation.text);

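Common errors

| Status code | Cause | Resolution |
| --- | --- | --- |
| 400 | Invalid file, unsupported format, or bad parameter | Retry with a short MP3, WAV, M4A, or WebM file |
| 401 | API key invalid or missing | Check the Authorization request header |
| 404 | Model unavailable | Confirm whisper-1 via GET /v1/models |
| 413 | File too large | Compress or split the audio file |
| 429 | Rate limit triggered | Back off, then retry |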
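The 429 row suggests backing off before retrying. A minimal sketch of that pattern follows. The attempt count, base delay, cap, and the helper names `backoff_delays` and `call_with_retry` are illustrative assumptions, not values documented for this API. The caller decides which exceptions count as rate limiting.

```python
import random
import time


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Exponential delays in seconds: base, 2*base, 4*base, ... capped at cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


def call_with_retry(func, is_retryable, attempts=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Call func(); on a retryable exception, wait and try again.

    Makes at most `attempts` calls and re-raises the last error.
    """
    delays = backoff_delays(attempts - 1, base, cap)
    for attempt in range(attempts):
        try:
            return func()
        except Exception as error:
            # Give up on the final attempt or on errors that should not be retried.
            if attempt == attempts - 1 or not is_retryable(error):
                raise
            # Full jitter: sleep a random time up to the current delay.
            sleep(random.uniform(0, delays[attempt]))
```

With the OpenAI Python SDK, `is_retryable` could be `lambda e: isinstance(e, openai.RateLimitError)`, so that 400, 401, 404, and 413 responses fail immediately instead of being retried.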
