Quickstart
Install the SDK, authenticate, and run a transcription, a speech generation, and a chat completion. End to end in under five minutes.
Install
Pick a language. All three clients share the same surface and are generated from the same OpenAPI spec.
Python:

```bash
pip install wordcab
```

TypeScript:

```bash
npm install @wordcab/sdk
```

cURL: nothing to install.

Authenticate
Create an API key in the dashboard (Settings → API Keys) and export it. The SDKs read WORDCAB_API_KEY automatically.
```bash
export WORDCAB_API_KEY=wc_live_xxxxxxxxxxxxxxxx
```

Transcribe an audio file
Post an audio file to the batch transcripts endpoint and poll until it finishes. For real-time streaming, see the transcription guide.
Python:

```python
from wordcab import Wordcab

client = Wordcab()

job = client.transcripts.create(
    audio_url="https://example.com/call.wav",
    model="qwen3-asr",
    language="en",
    diarize=True,
)
transcript = client.transcripts.wait(job.id)
print(transcript.text)
```

TypeScript:

```typescript
import { Wordcab } from "@wordcab/sdk";

const client = new Wordcab();

const job = await client.transcripts.create({
  audioUrl: "https://example.com/call.wav",
  model: "qwen3-asr",
  language: "en",
  diarize: true,
});
const transcript = await client.transcripts.wait(job.id);
console.log(transcript.text);
```

cURL:

```bash
curl -X POST https://api.wordcab.com/api/v1/transcripts \
  -H "Authorization: Bearer $WORDCAB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://example.com/call.wav",
    "model": "qwen3-asr",
    "language": "en",
    "diarize": true
  }'
```
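The `wait` helper hides a poll loop. A minimal stdlib sketch of that loop, assuming a retrieve-style call that returns a job object with a `status` field of `processing`, `completed`, or `error` (those names are assumptions, not confirmed by this guide):

```python
import time

def wait_for_transcript(fetch_job, poll_interval=2.0, timeout=600.0):
    """Poll fetch_job() until the job reports a terminal status.

    fetch_job is any zero-argument callable returning an object with a
    .status attribute; with the SDK it might wrap a call like
    client.transcripts.retrieve(job.id) (method name assumed).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_job()
        if job.status == "completed":
            return job
        if job.status == "error":
            raise RuntimeError("transcription failed")
        time.sleep(poll_interval)
    raise TimeoutError("transcript not ready in time")
```

If you are building on the raw HTTP API (the cURL tab), a loop like this replaces the SDK's `wait`.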
Generate speech
Send text and a voice to the speech endpoint. The response is an audio stream you can pipe directly to a file or a playback buffer.
Python:

```python
with client.audio.speech.stream(
    input="Hello from Wordcab. This call is being transcribed in real time.",
    model="qwen3-tts",
    voice="ember",
    format="mp3",
) as response:
    response.stream_to_file("hello.mp3")
```

TypeScript:

```typescript
import fs from "node:fs";

const response = await client.audio.speech.create({
  input: "Hello from Wordcab. This call is being transcribed in real time.",
  model: "qwen3-tts",
  voice: "ember",
  format: "mp3",
});
fs.writeFileSync("hello.mp3", Buffer.from(await response.arrayBuffer()));
```

cURL:

```bash
curl -X POST https://api.wordcab.com/v1/audio/speech \
  -H "Authorization: Bearer $WORDCAB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello from Wordcab.",
    "model": "qwen3-tts",
    "voice": "ember",
    "response_format": "mp3"
  }' \
  --output hello.mp3
```
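`stream_to_file` amounts to writing the response in chunks rather than buffering the whole clip in memory; a stdlib sketch of the operation, assuming the response can be consumed as an iterable of byte chunks (that iteration interface is an assumption):

```python
def stream_to_file(chunks, path):
    """Write an iterable of byte chunks to path without buffering
    the full audio in memory at once."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
```

In practice, prefer the SDK's own helper; this only shows the shape of the operation if you consume the raw HTTP stream yourself.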
Run a chat completion
Use the OpenAI-compatible chat endpoint. Point the OpenAI SDK at https://api.wordcab.com and no application code needs to change.
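OpenAI compatibility means the request body follows the standard chat-completions shape; a stdlib-only sketch of what the SDK calls send over the wire (the `/chat/completions` path is assumed from the OpenAI convention, not stated in this guide):

```python
import json
import os
import urllib.request

# The same payload the SDK examples construct for you.
payload = {
    "model": "qwen3.5-4b",
    "messages": [
        {"role": "system", "content": "Summarize calls in two sentences."},
        {"role": "user", "content": "Transcript text goes here."},
    ],
}
request = urllib.request.Request(
    "https://api.wordcab.com/chat/completions",  # path assumed
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('WORDCAB_API_KEY', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(request) would send it; the SDKs below do the
# same thing with retries and typed responses.
```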
Python:

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.wordcab.com",
    api_key=os.environ["WORDCAB_API_KEY"],
)
completion = client.chat.completions.create(
    model="qwen3.5-4b",
    messages=[
        {"role": "system", "content": "Summarize calls in two sentences."},
        {"role": "user", "content": transcript.text},
    ],
)
print(completion.choices[0].message.content)
```

TypeScript:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.wordcab.com",
  apiKey: process.env.WORDCAB_API_KEY,
});
const completion = await client.chat.completions.create({
  model: "qwen3.5-4b",
  messages: [
    { role: "system", content: "Summarize calls in two sentences." },
    { role: "user", content: transcript.text },
  ],
});
console.log(completion.choices[0].message.content);
```

Next steps
- Build a voice agent — system prompts, tool use, and an outbound call.
- Wire up a phone number — Twilio, SIP, or any programmable-voice provider.
- API reference — every endpoint, parameter, and response shape.
- The wordcab CLI — for scripts, operators, and self-hosted deployments.