Start here
Wordcab is private voice AI infrastructure. The same stack that powers cloud API traffic is what ships into your VPC, your datacenter, or an airgapped environment. These docs cover both modes — the public API at api.wordcab.com and the self-hosted runtime you install with Helm or the operator.
Quickstart
Make your first request in under five minutes — transcription, speech, and a chat completion.
AuthAuthentication
API keys, scopes, rotation, and the pattern for short-lived tokens on self-hosted deployments.
BuildBuild a voice agent
System prompt, voice selection, tool use, and a first outbound call — end to end.
MigrateOpenAI compatibility
Point your existing OpenAI SDK at Wordcab and keep your application code unchanged.
What you can build
The Wordcab API groups into three product surfaces. They share authentication, billing, control plane, and deployment artifacts — but you can adopt them independently.
Transcription & speech
Streaming and batch STT (Qwen3-ASR, Voxtral Realtime, Cohere Transcribe 2B). Streaming TTS (Qwen3-TTS, Kokoro).
ThinkLLM inference & reasoning
Chat completions, embeddings, tool use, JSON mode. Gemma 4, Qwen3.5, DeepSeek V3.2, Llama 3.3.
AdaptEvaluation & fine-tuning
Prepare data, run held-out evals, fine-tune against your real audio, and validate before rollout.
Developer surfaces
Three equivalent ways to talk to the Wordcab control plane. Pick whichever fits the task.
REST API
OpenAPI 3.1, Bearer-token auth, JSON everywhere. OpenAI-compatible /v1 endpoints for chat, embeddings, and audio.
Python & TypeScript SDKs
Typed clients with retries, pagination, and streaming helpers baked in. Drop-in replacement for the OpenAI SDK when you want it.
CLICommand line
Scriptable wordcab CLI for transcription, speech, agents, deployments, and log streaming against any environment.
Deploy & operate
When you run Wordcab inside your own infrastructure — VPC, on-prem Kubernetes, airgap, or hybrid — the same API runs behind a Helm-installed control plane. Everything below is operator-facing.
Self-hosted overview
Deployment shapes, reference hardware, what ships with the chart, time to first call.
InstallHelm chart
Prerequisites, values.yaml, operator CRDs. wordcab deploy apply wraps it with preflight.
Kubernetes
EKS, AKS, GKE, OpenShift, RKE2 — per-distribution notes on ingress, storage, GPU operator.
OfflineAirgap installs
Signed bundles, Cosign verification, internal registry import, preflight.
Day 2Upgrades & rollback
Rolling upgrades, one-command rollback, cadence, and stability rules.
ObserveObservability
Prometheus, OpenTelemetry, structured logs. Six Grafana dashboards and an SLO alert pack ship in the chart.
AuthIdentity & SSO
SAML, OIDC, SCIM, workload identity, audit-to-SIEM. Configured at install time.
VoiceTelephony & SIP
Twilio media streams, native SIP for on-prem PBX, Genesys / Five9 / Zoom connectors.
FrameworksFramework integrations
Pipecat, LiveKit, Daily, Vapi, Retell, LangChain, LlamaIndex.
BackendsModel serving
vLLM (default), SGLang, Triton, ONNX Runtime. Backend choice is per pool.
ArchitectureDeployment shapes
Reference diagrams for VPC, on-prem, airgap, and hybrid. Context for the operator docs above.
ControlDeployments API
Programmatic management of environments, routing, and autoscaling.
Some pages — security review bundles, DPA/BAA templates, offline bundle contents — are shared under NDA during Pilot. Request docs access to get the full bundle.