# OpenAI compatibility

Change two lines — the base URL and the API key. Chat, embeddings, and audio endpoints are 1:1 compatible.

Wordcab exposes OpenAI-compatible endpoints. Existing OpenAI SDK code works after changing two lines: the base URL and the API key. This is the quickest migration path off a hosted OpenAI deployment.
## Supported endpoints
| OpenAI path | Wordcab path | Notes |
|---|---|---|
| `/v1/chat/completions` | `/v1/chat/completions` | Streaming, tools, JSON mode, structured outputs. |
| `/v1/completions` | `/v1/completions` | Legacy completions API. |
| `/v1/embeddings` | `/v1/embeddings` | Multiple models supported; see embeddings. |
| `/v1/audio/transcriptions` | `/v1/audio/transcriptions` | Whisper-SDK compatible. |
| `/v1/audio/speech` | `/v1/audio/speech` | TTS; streaming supported. |
| `/v1/models` | `/v1/models` | Lists everything the key can call. |
## Migration

```python
# before
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# after
client = OpenAI(
    base_url="https://api.wordcab.com",
    api_key=os.environ["WORDCAB_API_KEY"],
)
```

```javascript
// before
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// after
const client = new OpenAI({
  baseURL: "https://api.wordcab.com",
  apiKey: process.env.WORDCAB_API_KEY,
});
```

```shell
# Many applications read these directly.
# Swap both values together.
export OPENAI_BASE_URL="https://api.wordcab.com"
export OPENAI_API_KEY="$WORDCAB_API_KEY"
```

## Model-name mapping
OpenAI model names (`gpt-4o`, `whisper-1`, `tts-1`) are not valid on Wordcab. Use Wordcab model ids instead. The common swaps:
| OpenAI | Wordcab equivalent |
|---|---|
| `gpt-4o-mini` | `qwen3.5-4b` |
| `gpt-4o` | `qwen3.5-4b` (default) or `deepseek-v3.2` (heavy reasoning) |
| `gpt-4.1` class | `deepseek-v3.2` or `llama-3.3-70b` |
| `whisper-1` | `qwen3-asr` or `whisper-large-v3` |
| `tts-1` | `qwen3-tts` |
| `text-embedding-3-small` | `bge-m3` or `e5-mistral-7b` |
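When porting many call sites, the swaps above can be centralized in one lookup so model names change in a single place. A minimal sketch — `WORDCAB_EQUIVALENTS` and `to_wordcab_model` are hypothetical names, and the defaults simply take the first-listed equivalent from the table:

```python
# Hypothetical mapping from OpenAI model names to Wordcab ids,
# following the table above. Defaults pick the first-listed equivalent.
WORDCAB_EQUIVALENTS = {
    "gpt-4o-mini": "qwen3.5-4b",
    "gpt-4o": "qwen3.5-4b",
    "gpt-4.1": "deepseek-v3.2",
    "whisper-1": "qwen3-asr",
    "tts-1": "qwen3-tts",
    "text-embedding-3-small": "bge-m3",
}

def to_wordcab_model(openai_name: str, heavy_reasoning: bool = False) -> str:
    """Return the Wordcab id for an OpenAI model name.

    Raises KeyError for names with no listed equivalent, so unported
    call sites fail loudly instead of sending an invalid model id.
    """
    if openai_name == "gpt-4o" and heavy_reasoning:
        return "deepseek-v3.2"
    return WORDCAB_EQUIVALENTS[openai_name]
```

Call sites then read `client.chat.completions.create(model=to_wordcab_model("gpt-4o"), ...)`, and removing the mapping later is a one-file change.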
## Known differences
- No image generation. Wordcab does not ship an image endpoint.
- No `logprobs` beyond top-1. The full top-k logprob array is not returned.
- Model availability is workload-specific. Not every model is provisioned for streaming; if a model lacks a streaming pool, `stream=True` returns 422. Check `/v1/models`.
- Rate-limit headers differ. Wordcab returns `X-RateLimit-*` and `Retry-After`, not `x-ratelimit-limit-requests`.
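Because the header names differ, retry logic keyed to OpenAI's `x-ratelimit-*` family will silently stop backing off after migration. A sketch of reading the Wordcab-style header instead — the helper name is hypothetical, and it assumes the common `Retry-After`-in-seconds convention:

```python
def retry_delay(headers: dict, default: float = 1.0) -> float:
    """Pick a backoff delay from a 429 response's headers.

    Wordcab sends Retry-After, not OpenAI's x-ratelimit-limit-requests
    family, so key off that name. Lookup is case-insensitive, since
    HTTP header names are.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after")
    if value is not None:
        try:
            return max(float(value), 0.0)
        except ValueError:
            pass  # e.g. an HTTP-date Retry-After; fall back to default
    return default
```

Pair this with whatever retry loop the application already uses; only the header name changes.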
## Running against a self-hosted deployment
Self-hosted deployments expose the same paths at your ingress URL. Point the SDK at your cluster instead of api.wordcab.com — no code changes beyond the URL.
```python
client = OpenAI(
    base_url="https://wordcab.apps.example.com",
    api_key=cluster_issued_token,
)
```
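Since the endpoint paths are identical on every deployment, a request target is just the ingress base URL plus a path from the table above. A small sketch for tooling that hits endpoints directly (e.g. health checks) rather than through the SDK — `endpoint_url` is a hypothetical helper:

```python
from urllib.parse import urljoin

def endpoint_url(base_url: str, path: str) -> str:
    """Join an ingress base URL with an OpenAI-style endpoint path."""
    # Keep a trailing slash on the base so urljoin preserves any path
    # prefix the ingress may add in front of /v1/....
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
```

The same helper works against `api.wordcab.com` and a cluster ingress alike.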