# OpenAI compatibility

Change two lines — the base URL and the API key. Chat, embeddings, and audio endpoints are 1:1 compatible.

Wordcab exposes OpenAI-compatible endpoints. Existing OpenAI SDK code works after changing two lines: the base URL and the API key. This is the quickest migration path off a hosted OpenAI deployment.
## Supported endpoints
| OpenAI path | Wordcab path | Notes |
|---|---|---|
| `/v1/chat/completions` | `/v1/chat/completions` | Streaming, tools, JSON mode, structured outputs. |
| `/v1/completions` | `/v1/completions` | Legacy completions API. |
| `/v1/embeddings` | `/v1/embeddings` | Multiple models supported; see embeddings. |
| `/v1/audio/transcriptions` | `/v1/audio/transcriptions` | Whisper-SDK compatible. |
| `/v1/audio/speech` | `/v1/audio/speech` | TTS; streaming supported. |
| `/v1/models` | `/v1/models` | Lists everything the key can call. |
## Migration

```python
# before
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# after
client = OpenAI(
    base_url="https://api.wordcab.com",
    api_key=os.environ["WORDCAB_API_KEY"],
)
```

```javascript
// before
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// after
const client = new OpenAI({
  baseURL: "https://api.wordcab.com",
  apiKey: process.env.WORDCAB_API_KEY,
});
```

```shell
# Many applications read these directly.
# Swap both values together.
export OPENAI_BASE_URL="https://api.wordcab.com"
export OPENAI_API_KEY="$WORDCAB_API_KEY"
```

## Model-name mapping
OpenAI model names (`gpt-4o`, `whisper-1`, `tts-1`) are not valid on Wordcab. Use Wordcab model ids instead. The common swaps:
| OpenAI | Wordcab equivalent |
|---|---|
| `gpt-4o-mini` | `qwen3.5-4b` |
| `gpt-4o` | `qwen3.5-4b` (default) or `deepseek-v3.2` (heavy reasoning) |
| `gpt-4.1` class | `deepseek-v3.2` or `llama-3.3-70b` |
| `whisper-1` | `qwen3-asr` or `whisper-large-v3` |
| `tts-1` | `qwen3-tts` |
| `text-embedding-3-small` | `bge-m3` or `e5-mistral-7b` |
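When porting many call sites, the swaps above can be centralized in one lookup so model names change in a single place. A minimal sketch — `WORDCAB_EQUIVALENTS` and `to_wordcab_model` are hypothetical names, and the defaults simply take the first-listed equivalent from the table:

```python
# Hypothetical mapping from OpenAI model names to Wordcab ids,
# following the table above. Defaults pick the first-listed equivalent.
WORDCAB_EQUIVALENTS = {
    "gpt-4o-mini": "qwen3.5-4b",
    "gpt-4o": "qwen3.5-4b",
    "gpt-4.1": "deepseek-v3.2",
    "whisper-1": "qwen3-asr",
    "tts-1": "qwen3-tts",
    "text-embedding-3-small": "bge-m3",
}

def to_wordcab_model(openai_name: str, heavy_reasoning: bool = False) -> str:
    """Return the Wordcab id for an OpenAI model name.

    Raises KeyError for names with no listed equivalent, so unported
    call sites fail loudly instead of sending an invalid model id.
    """
    if openai_name == "gpt-4o" and heavy_reasoning:
        return "deepseek-v3.2"
    return WORDCAB_EQUIVALENTS[openai_name]
```

Call sites then read `client.chat.completions.create(model=to_wordcab_model("gpt-4o"), ...)`, and removing the mapping later is a one-file change.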
## Known differences
- No image generation. Wordcab does not ship an image endpoint.
- No `logprobs` beyond top-1. The full top-k logprob array is not returned.
- Model availability is workload-specific. Not every model is provisioned for streaming; if a model lacks a streaming pool, `stream=True` returns 422. Check `/v1/models`.
- Rate-limit headers differ. Wordcab returns `X-RateLimit-*` and `Retry-After`, not `x-ratelimit-limit-requests`.
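Because the header names differ, retry logic keyed to OpenAI's `x-ratelimit-*` family will silently stop backing off after migration. A sketch of reading the Wordcab-style header instead — the helper name is hypothetical, and it assumes the common `Retry-After`-in-seconds convention:

```python
def retry_delay(headers: dict, default: float = 1.0) -> float:
    """Pick a backoff delay from a 429 response's headers.

    Wordcab sends Retry-After, not OpenAI's x-ratelimit-limit-requests
    family, so key off that name. Lookup is case-insensitive, since
    HTTP header names are.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after")
    if value is not None:
        try:
            return max(float(value), 0.0)
        except ValueError:
            pass  # e.g. an HTTP-date Retry-After; fall back to default
    return default
```

Pair this with whatever retry loop the application already uses; only the header name changes.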
## Running against a self-hosted deployment
Self-hosted deployments expose the same paths at your ingress URL. Point the SDK at your cluster instead of api.wordcab.com — no code changes beyond the URL.
```python
client = OpenAI(
    base_url="https://wordcab.apps.example.com",
    api_key=cluster_issued_token,
)
```
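Since the endpoint paths are identical on every deployment, a request target is just the ingress base URL plus a path from the table above. A small sketch for tooling that hits endpoints directly (e.g. health checks) rather than through the SDK — `endpoint_url` is a hypothetical helper:

```python
from urllib.parse import urljoin

def endpoint_url(base_url: str, path: str) -> str:
    """Join an ingress base URL with an OpenAI-style endpoint path."""
    # Keep a trailing slash on the base so urljoin preserves any path
    # prefix the ingress may add in front of /v1/....
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
```

The same helper works against `api.wordcab.com` and a cluster ingress alike.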