
OpenAI compatibility

Two lines change: base URL and API key. Chat, embeddings, and audio endpoints are 1:1 compatible.

Wordcab exposes OpenAI-compatible endpoints. Existing OpenAI SDK code works by changing two lines — the base URL and the API key. This is the quickest migration path off a hosted OpenAI deployment.

Supported endpoints

| OpenAI path | Wordcab path | Notes |
| --- | --- | --- |
| /v1/chat/completions | /v1/chat/completions | Streaming, tools, JSON mode, structured outputs. |
| /v1/completions | /v1/completions | Legacy completions API. |
| /v1/embeddings | /v1/embeddings | Multiple models supported; see embeddings. |
| /v1/audio/transcriptions | /v1/audio/transcriptions | Whisper-SDK compatible. |
| /v1/audio/speech | /v1/audio/speech | TTS; streaming supported. |
| /v1/models | /v1/models | Lists everything the key can call. |

Migration

python
import os
from openai import OpenAI

# before
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# after
client = OpenAI(
    base_url="https://api.wordcab.com",
    api_key=os.environ["WORDCAB_API_KEY"],
)
javascript
import OpenAI from "openai";

// before
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// after
const client = new OpenAI({
  baseURL: "https://api.wordcab.com",
  apiKey: process.env.WORDCAB_API_KEY,
});
shell
# Many applications read these directly.
# Swap both values together.
export OPENAI_BASE_URL="https://api.wordcab.com"
export OPENAI_API_KEY=$WORDCAB_API_KEY

Model-name mapping

OpenAI model names (gpt-4o, whisper-1, tts-1) are not valid on Wordcab. Use Wordcab model ids instead. The common swaps:

| OpenAI | Wordcab equivalent |
| --- | --- |
| gpt-4o-mini | qwen3.5-4b |
| gpt-4o | qwen3.5-4b (default) or deepseek-v3.2 (heavy reasoning) |
| gpt-4.1 class | deepseek-v3.2 or llama-3.3-70b |
| whisper-1 | qwen3-asr or whisper-large-v3 |
| tts-1 | qwen3-tts |
| text-embedding-3-small | bge-m3 or e5-mistral-7b |
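When porting existing code, the table above can be wrapped in a small translation helper. This is a sketch, not part of any SDK: the dict simply mirrors the table, picking the first option wherever the table lists two, and the function name is our own.

```python
# Mapping from OpenAI model names to Wordcab ids, per the table above.
# Where the table lists alternatives, the first (default) option is used.
MODEL_MAP = {
    "gpt-4o-mini": "qwen3.5-4b",
    "gpt-4o": "qwen3.5-4b",
    "whisper-1": "qwen3-asr",
    "tts-1": "qwen3-tts",
    "text-embedding-3-small": "bge-m3",
}


def to_wordcab_model(name: str) -> str:
    """Return the Wordcab equivalent of an OpenAI model name.

    Raises ValueError for names with no mapping, so unmapped models
    fail loudly at migration time rather than with an API error later.
    """
    try:
        return MODEL_MAP[name]
    except KeyError:
        raise ValueError(f"no Wordcab mapping for model {name!r}") from None
```

Failing fast on unmapped names is deliberate: an OpenAI model id sent verbatim would only surface as a server-side error.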

Known differences

  • No image generation. Wordcab does not ship an image endpoint.
  • No logprobs beyond top-1. The full top-k logprob array is not returned.
  • Model availability is workload-specific. Not every model is provisioned for streaming; if a model lacks a streaming pool, stream=True returns 422. Check /v1/models for what your key can call.
  • Rate-limit headers differ. Wordcab returns X-RateLimit-* and Retry-After, not x-ratelimit-limit-requests.
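Because the header names differ, retry logic written against OpenAI's x-ratelimit-* headers needs a small adjustment. A minimal sketch that honors Retry-After on a 429; the fixed fallback delay is our assumption, not documented Wordcab behavior:

```python
def backoff_seconds(headers: dict, default: float = 1.0) -> float:
    """Pick a wait time from Wordcab-style rate-limit headers.

    Prefers the standard Retry-After header (delay in seconds);
    falls back to a fixed default when it is absent or unparsable.
    """
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return max(0.0, float(value))
        except ValueError:
            pass  # e.g. an HTTP-date form we choose not to parse here
    return default
```

For example, backoff_seconds({"Retry-After": "3"}) yields 3.0, while an empty header dict falls back to the default.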

Running against a self-hosted deployment

Self-hosted deployments expose the same paths at your ingress URL. Point the SDK at your cluster instead of api.wordcab.com — no code changes beyond the URL.

python
client = OpenAI(
    base_url="https://wordcab.apps.example.com",
    api_key=cluster_issued_token,
)