Voice agents

A Wordcab agent is a system prompt, a voice, an LLM, and a set of tools. Create one, attach a phone number, and start taking calls.

Package availability

Wordcab SDKs, CLI tools, Helm charts, model weights, and deployment packages are delivered directly to each customer for self-hosted installation. They are not publicly published package-manager artifacts, so install commands in these docs are placeholders until your Wordcab team provides your private package source or offline bundle.

A Wordcab agent is a named configuration: a system prompt, a voice, an LLM, a set of tools, and call-lifecycle settings. Once created, an agent can handle outbound or inbound calls, join meetings via SIP or WebRTC, or run as a text-only chat endpoint.

Create an agent

python

agent = client.agents.create(
    name="Support Agent",
    system_prompt=(
        "You are a support agent for Contoso Health. "
        "Verify the caller, answer benefits questions, and book a callback if you cannot help."
    ),
    voice_id="ember",
    llm_model="qwen3.5-4b",
    stt_model="voxtral-realtime",
    language="en",
    interruption_threshold=0.5,
    first_message="Thanks for calling Contoso. How can I help you today?",
    tools=[
        {"type": "function", "function": { ... } },
    ],
    context_window=20,
)
print(agent.id)

typescript

const agent = await client.agents.create({
  name: "Support Agent",
  systemPrompt: "You are a support agent for Contoso Health...",
  voiceId: "ember",
  llmModel: "qwen3.5-4b",
  sttModel: "voxtral-realtime",
  language: "en",
  interruptionThreshold: 0.5,
  firstMessage: "Thanks for calling Contoso. How can I help you today?",
  tools: [...],
});

System-prompt variables

Use {{variable}} placeholders in the system prompt; pass values through the context object when you start a call.

python

system_prompt = """You are calling {{customer_name}} about order {{order_number}}.
The order contains: {{items}}. Delivery: {{delivery_date}}."""

client.agents.calls.create(
    agent_id=agent.id,
    phone_number="+14155551234",
    context={
        "customer_name": "Jane",
        "order_number": "ORD-12345",
        "items": ["Widget Pro", "Gadget Plus"],
        "delivery_date": "April 30",
    },
)
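To catch a missing context value before a call is placed, the prompt can be previewed locally. The sketch below is only an illustration: `preview_prompt` is a hypothetical helper, not part of the Wordcab SDK, and both the error on missing values and the comma-joined rendering of lists are assumptions about how substitution might behave, not documented Wordcab behavior.

```python
import re

# Matches {{name}} with optional inner whitespace. The placeholder syntax
# follows the text above; server-side rendering rules are an assumption.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def preview_prompt(template: str, context: dict) -> str:
    """Substitute {{name}} placeholders locally; raise if a value is missing."""
    missing = sorted(set(_PLACEHOLDER.findall(template)) - set(context))
    if missing:
        raise KeyError(f"missing context values: {', '.join(missing)}")

    def substitute(match: re.Match) -> str:
        value = context[match.group(1)]
        # Assumption: lists are rendered as a comma-separated string.
        if isinstance(value, (list, tuple)):
            return ", ".join(str(v) for v in value)
        return str(value)

    return _PLACEHOLDER.sub(substitute, template)


template = "You are calling {{customer_name}} about order {{order_number}}. Items: {{items}}."
print(preview_prompt(template, {
    "customer_name": "Jane",
    "order_number": "ORD-12345",
    "items": ["Widget Pro", "Gadget Plus"],
}))
```

Running a preview like this in a test suite makes a renamed or forgotten variable fail fast instead of surfacing mid-call.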

Tool use

Agents call tools the same way chat completions do. Register tools on the agent and implement the handlers in your backend. When the model invokes a tool, Wordcab sends a webhook to your tool_url.

json

{
  "type": "function",
  "function": {
    "name": "lookup_member",
    "description": "Find a member by phone number.",
    "parameters": {
      "type": "object",
      "properties": {"phone": {"type": "string"}},
      "required": ["phone"]
    }
  }
}

Handler

python

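from flask import Flask, request, jsonify

app = Flask(__name__)

@app.post("/tools/lookup_member")
def lookup_member():
    # Wordcab posts {"name": ..., "arguments": {...}, "call_id": ...}.
    body = request.get_json()
    phone = body["arguments"]["phone"]
    # `db` stands in for your own data-access layer; it is not part of Wordcab.
    member = db.get_member_by_phone(phone)
    if member is None:
        # Assumes the lookup returns None when nothing matches.
        return jsonify({"error": "member_not_found"})
    # The JSON response body is passed back to the model.
    return jsonify({"member_id": member.id, "plan": member.plan})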

Wordcab posts to tool_url with {"name": "...", "arguments": {...}, "call_id": "..."}. The response body is passed back to the model. Verify the HMAC signature on every request before acting on it; see Webhooks.
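As a rough illustration of the HMAC check mentioned above, the sketch below verifies a hex-encoded HMAC-SHA256 of the raw request body using only the Python standard library. The digest algorithm, the encoding, and the example secret are assumptions made for illustration; the actual signing scheme and header name are whatever the Webhooks page specifies.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Compare in constant time so the check does not leak timing information."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected.encode(), received.encode())


# Hypothetical values for illustration only.
secret = b"example-signing-secret"
body = b'{"name": "lookup_member", "arguments": {"phone": "+14155551234"}, "call_id": "call_123"}'
signature = sign_body(secret, body)

print(is_valid_signature(secret, body, signature))          # untampered body
print(is_valid_signature(secret, body + b" ", signature))   # modified body
```

In a Flask handler, the bytes to verify would come from request.get_data(), read before the JSON is parsed, since re-serializing the parsed body can change the bytes and break the comparison.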

Start a call

Outbound:

python

call = client.agents.calls.create(
    agent_id=agent.id,
    phone_number="+14155551234",
    context={"customer_name": "Jane"},
    max_duration=300,
    record=True,
)

Inbound: attach a phone number to the agent in the dashboard (or via client.agents.update(agent_id, phone_numbers=[...])). Incoming calls to that number are answered by the agent.

Monitor & end

Poll for state, or subscribe to call.started, call.completed, and call.failed webhooks.

python

call = client.calls.get(call.id)
print(call.status)            # initiating | ringing | in_progress | completed | failed | no_answer
# ...later
client.calls.end(call.id)     # force-end a live call
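If you poll rather than subscribe to webhooks, a small loop with a timeout keeps the caller from waiting forever. The following is a sketch, not SDK code: `wait_for_call` is a hypothetical helper, and treating completed, failed, and no_answer as the terminal statuses is an assumption based on the status list above. It takes a callable so it can be exercised without a live client.

```python
import time
from typing import Callable

# Assumed terminal states, taken from the status list above.
TERMINAL_STATUSES = {"completed", "failed", "no_answer"}


def wait_for_call(
    get_status: Callable[[], str],
    interval: float = 2.0,
    timeout: float = 600.0,
) -> str:
    """Poll get_status() until it returns a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"call still {status!r} after {timeout} seconds")
        time.sleep(interval)


# Stand-in for `lambda: client.calls.get(call.id).status`.
fake_statuses = iter(["initiating", "ringing", "in_progress", "completed"])
print(wait_for_call(lambda: next(fake_statuses), interval=0.0))
```

With a real client, the callable would wrap the client.calls.get call shown above; a polling interval of a few seconds is usually enough for call-state changes.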

Transcripts & recordings

Every call produces a transcript. When record=True, it also produces an audio recording, stored in your configured object store (cloud) or in your VPC bucket or on-prem MinIO (self-hosted).

python

tx = client.transcripts.get(call.transcript_id)
for u in tx.utterances:
    who = "Agent" if u.speaker == 0 else "Caller"
    print(f"{who}: {u.text}")

Test your agent before you call real numbers

The Gym runs your agent against a suite of scripted callers and asserts on the results: tool calls made, facts extracted, and tone. Gate every agent change behind a green Gym run.
