Telephony & SIP

Three ways to get voice in and out: programmable-voice over WebSocket, native SIP, or CCaaS connectors. Same media pipeline underneath.

Package availability

Wordcab SDKs, CLI tools, Helm charts, model weights, and deployment packages are delivered directly to each customer for self-hosted installation. They are not published to public package registries, so the install commands in these docs are placeholders until your Wordcab team provides your private package source or offline bundle.

The chart's SIP gateway sub-chart handles on-prem PBX integration. Whichever transport you choose, the media layer is the same.

Twilio / Telnyx / Plivo media streams

Bridge the provider's media stream into Wordcab's WebSocket endpoint. Typical Twilio setup:

xml

<!-- Twilio voice webhook returns this TwiML -->
<Response>
  <Connect>
    <Stream url="wss://wordcab.apps.example.com/v1/media/twilio?agent_id=agent_abc" />
  </Connect>
</Response>

Wordcab receives 8 kHz G.711 μ-law audio frames over the WebSocket, runs STT → LLM → TTS, and writes the synthesized audio frames back on the same stream. Reference apps for Twilio, Telnyx, Plivo, and Zoom Phone ship in the chart under examples/telephony/.
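To make the frame format concrete, here is a minimal Python sketch of the Twilio Media Streams wire format as Twilio documents it: JSON text messages in which each "media" event carries base64-encoded μ-law audio. The function names are hypothetical and the code only illustrates the message shape. It is not Wordcab's internal implementation.

```python
import base64
import json
from typing import List, Optional


def ulaw_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit PCM sample."""
    u = ~byte & 0xFF                      # mu-law bytes are transmitted bit-inverted
    exponent = (u >> 4) & 0x07            # 3-bit segment number
    mantissa = u & 0x0F                   # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if u & 0x80 else magnitude


def decode_media_message(raw: str) -> Optional[List[int]]:
    """Return PCM samples for a 'media' event, or None for any other event."""
    msg = json.loads(raw)
    if msg.get("event") != "media":
        return None                       # e.g. 'connected', 'start', 'stop'
    ulaw = base64.b64decode(msg["media"]["payload"])
    return [ulaw_to_pcm16(b) for b in ulaw]


def encode_media_message(stream_sid: str, ulaw: bytes) -> str:
    """Wrap outbound mu-law audio in the JSON envelope sent back to Twilio."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(ulaw).decode("ascii")},
    })
```

At 8 kHz with one byte per sample, a 20 ms frame is 160 bytes of payload before base64 encoding.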

Native SIP (on-prem PBX)

For Avaya, Genesys, Cisco UCM, FreeSWITCH, or any SIP trunk, deploy the SIP gateway sub-chart. It speaks standard SIP + RTP / SRTP and sits inside the customer network, registered with the PBX.

bash

helm install wordcab-sip wordcab/sip-gateway \
  --namespace wordcab \
  --set trunk.host=pbx.internal.example.com \
  --set trunk.transport=tls \
  --set trunk.codec=g711a \
  --set trunk.srtp.enabled=true \
  --set agents.defaultId=agent_abc

Codec support

G.711 μ-law / A-law (8 kHz) — default telephony.
G.722 (16 kHz) — wideband where available.
Opus (variable) — for WebRTC / SIP-over-WebSocket.
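For capacity planning, the fixed-rate codecs above translate directly into RTP payload sizes. The Python sketch below works through the arithmetic. The codec keys other than g711a (which appears in the Helm example) are illustrative names, not confirmed chart values, and Opus is left out because its bitrate is variable.

```python
# Nominal bitrates in bits per second. G.722 samples at 16 kHz but still
# encodes to 64 kbit/s, the same wire rate as G.711.
CODEC_BITRATE_BPS = {
    "g711u": 64_000,   # assumed key name for mu-law
    "g711a": 64_000,   # A-law, as used in the Helm example
    "g722": 64_000,    # assumed key name for G.722
}


def payload_bytes(codec: str, frame_ms: int = 20) -> int:
    """Audio payload bytes in one RTP packet, excluding RTP/UDP/IP headers."""
    return CODEC_BITRATE_BPS[codec] * frame_ms // (8 * 1000)


def packets_per_second(frame_ms: int = 20) -> int:
    """Number of RTP packets sent per second in one direction."""
    return 1000 // frame_ms
```

With the common 20 ms packetization, each direction of a G.711 or G.722 call carries 50 packets per second of 160 payload bytes each.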

DTMF and transfer

DTMF sent as RTP telephone events (RFC 2833, since superseded by RFC 4733) is captured and made available to agent tools. Blind and attended transfer (SIP REFER) are supported: expose them as agent tools and the LLM can route calls to a human or a different queue.
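For readers debugging DTMF at the packet level, RFC 2833 and its successor RFC 4733 carry each key press as a 4-byte telephone-event payload inside RTP: an event code, an end-of-event flag, a volume, and a duration in timestamp units. The Python sketch below parses that payload. The type and function names are hypothetical, and how Wordcab surfaces the digits to agent tools is not shown here.

```python
import struct
from typing import NamedTuple

# Event codes 0-15 map to these keys, per the RFC's DTMF event table.
DTMF_KEYS = "0123456789*#ABCD"


class DtmfEvent(NamedTuple):
    digit: str
    end: bool        # True on the final packet(s) of a key press
    volume: int      # power level in -dBm0 (0 is loudest)
    duration: int    # in RTP timestamp units (8000 per second for telephony)


def parse_telephone_event(payload: bytes) -> DtmfEvent:
    """Parse the 4-byte RTP telephone-event payload."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(DTMF_KEYS):
        raise ValueError(f"not a DTMF event code: {event}")
    return DtmfEvent(
        digit=DTMF_KEYS[event],
        end=bool(flags & 0x80),      # E bit
        volume=flags & 0x3F,         # low 6 bits; bit 6 is reserved
        duration=duration,
    )
```

For example, the payload bytes 05 8A 03 20 describe the key 5 with the end flag set, a volume of 10, and a duration of 800 timestamp units, which is 100 ms at an 8 kHz clock.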

CCaaS connectors

Platform	Protocol	Status
Genesys Cloud	AudioHook	Supported — real-time STT, QA, and redaction on live calls.
Five9	VoiceStream	Supported — transcripts and QA signals delivered via webhook or Kafka back to Five9 reporting.
NICE CXone	Real-time Audio Streaming	On the roadmap (Q3 2026).
Zoom Phone	RTMS media gateway	Supported — business-communications workflows.

Recordings and retention

When record: true is set on an agent, full-call audio is written to the configured object store. On self-hosted deployments, that is your own bucket, MinIO instance, or NFS share. Retention is set per deployment: the default is 30 days on cloud, and it is configurable on self-hosted.
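If you audit or enforce retention yourself on a self-hosted store, the check reduces to comparing a recording's timestamp against the retention window. The Python sketch below assumes you can read a timezone-aware creation time for each object. The function name is hypothetical, and the 30-day constant simply mirrors the cloud default mentioned above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_RETENTION_DAYS = 30  # mirrors the cloud default; set your own value


def is_past_retention(
    recorded_at: datetime,
    retention_days: int = DEFAULT_RETENTION_DAYS,
    now: Optional[datetime] = None,
) -> bool:
    """True if a recording is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    return now - recorded_at > timedelta(days=retention_days)
```

In practice an object-store lifecycle rule is often a simpler way to apply the same cutoff than a custom job.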

Consent

Recording-consent rules vary by jurisdiction. Two-party-consent states, HIPAA contexts, and regulated industries each have different requirements. Wordcab does not decide this for you — configure the agent's opening line and your retention policy to match your compliance posture.
