Inbound SIP calls: caller hears silence; WebRTC works
Project: Dialogbrain (p_mu9nuejzw03)
Region: US East 1
Report time: 2026-04-21 ~22:44 UTC
Plan: Free (Agent Console available, 3/1000 agent minutes used)
Symptom
Inbound PSTN calls via Twilio SIP trunk → LiveKit Cloud → LiveKit Agents. The caller hears complete silence from the agent. The caller’s voice reaches the agent just fine (STT produces transcripts). This is one-way audio in the agent → caller direction only, and only over SIP. The same agent code over WebRTC (LiveKit Agent Console / our browser widget) plays agent audio correctly — so the agent itself is publishing audio properly.
Repro evidence
5 back-to-back SIP calls today, all ~32-33 sec (caller hangs up after hearing silence). All show SIP 200 OK, PCMU/8000 negotiated:
| Call ID | Started | Duration |
|---|---|---|
| SCL_zYRkmmqRXTdm | 10:25:36 PM | 32s |
| SCL_UpJeB5xHa8kS | 10:23:30 PM | 33s |
| SCL_hE8KTLG5TxTA | 10:18:17 PM | 33s |
| SCL_nxBu2XQUAPPS | 10:11:41 PM | 33s |
| SCL_HtpZMRuLmssf | 10:08:33 PM | 33s |
Telephony dashboard shows “5 Calls with issues” — all 5 flagged.
Trunk: ST_NJDrUAzjrefP (inbound catch-all for Twilio)
Dispatch rule: SDR_woZ8hCtUBynb (dispatchRuleDirect, room = {{.SIPCallTo.User}})
Provider call IDs on file (e.g. CA886d9dfc41ce912f0d56bd1f5259914e).
What we verified on our side
- Agent publishes audio correctly. Browser/WebRTC test of the same deployed worker (`A_WHXzBfHxVNj9`) plays the TTS greeting audibly. Agent logs show `tts metric: chars=49 audio_duration=5.39s provider=cartesia`.
- Track publish options: `red = False`, `source = TrackSource.SOURCE_MICROPHONE`. (We confirmed on a prior iteration that defaulting to `SOURCE_UNSPECIFIED` plus `audio/red` caused the SIP edge to not subscribe; fixing both did not fix the silence.)
- RTP flow to Twilio is structurally healthy. PCAP for `SCL_zYRkmmqRXTdm` shows 1497 outbound RTP packets (10.34.14.35:56903 → 168.86.136.92:12790):
  - All 216 bytes on wire (UDP payload 172 = 12 RTP hdr + 160 PCMU samples)
  - Payload type = 0 (PCMU)
  - Monotonic `seq` 0 → 1496, no gaps, no reorder
  - ~50 pps over 29.94 s
  - But the exported PCAP is snaplen-truncated at 56 bytes, so we cannot inspect the mu-law samples to tell if they’re real audio or silent `0xFF`. The return direction (Twilio → LiveKit) doesn’t appear in the export at all.
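
Once a full-length capture is available, a check along the following lines could separate real audio from digital silence. This is a minimal sketch that we have not run against the real capture. `parse_rtp`, `silence_ratio`, and `stream_duration_s` are hypothetical helper names, and the code assumes the raw RTP bytes have already been extracted from each UDP payload. The last helper only re-derives the expected stream length from the packet count (160 samples per packet at 8000 Hz is 20 ms per packet).

```python
import struct

# G.711 mu-law encodes +0 and -0 as 0xFF and 0x7F, so a stream of pure
# digital silence consists of only these two byte values.
PCMU_SILENCE = frozenset({0xFF, 0x7F})


def parse_rtp(packet: bytes):
    """Split one RTP packet into (payload_type, seq, timestamp, payload)."""
    if len(packet) < 12:
        raise ValueError("shorter than the fixed 12-byte RTP header")
    b0, b1, seq, ts, _ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    offset = 12 + 4 * (b0 & 0x0F)              # skip CSRC entries
    if b0 & 0x10:                              # header extension present
        (ext_words,) = struct.unpack("!H", packet[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    payload = packet[offset:]
    if b0 & 0x20 and payload and payload[-1]:  # strip RTP padding
        payload = payload[:-payload[-1]]
    return b1 & 0x7F, seq, ts, payload


def silence_ratio(payload: bytes) -> float:
    """Fraction of PCMU bytes that encode digital silence."""
    if not payload:
        raise ValueError("empty payload (capture may be snaplen-truncated)")
    return sum(b in PCMU_SILENCE for b in payload) / len(payload)


def stream_duration_s(packets: int, samples_per_packet: int = 160,
                      sample_rate: int = 8000) -> float:
    """Audio duration implied by a packet count at a fixed packetization."""
    return packets * samples_per_packet / sample_rate


if __name__ == "__main__":
    # Synthetic packet: version 2, payload type 0 (PCMU), 160 silent samples.
    header = struct.pack("!BBHII", 0x80, 0, 0, 0, 0x1234)
    pt, seq, ts, payload = parse_rtp(header + b"\xff" * 160)
    print(pt, seq, len(payload), silence_ratio(payload))
    print(stream_duration_s(1497))
```

A ratio at or near 1.0 across the whole call would point to silence being encoded upstream of the RTP sender. Note that this exact-match test would not flag comfort noise or very low-level audio, which use other byte values.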
Questions / requests
- Is there a way to obtain a full (non-truncated) PCAP for one of these call IDs? Or alternatively, can LiveKit support pull it internally and confirm whether the outbound PCMU samples contain actual audio or are silent/zero-valued?
- Is there a known issue with Opus → PCMU transcoding on the SIP egress path for US East 1, or specifically when `enable_recording=true` on the agent job? (Our logs show recording is enabled by default via `AgentDispatch`.)
- Is Twilio’s RTP reaching the edge symmetrically? The export appears to omit Twilio → LiveKit packets; we’d like to confirm LiveKit is advertising the correct media IP/port in its SDP answer for our project.
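
If the SDP answer for one of these calls can be shared, we could run the media-address check ourselves with something like the sketch below. `media_endpoints` and `is_routable` are hypothetical helper names, and `SAMPLE_ANSWER` is a made-up SDP body, not the real answer for our project. It reuses the private source address from the PCAP only as a placeholder, to show what a non-routable advertisement would look like.

```python
import ipaddress

# Hypothetical SDP answer for illustration only; not taken from a real call.
SAMPLE_ANSWER = "\r\n".join([
    "v=0",
    "o=- 0 0 IN IP4 10.34.14.35",
    "s=-",
    "c=IN IP4 10.34.14.35",
    "t=0 0",
    "m=audio 56903 RTP/AVP 0 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:101 telephone-event/8000",
])


def media_endpoints(sdp: str):
    """Return [(media, ip, port)] for each m= section of an SDP body."""
    session_ip = None
    sections = []
    for line in sdp.replace("\r\n", "\n").split("\n"):
        if line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            ip = line[2:].split()[2]
            if sections:
                sections[-1]["ip"] = ip      # media-level c= overrides
            else:
                session_ip = ip              # session-level default
        elif line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            media, port = line[2:].split()[:2]
            sections.append({"media": media,
                             "port": int(port.split("/")[0]),
                             "ip": session_ip})
    return [(s["media"], s["ip"], s["port"]) for s in sections]


def is_routable(ip: str) -> bool:
    """True if the address is globally routable (not private or reserved)."""
    return ipaddress.ip_address(ip).is_global


if __name__ == "__main__":
    for media, ip, port in media_endpoints(SAMPLE_ANSWER):
        print(media, ip, port, "routable" if is_routable(ip) else "NOT routable")
```

If the advertised audio address turned out to be non-routable from Twilio's side, that would be consistent with the missing return direction in the export, although it would not by itself explain silence toward the caller.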
Happy to share full agent logs, docker-compose, or run any further diagnostic scripts you suggest.