Inbound SIP calls: caller hears silence; WebRTC works (project p_mu9nuejzw03)

Project: Dialogbrain (p_mu9nuejzw03)
Region: US East 1
Report time: 2026-04-21 ~22:44 UTC
Plan: Free (Agent Console available, 3/1000 agent minutes used)

Symptom

Inbound PSTN calls via Twilio SIP trunk → LiveKit Cloud → LiveKit Agents. The caller hears complete silence from the agent. The caller’s voice reaches the agent just fine (STT produces transcripts). This is one-way audio in the agent → caller direction only, and only over SIP. The same agent code over WebRTC (LiveKit Agent Console / our browser widget) plays agent audio correctly — so the agent itself is publishing audio properly.

Repro evidence

5 back-to-back SIP calls today, all ~32-33 sec (caller hangs up after hearing silence). All show SIP 200 OK, PCMU/8000 negotiated:

Call ID            Started       Duration
SCL_zYRkmmqRXTdm   10:25:36 PM   32s
SCL_UpJeB5xHa8kS   10:23:30 PM   33s
SCL_hE8KTLG5TxTA   10:18:17 PM   33s
SCL_nxBu2XQUAPPS   10:11:41 PM   33s
SCL_HtpZMRuLmssf   10:08:33 PM   33s

Telephony dashboard shows “5 Calls with issues” — all 5 flagged.

Trunk: ST_NJDrUAzjrefP (inbound catch-all for Twilio)
Dispatch rule: SDR_woZ8hCtUBynb (dispatchRuleDirect, room = {{.SIPCallTo.User}})
Provider call IDs on file (e.g. CA886d9dfc41ce912f0d56bd1f5259914e).

What we verified on our side

  1. Agent publishes audio correctly. Browser/WebRTC test of the same deployed worker (A_WHXzBfHxVNj9) plays the TTS greeting audibly. Agent logs show tts metric: chars=49 audio_duration=5.39s provider=cartesia.

  2. Track publish options: red = False, source = TrackSource.SOURCE_MICROPHONE. (We confirmed on a prior iteration that defaulting to SOURCE_UNSPECIFIED plus audio/red caused the SIP edge to not subscribe — fixing both did not fix the silence.)

  3. RTP flow to Twilio is structurally healthy. PCAP for SCL_zYRkmmqRXTdm shows 1497 outbound RTP packets (10.34.14.35:56903 → 168.86.136.92:12790):

    • Every packet is 216 bytes on the wire (172-byte UDP payload = 12-byte RTP header + 160 PCMU samples)
    • Payload type = 0 (PCMU)
    • Monotonic seq 0 → 1496, no gaps, no reorder
    • ~50 pps over 29.94 s
    • But the exported PCAP is snaplen-truncated at 56 bytes, so we cannot inspect the mu-law samples to tell if they’re real audio or silent 0xFF. The return direction (Twilio → LiveKit) doesn’t appear in the export at all.
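
For reference, here is a minimal sketch of the check we would run if we had the full packets. It parses the fixed RTP header and tests whether a PCMU payload is pure digital silence. The function names are our own, and it assumes packets with no RTP header extension. On a header-only (truncated) packet the payload is empty and the check reports "not silent" because it has nothing to inspect.

```python
import struct

# G.711 mu-law encodes a zero-amplitude sample as 0xFF (+0) or 0x7F (-0).
PCMU_SILENCE = {0xFF, 0x7F}

SAMPLE_RATE_HZ = 8000
SAMPLES_PER_PACKET = 160  # 20 ms of PCMU per packet
PACKETS_PER_SECOND = SAMPLE_RATE_HZ / SAMPLES_PER_PACKET  # 50.0


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the fixed 12-byte RTP header (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def rtp_payload(packet: bytes) -> bytes:
    """Return the bytes after the header and CSRC list (extensions not handled)."""
    csrc_count = parse_rtp_header(packet)["csrc_count"]
    return packet[12 + 4 * csrc_count:]


def pcmu_is_silent(payload: bytes) -> bool:
    """True only if there is a payload and every sample is mu-law zero."""
    return len(payload) > 0 and all(b in PCMU_SILENCE for b in payload)


if __name__ == "__main__":
    # Synthetic packet: version 2, payload type 0 (PCMU), 160 silent samples.
    silent = struct.pack("!BBHII", 0x80, 0, 0, 0, 0x1234) + b"\xff" * SAMPLES_PER_PACKET
    print(len(silent), parse_rtp_header(silent)["payload_type"], pcmu_is_silent(rtp_payload(silent)))
    print(1497 / PACKETS_PER_SECOND)  # duration implied by 1497 packets at 50 pps
```

The arithmetic also lines up with the capture: 160 samples at 8000 Hz is 20 ms per packet, so 1497 packets correspond to 29.94 s of audio.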

Questions / requests

  1. Is there a way to obtain a full (non-truncated) PCAP for one of these call IDs? Or alternatively, can LiveKit support pull it internally and confirm whether the outbound PCMU samples contain actual audio or are silent/zero-valued?
  2. Is there a known issue with Opus → PCMU transcoding on the SIP egress path for US East 1, or specifically when enable_recording=true on the agent job? (Our logs show recording is enabled by default via AgentDispatch.)
  3. Is Twilio’s RTP reaching the edge symmetrically? The export appears to omit Twilio->LiveKit packets; we’d like to confirm LiveKit is advertising the correct media IP/port in its SDP answer for our project.
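
To make request 3 concrete, this is roughly how we would read the advertised media address and port out of an SDP answer once we have one. The SDP text below is a made-up example using a documentation IP address, not the answer from our calls, and the function name is our own.

```python
def media_endpoints(sdp: str) -> list[tuple[str, str, int]]:
    """Return (media, address, port) for each m= section of an SDP body.

    A media-level c= line overrides the session-level one.
    """
    session_addr = None
    sections = []  # each entry: [media, address, port]
    for line in sdp.replace("\r\n", "\n").split("\n"):
        if line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            addr = line[2:].split()[2]
            if sections:
                sections[-1][1] = addr
            else:
                session_addr = addr
        elif line.startswith("m="):
            # m=<media> <port>[/<count>] <proto> <fmt> ...
            media, port = line[2:].split()[:2]
            sections.append([media, session_addr, int(port.split("/")[0])])
    return [tuple(s) for s in sections]


# Hypothetical SDP answer, for illustration only.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 0 0 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 40000 RTP/AVP 0 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:101 telephone-event/8000",
])

if __name__ == "__main__":
    print(media_endpoints(EXAMPLE_SDP))
```

If the address in the answer is private or otherwise unreachable from Twilio, that would explain return RTP never arriving.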

Happy to share full agent logs, docker-compose, or run any further diagnostic scripts you suggest.

You can download the PCAP from the dashboard; for the SCL_zYRkmmqRXTdm call, for example, the link would be https://cloud.livekit.io/projects/p_/telephony/SCL_zYRkmmqRXTdm/inbound. That is the same log that LiveKit support has access to.

You asked: "Is Twilio’s RTP reaching the edge symmetrically? The export appears to omit Twilio->LiveKit packets; we’d like to confirm LiveKit is advertising the correct media IP/port in its SDP answer for our project."

I also see only one-way RTP packets in the PCAP for the call I mentioned above. As you say yourself, "the return direction (Twilio -> LiveKit) doesn't appear in the export at all." To me, that is a smoking gun.
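
If you want to confirm this from the export yourself, here is a minimal sketch that counts UDP packets per direction. It only needs the IP and UDP headers, so a snaplen-truncated capture is enough. It assumes a classic pcap file (not pcapng) with an Ethernet or Linux cooked (SLL) link layer and IPv4; the function names are my own.

```python
import struct
from collections import Counter

# Link-layer header sizes by pcap linktype: 1 = Ethernet, 113 = Linux cooked (SLL).
LINK_HEADER_LEN = {1: 14, 113: 16}


def count_udp_flows(pcap_bytes: bytes) -> Counter:
    """Count IPv4/UDP packets per ("src:port", "dst:port") direction."""
    magic = pcap_bytes[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype not in LINK_HEADER_LEN:
        raise ValueError(f"unsupported linktype {linktype}")
    link_len = LINK_HEADER_LEN[linktype]

    flows = Counter()
    offset = 24  # skip the global header
    while offset + 16 <= len(pcap_bytes):
        # Record header: ts_sec, ts_frac, captured length, original length.
        _, _, incl_len, _ = struct.unpack(endian + "IIII", pcap_bytes[offset:offset + 16])
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        ip = frame[link_len:]
        # Keep only IPv4 (version 4) carrying UDP (protocol 17).
        if len(ip) < 20 or ip[0] >> 4 != 4 or ip[9] != 17:
            continue
        ihl = (ip[0] & 0x0F) * 4
        if len(ip) < ihl + 4:
            continue
        src = ".".join(map(str, ip[12:16]))
        dst = ".".join(map(str, ip[16:20]))
        sport, dport = struct.unpack("!HH", ip[ihl:ihl + 4])
        flows[(f"{src}:{sport}", f"{dst}:{dport}")] += 1
    return flows


def flows_without_return(flows: Counter) -> list:
    """Directions for which no packet was seen going the other way."""
    return [f for f in flows if (f[1], f[0]) not in flows]
```

Calling `count_udp_flows(open("call.pcap", "rb").read())` (the file name is a placeholder) and passing the result to `flows_without_return` lists any direction that has no matching reverse flow in the capture.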

This kind of one-way audio issue comes up on the forum occasionally, and the typical root cause is a networking or firewall issue. @CWilson gave a very thorough answer in this post for diagnosing this kind of issue: