Outbound SIP call: Agent speaks before callee’s phone rings — missing 180 Ringing in PCAP
Problem
We’re using a LiveKit SIP trunk (via Twilio) for outbound calls. For certain destination numbers, the AI agent starts speaking before the callee’s phone has even started ringing. The callee hears only the tail end of the agent’s first utterance when they pick up, or misses it entirely.
We captured PCAPs of two calls from the same LiveKit SIP trunk to compare:
- Call A (problematic): Agent spoke before the callee’s phone rang
- Call B (normal): Agent spoke only after the callee answered
Both calls originate from the same LiveKit SIP endpoint (3tqm7ro6kbs.sip.livekit.cloud:9000) via Twilio (xxxxxx.pstn.twilio.com).
Our outbound call flow (code-level context)
Our application uses create_sip_participant with wait_until_answered=True to place the outbound call:
sip_participant_response = await livekit_api.sip.create_sip_participant(
    api.CreateSIPParticipantRequest(
        sip_trunk_id=trunk_id,
        room_name=room_name,
        sip_call_to=phone_number,
        sip_number=from_number,
        participant_identity=phone_number,
        play_dialtone=True,
        play_ringtone=True,
        wait_until_answered=True,
    )
)
Only after create_sip_participant returns successfully (i.e., the call is “answered” per SIP 200 OK) does our code proceed to start the agent’s response. So our application correctly waits for the call to be answered before speaking. The issue is that the SIP 200 OK arrives before the callee’s phone actually starts ringing.
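To make the ordering explicit, here is a minimal sketch of that flow. It is not our production code: `speak_after_answer`, `place_call`, and `start_agent` are hypothetical names, with `place_call` standing in for the create_sip_participant call above and `start_agent` for whatever triggers the agent's first utterance. It also measures the time until the dial step returns, which is what we compare against the PCAP timings.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def speak_after_answer(
    place_call: Callable[[], Awaitable[None]],
    start_agent: Callable[[], Awaitable[None]],
) -> float:
    """Await the dial step, then start the agent; return seconds until answer."""
    started = time.monotonic()
    # Hypothetical stand-in for create_sip_participant(wait_until_answered=True),
    # which returns once the SIP 200 OK has been received.
    await place_call()
    answer_delay = time.monotonic() - started
    # The agent is started only after the dial step has returned.
    await start_agent()
    return answer_delay
```

The point of the sketch is that the application has no signal other than the dial step returning, so a premature 200 OK is indistinguishable from a real answer at this layer.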
PCAP Evidence
Call A — Agent spoke too early (no 180 Ringing)
T=0.000s INVITE → Twilio
T=0.002s 100 Trying
T=0.023s 407 Proxy Auth
T=0.027s INVITE (with auth)
T=0.028s 100 Trying
← ** No 180 Ringing **
T=3.016s 200 OK ← Call "answered" in ~3s (Server: Twilio, Session Name: Twilio Media Gateway)
T=3.034s RTP starts ← Both directions, agent speaks immediately
Key observations:
- No 180 Ringing response between 100 Trying and 200 OK
- 200 OK arrives only ~3 seconds after the authenticated INVITE, too fast for a human to answer
- RTP begins 18ms after 200 OK with bidirectional audio immediately
- The 200 OK comes from Server: Twilio with Session Name: Twilio Media Gateway
- 200 OK has SDP with sendrecv, codec PCMU, media endpoint 168.86.138.29:12676
Call B — Normal behavior (180 Ringing present)
T=0.000s INVITE → Twilio
T=0.001s 100 Trying
T=0.021s 407 Proxy Auth
T=0.025s INVITE (with auth)
T=0.026s 100 Trying
T=1.627s 180 Ringing ← Phone is ringing (no SDP, Content-Length: 0)
T=12.449s 200 OK ← Callee answers after ~11s of ringing
T=12.458s RTP starts ← Agent speaks only now
T=39.721s BYE ← Normal call end
Key observations:
- 180 Ringing arrives at T=1.6s (no SDP body, signaling-only)
- 200 OK arrives at T=12.4s, consistent with a human answering after several rings
- RTP begins 9ms after 200 OK
- 200 OK has SDP with sendrecv, codec PCMU, media endpoint 168.86.139.31:14814
Side-by-side comparison
| | Call A (problematic) | Call B (normal) |
|---|---|---|
| 180 Ringing | Absent | Present at T=1.6s |
| 200 OK timing | T=3.0s (~3s post-INVITE) | T=12.4s (~12s post-INVITE) |
| First RTP packet | T=3.034s | T=12.458s |
| Agent spoke | Immediately on 200 OK | Immediately on 200 OK |
| Callee experience | Agent spoke before the phone rang; callee heard only the tail end of the disclosure or missed it | Normal: heard disclosure after picking up |
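For anyone who wants to run the same comparison on their own captures, the sketch below recomputes these rows from the two timelines. The `(time, label)` event format, the `summarize` and `is_suspect_answer` helpers, and the 5-second threshold are all assumptions made for illustration, not part of any LiveKit or Twilio API. A 183 Session Progress is counted as an alerting indication alongside 180 Ringing.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Event = Tuple[float, str]  # (seconds since the first INVITE, message label)

CALL_A: List[Event] = [
    (0.000, "INVITE"), (0.002, "100 Trying"), (0.023, "407 Proxy Auth"),
    (0.027, "INVITE (with auth)"), (0.028, "100 Trying"),
    (3.016, "200 OK"), (3.034, "RTP"),
]
CALL_B: List[Event] = [
    (0.000, "INVITE"), (0.001, "100 Trying"), (0.021, "407 Proxy Auth"),
    (0.025, "INVITE (with auth)"), (0.026, "100 Trying"),
    (1.627, "180 Ringing"), (12.449, "200 OK"), (12.458, "RTP"), (39.721, "BYE"),
]


@dataclass
class CallSummary:
    saw_alerting: bool             # any 180/183 seen before the 200 OK
    invite_to_ok: Optional[float]  # last INVITE to 200 OK, in seconds
    ok_to_rtp: Optional[float]     # 200 OK to first RTP packet, in seconds


def summarize(events: List[Event]) -> CallSummary:
    """Summarize one call leg from a time-ordered list of events."""
    last_invite = ok_at = rtp_at = None
    saw_alerting = False
    for t, label in events:
        if ok_at is None:
            if label.startswith("INVITE"):
                last_invite = t  # the authenticated INVITE overwrites the first
            elif label.startswith(("180", "183")):
                saw_alerting = True
            elif label.startswith("200"):
                ok_at = t
        elif rtp_at is None and label.startswith("RTP"):
            rtp_at = t
    invite_to_ok = None
    if ok_at is not None and last_invite is not None:
        invite_to_ok = ok_at - last_invite
    ok_to_rtp = rtp_at - ok_at if rtp_at is not None else None
    return CallSummary(saw_alerting, invite_to_ok, ok_to_rtp)


def is_suspect_answer(summary: CallSummary, min_plausible_answer_s: float = 5.0) -> bool:
    """Flag a 200 OK that had no alerting response and arrived very quickly."""
    return (
        summary.invite_to_ok is not None
        and not summary.saw_alerting
        and summary.invite_to_ok < min_plausible_answer_s
    )
```

With these inputs, `is_suspect_answer(summarize(CALL_A))` is true and the same check on `CALL_B` is false. The threshold is arbitrary; a callee can legitimately answer within a few seconds, so this is a heuristic for triage rather than proof of a false connect.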
Twilio recording confirms in-band ringback after 200 OK
We confirmed via the Twilio call recording for Call A that ringback tone (ringing sound) is audible in the audio after the SIP 200 OK was received. This means:
- Twilio (or the downstream carrier) sent a 200 OK at T=3s, establishing the media path
- Ringback tone was then played in-band over RTP — the callee’s phone was still ringing
- Our agent received the 200 OK, treated the call as answered, and started speaking
- The agent’s speech and the in-band ringback overlapped — the callee had not yet picked up
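One workaround we are considering is to hold the agent's first utterance while the inbound audio still sounds like ringback. The sketch below is only an illustration under several assumptions: the ringback reaches the agent on the inbound RTP stream, it uses the North American 440 Hz + 480 Hz tone pair (other regions use different tones), and the PCMU audio has already been decoded to linear samples at 8 kHz. The function names and thresholds are hypothetical. It uses the Goertzel algorithm to measure how much of a frame's energy sits at the two tone frequencies.

```python
import math
from typing import Sequence

# Assumption: North American ringback tone pair; other regions differ.
RINGBACK_TONES_HZ = (440.0, 480.0)


def goertzel_power(samples: Sequence[float], sample_rate: int, freq: float) -> float:
    """Return the squared DFT magnitude of `samples` at `freq` (Goertzel)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def looks_like_ringback(
    frame: Sequence[float],
    sample_rate: int = 8000,
    min_share_per_tone: float = 0.2,  # arbitrary illustrative threshold
    min_total_share: float = 0.8,     # arbitrary illustrative threshold
) -> bool:
    """True if most of the frame's energy is split across both ringback tones."""
    energy = sum(x * x for x in frame)
    if energy <= 1e-12:
        return False  # silence or empty frame
    # For a pure tone on a DFT bin, 2 * |X(k)|^2 / (N * energy) equals 1.
    shares = [
        2.0 * goertzel_power(frame, sample_rate, f) / (len(frame) * energy)
        for f in RINGBACK_TONES_HZ
    ]
    return min(shares) >= min_share_per_tone and sum(shares) >= min_total_share
```

A frame of 800 samples (100 ms at 8 kHz) puts both tones on exact 10 Hz bins. A real detector would also need to track the on/off cadence across frames and tolerate noise, so this single-frame check should be treated as a starting point only.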
Analysis
Our application behaves correctly in both cases — it waits for create_sip_participant (with wait_until_answered=True) to return, then starts agent speech after the SIP 200 OK is received.
In Call A, the SIP 200 OK arrives from Server: Twilio / Twilio Media Gateway after only 3 seconds with no prior 180 Ringing, and the Twilio recording confirms that in-band ringback was still playing after the 200 OK. The PCAP only captures the LiveKit ↔ Twilio leg, so we cannot see what is happening between Twilio and the downstream terminating carrier. Both calls use the same Twilio media IP range (168.86.138.x / 168.86.139.x) and SIP proxy (54.172.60.3 — AWS ec2-54-172-60-3.compute-1.amazonaws.com).
Questions
- Has anyone else encountered this with Twilio SIP trunking where certain destination numbers receive a 200 OK without a prior 180 Ringing?
- Is there a recommended way to handle this “early answer / false connect” scenario?
- Should this be raised with Twilio directly, given that the 200 OK originates from their media gateway?
Any guidance would be appreciated. Happy to share PCAPs privately if needed.