### Bug Description
We're seeing intermittent issues around SIP participant migration/reconnect while running `livekit-agents` (Python) on Kubernetes.
In LiveKit Analytics (LiveKit cloud dashboard) the SIP identity often shows up twice in the same room (leave + immediate rejoin), and in our logs we frequently see:
`livekit::rtc_engine - received session close: "server request to leave" Migration Resume`
After that point we see two related behaviors:
1) `ParticipantDisconnected` fires but `participant.disconnect_reason` is `null`
   - In Analytics, we see the SIP participant leave labeled "Participant left (MIGRATION)" and the agent participant leave labeled "Participant left (STATE_MISMATCH)".
   - We currently use `disconnect_reason` to decide whether to treat the disconnect as a migration/reconnect vs. a real hangup. When it's `null` we can't reliably distinguish the two, and we sometimes end the call early by deleting the room.
2) In other cases we never get `ParticipantDisconnected` at all
   - Shortly after Migration Resume we see `signal_event taking too much time: Answer(SessionDescription ...)`
   - Then ~60-90s later the agent job is force-killed because it doesn't exit cleanly
   - In some cases the call goes into `pc_state failed` indefinitely and does not recover unless the process is killed manually
Example logs for a single call (disconnect_reason is null):
```json
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"server request to leave\" Migration Resume", "level": "WARNING", "name": "livekit", "worker_id": "AW_mK6w97YC5D3J", "pid": 270148, "timestamp": "2026-01-23T20:03:21.285929+00:00"}
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"signal client closed: \\\"stream closed\\\"\" UnknownReason Resume", "level": "WARNING", "name": "livekit", "worker_id": "AW_mK6w97YC5D3J", "timestamp": "2026-01-23T20:03:21.286151+00:00"}
{"message": "livekit::rtc_engine:773:livekit::rtc_engine - resuming connection... attempt: 0", "level": "ERROR", "name": "livekit", "worker_id": "AW_mK6w97YC5D3J", "pid": 270148, "timestamp": "2026-01-23T20:03:21.286258+00:00"}
{"levelname": "INFO", "process": 270148, "event": "Participant Disconnected", "participant": "rtc.RemoteParticipant(sid=PA_KmSLmD9jKnjv, identity=sip_+XXXXXX, name=Phone +XXXXXX)", "kind": 3, "identity": "sip_+XXXX", "disconnect_reason": null, "timestamp": "2026-01-23T20:03:24.347553+00:00"}
```
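A minimal sketch of the app-side stopgap this ambiguity pushes us toward, assuming a null `disconnect_reason` should be treated as "maybe migrating" for a short grace window. `HangupDebouncer`, `grace_seconds`, and `on_hangup` are hypothetical names for illustration, not LiveKit APIs, and the right window length is an open question:

```python
import asyncio
from typing import Awaitable, Callable, Dict


class HangupDebouncer:
    """Delay hangup handling so a migrating participant can rejoin first.

    Hypothetical helper: nothing in this class is part of the LiveKit SDK.
    """

    def __init__(
        self,
        grace_seconds: float,
        on_hangup: Callable[[str], Awaitable[None]],
    ) -> None:
        self._grace = grace_seconds
        self._on_hangup = on_hangup
        self._pending: Dict[str, asyncio.Task] = {}

    def on_disconnected(self, identity: str) -> None:
        # Call when a participant leaves with a null disconnect reason.
        # Restart the timer if one is already pending for this identity.
        self._cancel(identity)
        self._pending[identity] = asyncio.create_task(self._expire(identity))

    def on_connected(self, identity: str) -> None:
        # Same identity rejoined inside the grace window: not a hangup.
        self._cancel(identity)

    def _cancel(self, identity: str) -> None:
        task = self._pending.pop(identity, None)
        if task is not None:
            task.cancel()

    async def _expire(self, identity: str) -> None:
        await asyncio.sleep(self._grace)
        self._pending.pop(identity, None)
        # Nobody rejoined in time, so treat this as a real hangup.
        await self._on_hangup(identity)
```

In an agent this would presumably be driven from the room's `participant_disconnected` and `participant_connected` callbacks, with `on_hangup` doing the room deletion. It only masks the problem: a genuine hangup is detected `grace_seconds` late, which is why we'd still prefer a reliable signal from the SDK.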
Example logs in the "stuck" case:
```json
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"server request to leave\" Migration Resume", "level": "WARNING", "name": "livekit", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "timestamp": "2026-01-30T22:09:45.271724+00:00"}
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"signal client closed: \\\"stream closed\\\"\" UnknownReason Resume", "level": "WARNING", "name": "livekit", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "timestamp": "2026-01-30T22:09:45.271995+00:00"}
{"message": "livekit::rtc_engine:773:livekit::rtc_engine - resuming connection... attempt: 0", "level": "ERROR", "name": "livekit", "pid": 139944, "timestamp": "2026-01-30T22:09:45.272345+00:00"}
{"message": "livekit::rtc_engine::rtc_session:715:livekit::rtc_engine::rtc_session - signal_event taking too much time: Answer(SessionDescription { r#type: \"answer\", sdp: \"v=0\\r\\no=- 6603763378326092889 1769810986 IN IP4 0.0.0.0\\r\\ns=-\\r\\nt=0 0\\r\\na=msid-semantic:WMS *\\r\\na=fingerprint:sha-256 2F:7D:65:03:79:2A:C8:E8:2A:DB:C8:EA:63:80:24:D0:E1:A5:0A:99:ED:84:F8:5E:64:3C:E8:38:1E:EC:74:5B\\r\\na=ice-lite\\r\\na=extmap-allow-mixed\\r\\na=group:BUNDLE 0 1\\r\\nm=audio 9 UDP/TLS/RTP/SAVPF 63 111 0 8\\r\\nc=IN IP4 0.0.0.0\\r\\na=setup:active\\r\\na=mid:0\\r\\na=ice-ufrag:moWOANYkpxRtTanA\\r\\na=ice-pwd:tUgqKFpgfzQQFgccENCyKxxIPUQJOHnC\\r\\na=rtcp-mux\\r\\na=rtcp-rsize\\r\\na=rtpmap:63 red/48000/2\\r\\na=fmtp:63 111/111\\r\\na=rtpmap:111 opus/48000/2\\r\\na=fmtp:111 minptime=10;useinbandfec=1;usedtx=1\\r\\na=rtpmap:0 PCMU/8000\\r\\na=rtpmap:8 PCMA/8000\\r\\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\\r\\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\\r\\na=recvonly\\r\\nm=application 9 UDP/DTLS/SCTP webrtc-datachannel\\r\\nc=IN IP4 0.0.0.0\\r\\na=setup:active\\r\\na=mid:1\\r\\na=sendrecv\\r\\na=sctp-port:5000\\r\\na=max-message-size:65535\\r\\na=ice-ufrag:moWOANYkpxRtTanA\\r\\na=ice-pwd:tUgqKFpgfzQQFgccENCyKxxIPUQJOHnC\\r\\n\", id: 0, mid_to_track_id: {} })", "level": "ERROR", "name": "livekit", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "timestamp": "2026-01-30T22:09:56.411066+00:00"}
{"message": "livekit::rtc_engine::rtc_session:1041:livekit::rtc_engine::rtc_session - Subscriber pc state failed", "level": "ERROR", "name": "livekit", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "room_id": "RM_RWpxP322Ynvs", "timestamp": "2026-01-30T22:10:05.096736+00:00"}
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"pc_state failed\" UnknownReason Resume", "level": "WARNING", "name": "livekit", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "room_id": "RM_RWpxP322Ynvs", "timestamp": "2026-01-30T22:10:05.097985+00:00"}
{"message": "livekit::rtc_engine:773:livekit::rtc_engine - resuming connection... attempt: 0", "level": "ERROR", "name": "livekit", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "room_id": "RM_RWpxP322Ynvs", "timestamp": "2026-01-30T22:10:05.098116+00:00"}
{"message": "process did not exit in time, killing process", "level": "ERROR", "name": "livekit.agents", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "room_id": "RM_RWpxP322Ynvs", "timestamp": "2026-01-30T22:11:06.059865+00:00"}
{"message": "killing process", "level": "INFO", "name": "livekit.agents", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "room_id": "RM_RWpxP322Ynvs", "timestamp": "2026-01-30T22:11:06.060096+00:00"}
{"message": "sending SIGUSR1 signal to process", "level": "INFO", "name": "livekit.agents", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "room_id": "RM_RWpxP322Ynvs", "timestamp": "2026-01-30T22:11:06.060173+00:00"}
{"message": "process exited with non-zero exit code -10", "level": "ERROR", "name": "livekit.agents", "pid": 139944, "job_id": "AJ_vau2WWRWuKjG", "room_id": "RM_RWpxP322Ynvs", "timestamp": "2026-01-30T22:11:06.085801+00:00"}
```
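For triage, the delay between the slow signaling event and the forced kill can be measured directly from the JSON log lines. The snippet below is a small sketch of that; `seconds_between` is a hypothetical helper, and `SAMPLE` is abbreviated from the stuck-case logs above (only `message` and `timestamp` are kept, with messages shortened):

```python
import json
from datetime import datetime
from typing import Iterable, Optional


def seconds_between(
    lines: Iterable[str], start_marker: str, end_marker: str
) -> Optional[float]:
    """Seconds from the first record containing start_marker to the first
    later record containing end_marker; None if either one is missing."""
    started_at = None
    for raw in lines:
        try:
            record = json.loads(raw)
            message = record["message"]
            stamp = datetime.fromisoformat(record["timestamp"])
        except (ValueError, KeyError, TypeError):
            continue  # separators, blank lines, or records without these fields
        if started_at is None:
            if start_marker in message:
                started_at = stamp
        elif end_marker in message:
            return (stamp - started_at).total_seconds()
    return None


# Abbreviated records for job AJ_vau2WWRWuKjG (timestamps copied from the logs).
SAMPLE = [
    '{"message": "signal_event taking too much time: Answer(SessionDescription ...)", "timestamp": "2026-01-30T22:09:56.411066+00:00"}',
    '{"message": "Subscriber pc state failed", "timestamp": "2026-01-30T22:10:05.096736+00:00"}',
    '{"message": "process did not exit in time, killing process", "timestamp": "2026-01-30T22:11:06.059865+00:00"}',
]

gap = seconds_between(
    SAMPLE, "signal_event taking too much time", "process did not exit in time"
)
print(gap)
```

For this job the gap comes out at roughly 70 s, in line with the ~60-90s range we see across the stuck cases.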
Questions:
1. What typically triggers a SIP participant “Migration” in LiveKit Cloud: SIP/dialer behavior (trunk/provider, SIP edge, network) or a LiveKit-initiated migration? Are there recommended configuration changes or best practices to reduce how often this happens (or avoid it entirely)? Also, when migration happens, `ParticipantDisconnected.disconnect_reason` is sometimes `null` (which is derived from the “Unknown” status, [per the code here](https://github.com/livekit/python-sdks/blob/main/livekit-rtc/livekit/rtc/participant.py#L142C1-L144C1)), which makes it hard to safely decide whether to keep the call alive. What is the correct signal or recommended handling here?
2. What does “STATE_MISMATCH” mean in this context (what state is mismatching, and why)? What are the common root causes, and are there specific mitigations we can apply to reduce the frequency of these events?
### Expected Behavior
- If the server considers a SIP participant leave to be migration/reconnect, the SDK callback should include a non-null `disconnect_reason` (or another reliable signal) so apps can avoid treating it as a real hangup.
- If Analytics shows SIP participant "MIGRATION" or agent participant "STATE_MISMATCH", we'd expect the SDK callback to report something consistent.
- Slow WebRTC signaling (`Answer(SessionDescription ...)`) during/after Migration Resume should ideally be handled inside the RTC package.
### Reproduction Steps
We don't have a deterministic minimal repro yet (production-only so far), but we have provided additional room/job IDs and full logs.
- I have ensured this is not a networking issue on our end
- I have also looked at the SIP PCAP logs available on the LiveKit dashboard but didn't notice any inconsistency
### Operating System
Linux (containerized on AWS EKS, Kubernetes, us-east-2, Python 3.11)
### Models Used
STT: Deepgram; TTS: ElevenLabs; VAD: Silero
### Package Versions
```bash
livekit==1.0.23
livekit-agents==1.3.10
livekit-api==1.0.7
livekit-blingfire==1.1.0
livekit-plugins-anthropic==1.3.10
livekit-plugins-cartesia==1.3.10
livekit-plugins-deepgram==1.3.10
livekit-plugins-elevenlabs==1.3.10
livekit-plugins-google==1.3.10
livekit-plugins-noise-cancellation==0.2.5
livekit-plugins-openai==1.3.10
livekit-plugins-silero==1.3.10
livekit-plugins-turn-detector==1.3.10
livekit-protocol==1.1.1
```
### Session/Room/Call IDs
Below are Room IDs:
LiveKit Cloud Project ID: `p_3tqm7ro6kbs`
#### Case A - `disconnect_reason: null`:
| Date | Job ID | Room ID | Trunk Provider |
|---|---|---|---|
| 2026-01-30 | AJ_LgybQoHWHrkk | RM_qbb2qTaMaNji | TCN |
| 2026-01-30 | AJ_wvD4L2VH6daA | RM_dkqfDw3QVRBR | TCN |
| 2026-01-29 | AJ_8As8YVjvbSP2 | RM_8Vi9SAaw9bPb | TCN |
| 2026-01-23 | AJ_aJdzGbmjGSjJ | RM_fxTBN2HdgiWD | TCN |
| 2026-01-12 | AJ_HRXkAVJJGjZg | RM_NrLqAFpnZghB | TCN |
| 2026-01-09 | AJ_toohMYkYky5w | RM_m7FRVEEpXb6H | TCN |
| 2026-01-08 | AJ_Q6wXWJt2cysu | RM_ugZKppSeJ6kt | TCN |
| 2026-01-05 | AJ_saxUGSUn582i | RM_VBwD4aEFvjeW | TCN |
| 2025-12-30 | AJ_7stG8Zwipi3d | RM_UYLyTqfXguKo | TCN |
#### Case B - `signal_event taking too much time` + forced kill (STATE_MISMATCH on agent participant unless noted)
| Date | Job ID | Room ID | Trunk Provider | Agent `STATE_MISMATCH`? | Notes |
|---|---|---|---|---|---|
| 2026-01-30 | AJ_vau2WWRWuKjG | RM_RWpxP322Ynvs | TCN | Yes | |
| 2026-01-30 | AJ_FxykEihmLkRZ | RM_xwGg6ycTR5mM | TCN | Yes | |
| 2026-01-30 | AJ_pGHoG2R4rsGh | RM_H35AZSeWg7Rw | TCN | Yes | |
| 2026-01-30 | AJ_65HkxUK6sN5C | RM_uwV8tvyThkCh | TCN | Yes | |
| 2026-01-30 | AJ_prtvEq4PmyfR | RM_CszADLmSGjYG | TCN | No | Process killed with log `signal_event taking too much time` |
| 2026-01-30 | AJ_Fcz7bngRtZeT | RM_NWiZ4dYXzoop | TCN | Yes | |
| 2026-01-28 | AJ_83dBFfUka3X8 | RM_zhSMDQisi85x | TCN | Yes | |
| 2026-01-22 | AJ_Yq94XDEtxGHy | RM_BNogVezdPxUR | Twilio | Yes | |
| 2026-01-05 | AJ_zGnGbJRYqisF | RM_cf6Zih5UK2k3 | TCN | No | Process killed with log `signal_event taking too much time` |
| 2026-01-05 | AJ_KRSHaoG93jh9 | RM_t2DUGMJvgwmq | TCN | Yes | |
#### Case B.2 - hang (not auto-killed; requires manual intervention)
| Date | Job ID | Room ID | Trunk Provider | Notes |
|---|---|---|---|---|
| 2026-02-04 | AJ_ak4H8xArtcWr | RM_qVJj2MFdhfo6 | TCN | Process hangs indefinitely; requires manual kill |
Logs for Case B.2:
```json
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"server request to leave\" Migration Resume", "level": "WARNING", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "call_id": "call_AJ_ak4H8xArtcWr", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:07:47.178137+00:00"}
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"signal client closed: \\\"stream closed\\\"\" UnknownReason Resume", "level": "WARNING", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:07:47.178858+00:00"}
{"message": "livekit::rtc_engine:773:livekit::rtc_engine - resuming connection... attempt: 0", "level": "ERROR", "name": "livekit", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:07:47.179020+00:00"}
{"message": "livekit::rtc_engine::rtc_session:715:livekit::rtc_engine::rtc_session - signal_event taking too much time: Answer(SessionDescription { r#type: \"answer\", sdp: \"v=0\\r\\no=- 7106784265579657634 1770149268 IN IP4 0.0.0.0\\r\\ns=-\\r\\nt=0 0\\r\\na=msid-semantic:WMS *\\r\\na=fingerprint:sha-256 49:6D:BC:DF:5C:CC:07:18:6D:4A:9D:76:71:27:EF:74:AC:51:43:8B:9A:8A:4D:DC:B0:46:8C:79:A5:1A:FF:C9\\r\\na=ice-lite\\r\\na=extmap-allow-mixed\\r\\na=group:BUNDLE 0 1\\r\\nm=audio 9 UDP/TLS/RTP/SAVPF 63 111 0 8\\r\\nc=IN IP4 0.0.0.0\\r\\na=setup:active\\r\\na=mid:0\\r\\na=ice-ufrag:NoojDQQFKcpybZYn\\r\\na=ice-pwd:YtZlOyYwiYbhwLCAoJGjulBjJIMPwjqw\\r\\na=rtcp-mux\\r\\na=rtcp-rsize\\r\\na=rtpmap:63 red/48000/2\\r\\na=fmtp:63 111/111\\r\\na=rtpmap:111 opus/48000/2\\r\\na=fmtp:111 minptime=10;useinbandfec=1;usedtx=1\\r\\na=rtpmap:0 PCMU/8000\\r\\na=rtpmap:8 PCMA/8000\\r\\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\\r\\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\\r\\na=recvonly\\r\\nm=application 9 UDP/DTLS/SCTP webrtc-datachannel\\r\\nc=IN IP4 0.0.0.0\\r\\na=setup:active\\r\\na=mid:1\\r\\na=sendrecv\\r\\na=sctp-port:5000\\r\\na=max-message-size:65535\\r\\na=ice-ufrag:NoojDQQFKcpybZYn\\r\\na=ice-pwd:YtZlOyYwiYbhwLCAoJGjulBjJIMPwjqw\\r\\n\", id: 0, mid_to_track_id: {} })", "level": "ERROR", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:07:58.256015+00:00"}
{"message": "livekit::rtc_engine::rtc_session:1041:livekit::rtc_engine::rtc_session - Subscriber pc state failed", "level": "ERROR", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:08:07.207458+00:00"}
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"pc_state failed\" UnknownReason Resume", "level": "WARNING", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:08:07.208092+00:00"}
{"message": "livekit::rtc_engine:773:livekit::rtc_engine - resuming connection... attempt: 0", "level": "ERROR", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:08:07.208260+00:00"}
{"message": "livekit::rtc_engine::rtc_session:870:livekit::rtc_engine::rtc_session - Wrong packet sequence while retrying: 135 > 128, 7 packets missing", "level": "WARNING", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:08:07.309522+00:00"}
---below logs are generated every 5 mins until process is killed manually---
{"message": "livekit::rtc_engine::rtc_session:1041:livekit::rtc_engine::rtc_session - Publisher pc state failed", "level": "ERROR", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:11:41.134958+00:00"}
{"message": "livekit::rtc_engine:474:livekit::rtc_engine - received session close: \"pc_state failed\" UnknownReason Resume", "level": "WARNING", "name": "livekit", "worker_id": "AW_FoKfHKbkVaAq", "pid": 186848, "job_id": "AJ_ak4H8xArtcWr", "room_id": "RM_qVJj2MFdhfo6", "timestamp": "2026-02-03T20:11:41.135196+00:00"}
```
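Until the hang itself is understood, a watchdog on the reconnect state is one way to avoid needing a manual kill. This is a rough sketch only: `ReconnectWatchdog`, `timeout_seconds`, and `on_stuck` are hypothetical names, and it assumes the app can observe "reconnecting" and "reconnected" transitions (whether those fire reliably in this stuck state is something we have not verified):

```python
import asyncio
from typing import Awaitable, Callable, Optional


class ReconnectWatchdog:
    """Fire on_stuck if a reconnect does not complete within a deadline.

    Hypothetical helper: nothing in this class is part of the LiveKit SDK.
    """

    def __init__(
        self,
        timeout_seconds: float,
        on_stuck: Callable[[], Awaitable[None]],
    ) -> None:
        self._timeout = timeout_seconds
        self._on_stuck = on_stuck
        self._task: Optional[asyncio.Task] = None

    def reconnecting(self) -> None:
        # Start timing on the first attempt only, so repeated resume
        # attempts do not keep pushing the deadline back.
        if self._task is None:
            self._task = asyncio.create_task(self._run())

    def reconnected(self) -> None:
        # Connection recovered: disarm the watchdog.
        if self._task is not None:
            self._task.cancel()
            self._task = None

    async def _run(self) -> None:
        await asyncio.sleep(self._timeout)
        self._task = None
        # Still reconnecting after the deadline: let the app bail out.
        await self._on_stuck()
```

`on_stuck` would be whatever cleanup the app considers safe, for example ending the call and shutting the job down. Because the deadline is not reset by later attempts, the repeating `pc_state failed` / `resuming connection` loop shown above would still trip it once.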
### Proposed Solution
N/A
### Additional Context
- This happens intermittently (production only so far); we don't have a deterministic repro.
- Trunk providers: mostly TCN; one occurrence via Twilio.
- In Analytics the SIP identity often appears twice in the same room (leave + immediate rejoin) around the time we see "Migration Resume".
- In the stuck cases, `process did not exit in time, killing process` happens ~60-90s after `signal_event taking too much time`.
### Screenshots and Recordings
_No response_