Multiple Unstable Connection Errors During Active Sessions — Losing Confidence in Production Reliability

Hi LiveKit Team,

We have been implementing LiveKit in our product and are experiencing frequent, unpredictable errors during active sessions. These errors occur randomly, sometimes just 2 minutes after a session starts, and are seriously undermining our confidence in using LiveKit in production.

Errors we are seeing:

1. Browser console:

WebSocket is already in CLOSING or CLOSED state.

2. LiveKit agent logs:

unhandled websocket message Err(Io(Os { code: 54, kind: ConnectionReset, message: "Connection reset by peer" }))

received session close: "signal client closed: stream closed" UnknownReason Resume

resuming connection... attempt: 0

error running user callback for local_track_subscribed:
asyncio.exceptions.InvalidStateError: invalid state

lpublication._first_subscription.set_result(None)
asyncio.exceptions.InvalidStateError: invalid state

What we observe:

  • Session starts normally

  • After approximately 2 minutes, the WebSocket connection drops with “Connection reset by peer”

What we need help with:

  1. Why is the WebSocket connection resetting after ~2 minutes?

  2. Why is _first_subscription.set_result(None) throwing InvalidStateError? Is this a known bug in the SDK?

  3. Is there a recommended way to handle connection resumption more reliably?

  4. Are there any known stability issues with the current Python SDK version we should be aware of?
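
For context on question 2: as far as we understand, asyncio raises InvalidStateError when set_result is called on a future that is already done (resolved twice, or cancelled first). The sketch below reproduces that outside LiveKit using only the standard library. It is our assumption, not a confirmed diagnosis, that local_track_subscribed firing again after the resume hits the same condition; resolve_once is a hypothetical helper of ours and not part of the SDK.

```python
import asyncio


def resolve_once(fut):
    """Hypothetical helper: resolve the future only if it is still pending."""
    if fut.done():
        return False
    fut.set_result(None)
    return True


async def demo():
    events = []
    loop = asyncio.get_running_loop()

    # Unguarded: a second set_result on an already-resolved future raises.
    fut = loop.create_future()
    fut.set_result(None)
    try:
        fut.set_result(None)
    except asyncio.InvalidStateError:
        events.append("second set_result raised InvalidStateError")

    # Guarded: checking done() first turns the repeat call into a no-op.
    guarded = loop.create_future()
    events.append(f"first guarded call resolved: {resolve_once(guarded)}")
    events.append(f"second guarded call resolved: {resolve_once(guarded)}")
    return events


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

If the SDK callback can legitimately fire more than once per publication after a resume, a done() check like this would avoid the exception, but we cannot tell from our side whether that is the intended behavior.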

We are currently evaluating LiveKit for a production use case and these random disconnections are blocking us from moving forward with confidence. Any guidance or fixes would be greatly appreciated.

Room ID for reference: RM_heFNBUgtcTfj
Job ID: AJ_8nL76sPJ5af7

Thank you for your support.

What do you see in your agent logs?

Room: RM_heFNBUgtcTfj