Signal connection times out on the "v0 path" at agent join, forcing a fallback that adds 0.5–5s of call-setup latency

@CWilson - continuing our conversation about the signal connection timeout in this thread.

We now have good headroom on our agent pods, along with autoscaling, to serve the current traffic without delays. You should see fewer "worker is at full capacity" messages and zero "maximum attempts reached" failures. This is just FYI.

Coming to the signal connection issue: per your analysis, it is a networking issue and NOT a capacity issue. I dug through all the VPC flow logs from our pod. No packets were dropped, and I don’t see any port exhaustion errors on our NAT gateway.

Below is the log line we are still seeing, followed by the affected rooms:
livekit_api::signal_client:287 - signal connection failed on v0 path: Timeout("signal connection timed out")

All 13 occurrences (project p_3tqm7ro6kbs):

  ┌────────────┬─────────────────┬─────────────────┐
  │ Time (UTC) │     Room ID     │     Worker      │
  ├────────────┼─────────────────┼─────────────────┤
  │ 01:59:44   │ RM_h6BiM7yDYd2B │ AW_8vLrg6NX7uMv │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:08:11   │ RM_YoTqjYV6L2W7 │ AW_yLX7Bv8Bd432 │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:08:14   │ RM_pyfSxbSMUNtr │ AW_dVcPdmxwMBnJ │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:11:43   │ RM_viqZWyBwprUv │ AW_tU9QX4DdqtWU │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:12:10   │ RM_hYLfqi3N2cLz │ AW_U8JhQadVJuHB │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:13:31   │ RM_fmWoHHuddG3Y │ AW_dVcPdmxwMBnJ │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:14:39   │ RM_Hf4is9dcsqfZ │ AW_yLX7Bv8Bd432 │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:15:33   │ RM_5w8mMZmsThaV │ AW_yLX7Bv8Bd432 │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:18:00   │ RM_RZy6ZSd35Ttt │ AW_U8JhQadVJuHB │
  ├────────────┼─────────────────┼─────────────────┤
  │ 16:18:02   │ RM_RqxqgzSMGZxX │ AW_tU9QX4DdqtWU │
  ├────────────┼─────────────────┼─────────────────┤
  │ 17:28:56   │ RM_tMvrdWvsELkC │ AW_tU9QX4DdqtWU │
  ├────────────┼─────────────────┼─────────────────┤
  │ 17:29:10   │ RM_CUcFx3ZTWDK7 │ AW_tU9QX4DdqtWU │
  ├────────────┼─────────────────┼─────────────────┤
  │ 17:29:15   │ RM_AXFFcirDv3NB │ AW_yLX7Bv8Bd432 │
  └────────────┴─────────────────┴─────────────────┘
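
In case it helps with triage, here is a minimal sketch that tallies the occurrences above by worker and by hour. The rows are copied from the table; the variable names (occurrences, by_worker, by_hour) are only illustrative and not part of any LiveKit tooling.

```python
from collections import Counter

# (time UTC, room ID, worker ID), copied from the table above
occurrences = [
    ("01:59:44", "RM_h6BiM7yDYd2B", "AW_8vLrg6NX7uMv"),
    ("16:08:11", "RM_YoTqjYV6L2W7", "AW_yLX7Bv8Bd432"),
    ("16:08:14", "RM_pyfSxbSMUNtr", "AW_dVcPdmxwMBnJ"),
    ("16:11:43", "RM_viqZWyBwprUv", "AW_tU9QX4DdqtWU"),
    ("16:12:10", "RM_hYLfqi3N2cLz", "AW_U8JhQadVJuHB"),
    ("16:13:31", "RM_fmWoHHuddG3Y", "AW_dVcPdmxwMBnJ"),
    ("16:14:39", "RM_Hf4is9dcsqfZ", "AW_yLX7Bv8Bd432"),
    ("16:15:33", "RM_5w8mMZmsThaV", "AW_yLX7Bv8Bd432"),
    ("16:18:00", "RM_RZy6ZSd35Ttt", "AW_U8JhQadVJuHB"),
    ("16:18:02", "RM_RqxqgzSMGZxX", "AW_tU9QX4DdqtWU"),
    ("17:28:56", "RM_tMvrdWvsELkC", "AW_tU9QX4DdqtWU"),
    ("17:29:10", "RM_CUcFx3ZTWDK7", "AW_tU9QX4DdqtWU"),
    ("17:29:15", "RM_AXFFcirDv3NB", "AW_yLX7Bv8Bd432"),
]

# Count timeouts per worker and per hour of day (first two characters of HH:MM:SS)
by_worker = Counter(worker for _, _, worker in occurrences)
by_hour = Counter(time[:2] for time, _, _ in occurrences)

for worker, count in by_worker.most_common():
    print(f"{worker}: {count}")
for hour, count in sorted(by_hour.items()):
    print(f"{hour}:00 UTC: {count}")
```

On this data the timeouts are spread across all five workers rather than tied to a single one, and 9 of the 13 fall within the 16:00 UTC hour.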

Can you please check what happened here? :folded_hands: