Cloud-deployed agent never connects

Hi everyone,

I’m stuck getting a Python voice agent to run on LiveKit Cloud.

Local dev works; cloud deployment silently fails.


Goal / expected
Deploy the same voice agent to LiveKit Cloud and have it auto-join rooms created by my React front-end or the Agent Playground.


Actual behaviour

  • Agent Playground shows spinner forever.

  • My front-end shows no error in console / network tabs and the agent never appears.

  • Cloud dashboard → Agents tab shows the deployment as “Running”, with a concurrent agent session count that keeps going up even though the Sessions tab shows that all sessions have ended.

  • I can’t find any proper logs, and the dashboard doesn’t show any errors.


Environment

  • livekit-agents ~= 1.3
  • livekit-plugins-noise-cancellation ~= 0.2
  • Python 3.12
  • lk version 2.12.9

Any ideas how to debug further or what I missed?

Thanks a million!

Hi @Sal_Hasan. This issue usually means the agent is accepting jobs but crashing before it actually joins the room. The most likely cause is an environment difference, for example env vars (like API keys) that exist in your local .env but weren’t set in the cloud deployment.

Can you check the agent logs in the Cloud dashboard (Agents → your agent → Logs)? If there’s nothing useful there, try adding more logs, redeploying with LOG_LEVEL=DEBUG, and sharing what you see. That should pinpoint exactly where it’s failing.

Hello @cdutr,

Thanks for the suggestion! I added detailed logs and redeployed with LOG_LEVEL=DEBUG.

From the Cloud logs, the agent appears to start successfully with no crash or traceback. However, the logs don’t show any room connection, or any of the other output I see in local dev. What they do show:

  • worker starts
  • plugins preload
  • process initializes
  • worker registers successfully (registered worker, protocol 16, region shown)

I also log an environment snapshot at startup, and all expected keys are present in cloud.

So at least from startup logs, I’m not seeing an env-var-missing issue or early process crash.
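For anyone wanting to reproduce this kind of check, a startup snapshot along these lines is one way to do it. This is only a minimal sketch: the key names in REQUIRED_KEYS and the helper names env_snapshot / log_env_snapshot are assumptions for illustration, not something from my actual agent, and it logs only whether each key is present, never the value.

```python
import logging
import os

logger = logging.getLogger("agent.startup")

# Assumed key names for illustration; list whatever your agent actually reads.
REQUIRED_KEYS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def env_snapshot(keys=REQUIRED_KEYS, environ=None):
    """Return {key: bool} telling whether each key is set and non-empty."""
    env = os.environ if environ is None else environ
    return {key: bool(env.get(key, "").strip()) for key in keys}


def log_env_snapshot(keys=REQUIRED_KEYS, environ=None):
    """Log presence (not values) of each key and return the missing ones."""
    snapshot = env_snapshot(keys, environ)
    missing = [key for key, present in snapshot.items() if not present]
    logger.info("env snapshot: %s", snapshot)
    if missing:
        logger.error("missing env vars: %s", ", ".join(missing))
    return missing
```

Calling log_env_snapshot() once at process start puts a single line in the Cloud logs that can be compared against local dev.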

I also added on_session_end for observability & made sure it’s enabled in project settings, but Session > Agent Insights is still empty.

What can I do next to try to resolve the issue?

I ran into something similar, deploying agents that worked locally but went silent in the cloud.

The pattern you’re describing (worker registers, concurrency climbs, but no room join) usually means the job is accepted, but the entrypoint never reaches ctx.connect().

In my case, the root cause wasn’t LiveKit itself; it was plugin initialization happening before the room connection. If a native plugin fails (CPU flags, missing shared libs, container differences), it can fail quietly, and the job never reaches the connect step.

A few things I’d try:

  1. Move await ctx.connect() to the very start of the entrypoint and log immediately before/after it. Make the room join explicit and observable.

  2. Temporarily remove livekit-plugins-noise-cancellation as a diagnostic. If agents start connecting to the cloud immediately, that’s likely the culprit.

  3. Wrap plugin initialization in a try/except and log the failure explicitly. If the plugin fails, allow the session to continue without it rather than silently stalling.
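Steps 1 and 3 could look roughly like the sketch below. Treat it as an illustration under assumptions, not the official pattern: ctx is assumed to be the job context your entrypoint receives (anything with an async connect() works here), load_optional_plugin is a hypothetical helper, and the module path livekit.plugins.noise_cancellation is my assumption about how the plugin is imported in your project.

```python
import asyncio
import importlib
import logging

logger = logging.getLogger("agent.entrypoint")


def load_optional_plugin(module_name):
    """Import a plugin module; log and return None instead of raising."""
    try:
        return importlib.import_module(module_name)
    except Exception:
        # Log the full traceback so the failure shows up in the Cloud logs.
        logger.exception("plugin %s failed to load; continuing without it", module_name)
        return None


async def entrypoint(ctx, plugin_module="livekit.plugins.noise_cancellation"):
    # Step 1: join the room first, with a log line on each side of the call.
    logger.info("entrypoint started, connecting to room")
    await ctx.connect()
    logger.info("connected to room")

    # Step 3: only now touch the optional plugin, guarded by try/except.
    plugin = load_optional_plugin(plugin_module)
    logger.info("noise cancellation available: %s", plugin is not None)
    return plugin
```

One caveat: a try/except only catches Python exceptions. If a native library takes down the whole process, nothing gets logged from here, which is why removing the plugin entirely (step 2) is still the cleaner diagnostic.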

When concurrency keeps increasing without sessions closing, that’s usually a sign that the entrypoint is crashing or hanging before the room lifecycle begins.

Curious if removing the noise cancellation changes the behavior.


God bless you kind stranger.

I removed the noise cancellation plugin, which allowed the agent to run and let me see the console logs.

I then upgraded livekit to the latest version and re-added the noise cancellation plugin, and now it works perfectly.