LiveKit Agent Dispatch issue, hosted on LiveKit Cloud

Hey team
I’ve been running a deployed agent on LiveKit Cloud for several months without issues. About 2–3 weeks ago it stopped working completely, and after extensive debugging I can confirm it’s not a code or config problem.
Details:
Agent ID: CA_kdH3VYVKtzsr
Project ID: p_5wnpagm7zno
Region: us-east
Plan: Build (free)
Agent: Python SDK livekit-agents==1.5.1, explicit dispatch with agent_name="outbound-caller"

Dispatched from: .NET backend via AgentDispatchServiceClient
What's happening:
- lk agent status always shows Sleeping with CPU: 0m, Mem: 0
- AgentDispatchServiceClient.CreateDispatch() returns a valid AD_ ID with no errors
- The agent never joins the room — the dispatch is orphaned
- lk agent restart returns "Restarted", but the status is back to Sleeping immediately after
- Runtime logs are inaccessible: "The agent has shut down due to inactivity. It will automatically start again when a new session begins." — but it never does
- Build logs are completely clean — all 9 Docker steps succeed, no errors
Confirmed NOT the issue:
- All secrets present and valid (OPENAI_API_KEY, DEEPGRAM_API_KEY, ASSEMBLYAI_API_KEY, SIP_OUTBOUND_TRUNK_ID, AWS keys)
- LIVEKIT_URL/API_KEY/SECRET are project-level — not needed in secrets
- Free plan quota not exceeded
- Room stays open for several minutes after dispatch (not an emptyTimeout issue)
- Dispatch API works correctly — valid AD_ IDs returned every time
Self-hosted works perfectly:
When I run the exact same agent code locally with python agent.py dev pointing to the same LiveKit Cloud project, it works flawlessly — registers as a worker, receives the dispatch, joins the room, completes the full session including DTMF, STT, TTS, S3 upload. Zero issues.
So the code is correct. The cloud deployment of the same code is broken.
My conclusion: The deployment CA_kdH3VYVKtzsr appears stuck in a broken infrastructure state where the cold start mechanism silently fails — the container is triggered to wake up on dispatch but never actually boots, never registers as a worker, and the pending dispatch expires with nobody to claim it.
This started happening ~2–3 weeks ago with no changes on my end.
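To make the suspected failure mode concrete, here is a toy model of explicit dispatch in plain Python. This is purely illustrative — it is not LiveKit's actual implementation, and the class and method names are invented for the sketch. The point it captures: creating a dispatch always succeeds and returns an ID, but the dispatch is only ever claimed if a worker registered under the matching agent_name actually exists.

```python
# Toy model of explicit agent dispatch -- illustrative only, NOT LiveKit
# internals. Workers register under an agent_name; a dispatch targets one
# name and sits orphaned if no matching worker ever registers.

class DispatchBoard:
    def __init__(self):
        self.workers = {}      # agent_name -> worker id
        self.dispatches = []   # (dispatch_id, agent_name)

    def register_worker(self, agent_name, worker_id):
        self.workers[agent_name] = worker_id

    def create_dispatch(self, dispatch_id, agent_name):
        # Mirrors the observed CreateDispatch() behavior: always returns
        # an ID, even if no worker is registered for that agent_name yet.
        self.dispatches.append((dispatch_id, agent_name))
        return dispatch_id

    def claim(self, dispatch_id):
        # A dispatch is claimed only if a worker with the target name exists.
        for d_id, name in self.dispatches:
            if d_id == dispatch_id and name in self.workers:
                return self.workers[name]
        return None  # orphaned: nobody to claim it


board = DispatchBoard()

# Cloud case: the container never boots, so no worker registers.
ad = board.create_dispatch("AD_example123", "outbound-caller")
print(board.claim(ad))   # None -> dispatch is orphaned

# Local case: running the agent in dev mode registers a worker first.
board.register_worker("outbound-caller", "W_local")
print(board.claim(ad))   # W_local -> dispatch is claimed
```

This matches both observations above: the dispatch API "succeeds" either way, and only the presence of a registered worker (which the self-hosted run provides and the broken cloud deployment does not) changes the outcome.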
Questions:
- Can you investigate what's happening with this specific deployment on your infrastructure?
- Is there a known issue with cold start + explicit dispatch on the Build plan recently?
- Would deleting and redeploying fix this, or is it a broader platform issue?
Thanks :pray:

I don't see any similar reports that would indicate a broader platform issue, but I notice you rebuilt the agent today and are seeing some sessions with the agent joining. Is this still an issue?

Agent: Python SDK livekit-agents==1.5.1, explicit dispatch with agent_name="outbound-caller"

Just to clarify: the agent with ID CA_kdH3VYVKtzsr has no name if you look at the analytics view in the LiveKit Cloud dashboard, but if you look at the dashboard's list of agents I DO see outbound-caller, so quite possibly something did get confused, and deleting/recreating the agent will resolve your issue.

I did redeploy yesterday, and the initial status of the agent was "Running", as it should be after deployment. While the status was Running, agent dispatch worked, but as soon as the status went to Pending, dispatch stopped working. Then this morning it worked again, so I can't figure out what the problem is. Soon we're planning to move to a paid plan: on the Ship plan and above, the worker stays running 24/7, correct?
Anyway, for now the problem is resolved. Thanks for the help @darryncampbell.

The best and most reliable indicator of agent state is to use the LiveKit CLI, and you will see the different statuses your agent can return here: Agent commands | LiveKit Documentation

The dashboard will amalgamate these statuses into either ‘running’, ‘pending’ or ‘error’.

Running corresponds to the running state at Agent commands | LiveKit Documentation

Error corresponds to the error states at Agent commands | LiveKit Documentation

Pending corresponds to anything else.
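The amalgamation described above can be sketched as a tiny mapping function. This is a sketch of the described behavior, not dashboard source code, and the exact CLI state strings other than "running" are assumptions here ("sleeping" is the one documented example from this thread):

```python
# Sketch of how the dashboard amalgamates CLI agent states into three
# buckets, per the mapping described above. State-string spellings other
# than "running" are assumptions for illustration.

def dashboard_status(cli_state: str) -> str:
    state = cli_state.lower()
    if state == "running":
        return "running"
    if state.startswith("error"):   # any of the CLI error states
        return "error"
    return "pending"                # everything else, e.g. "sleeping"

print(dashboard_status("Running"))   # running
print(dashboard_status("Sleeping"))  # pending
```

So a dashboard "Pending" does not distinguish a Sleeping (scaled-to-zero) agent from any other non-running, non-error state, which is why the CLI is the more reliable indicator.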

Soon we're planning to move on to the paid plan, on Ship plan and above the worker stays running 24/7, correct?

You were seeing Pending because your agent was in the Sleeping state since it had scaled down to 0 active instances, as documented here: Agent commands | LiveKit Documentation. So, yes, after you upgrade to the Ship plan I would not expect to see this.