When I use Docker, the agent fails to join after the second time

Hey folks,

I am running a self-hosted agent in Docker and it starts failing to join after the first time. Why could that be? If I run the same code using make, it works fine.

Best

Hi, you would need to check the agent logs for any useful information: Where to find your Agent Logs | LiveKit. Note that you can’t use the ‘logs’ tab in agent observability for this, as the session is never started.

You might also find this guide helpful: Python Agents: Handling exceptions in your entrypoint | LiveKit

Hi there! These are the only logs I’m seeing:

{"message": "received job request", "level": "INFO", "name": "livekit.agents", "job_id": "AJ_oiMf3BMKNcvY", "dispatch_id": "AD_wLaTiD6Xazz2", "room": "session_f8189352-03b5-4dd2-956f-b8bbfd37cef1", "room_id": "RM_toziYGnUUkWe", "agent_name": "agent", "resuming": false, "enable_recording": false, "timestamp": "2026-04-06T09:41:53.771468+00:00"}
{"message": "initializing process", "level": "INFO", "name": "livekit.agents", "pid": 250, "timestamp": "2026-04-06T09:41:53.978510+00:00"}
{"message": "skipping user input, speech scheduling is paused", "level": "WARNING", "name": "livekit.agents", "user_input": "", "pid": 67, "job_id": "AJ_oiMf3BMKNcvY", "room_id": "RM_toziYGnUUkWe", "timestamp": "2026-04-06T09:41:56.088817+00:00"}
{"message": "process exiting", "level": "INFO", "name": "livekit.agents", "reason": "", "pid": 67, "job_id": "AJ_oiMf3BMKNcvY", "room_id": "RM_toziYGnUUkWe", "timestamp": "2026-04-06T09:41:56.092032+00:00"}
{"message": "process initialized", "level": "INFO", "name": "livekit.agents", "pid": 250, "elapsed_time": 2.18, "timestamp": "2026-04-06T09:41:56.158420+00:00"}

Your logs show the job is accepted and a worker process starts, but then you see “process exiting” almost immediately. In LiveKit, each job runs in a separate process and it ends when your entrypoint returns or the session shuts down, as described in the Job lifecycle.

This usually means:

  • Your entrypoint finishes without blocking (e.g., no await ctx.connect() or no wait_for_participant()).

  • An exception is raised early and the process exits.

  • You explicitly call shutdown() or close the session.

When running via make, you may be using a different startup mode or env vars than in Docker. Are you calling await ctx.connect() and then waiting for a participant before returning from the entrypoint?
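To make that failure mode concrete, here is a runnable sketch of the blocking pattern. The FakeJobContext class is a placeholder I’m inventing to stand in for the SDK’s JobContext so the control flow runs on its own; in a real agent the calls would be the SDK’s own await ctx.connect() and await ctx.wait_for_participant():

```python
import asyncio

# Sketch of the lifecycle described above: the job process lives only as long
# as the entrypoint coroutine, so the entrypoint must block until the session
# is really over. FakeJobContext is a stand-in, NOT the real LiveKit class.

class FakeJobContext:
    """Placeholder for livekit.agents.JobContext (an assumption for this sketch)."""

    async def connect(self) -> None:
        pass  # real code: joins the LiveKit room

    async def wait_for_participant(self) -> str:
        await asyncio.sleep(0)  # real code: blocks until a participant joins
        return "participant-1"

async def entrypoint(ctx: FakeJobContext) -> str:
    await ctx.connect()
    participant = await ctx.wait_for_participant()
    # ... run the session here; the worker logs "process exiting" as soon as
    # this coroutine returns, which matches the logs pasted above.
    return participant

if __name__ == "__main__":
    print(asyncio.run(entrypoint(FakeJobContext())))  # prints: participant-1
```

If the entrypoint returns (or raises) before either of those awaits, the process exits almost immediately, which is exactly the two-log-line gap shown above.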

I use a one-shot agent and need to make the agent leave the room. I call session shutdown and job-context shutdown to do this. Yet when I redispatch the agent, it does not wait for the job. Is that expected?

The Makefile uses dev mode and Docker uses start (prod) mode. The rest is the same.

Are you trying to redispatch the agent to the same room? In your use case, if you have a single participant–agent interaction, it should take place in a new room each time.

Yes, we dispatch to the same room. We have many participants, and they listen to the speech from the agent and the talking user. The interaction is one-to-one, but the whole room listens, as required.

I see.

If it were me, I’d look at two things next:

  • Align the start mode between the Makefile and Docker to see if that is the issue (though, to be honest, I don’t expect it is)
  • Enable debug logs to see if there is any more information during agent shutdown (Where to find your Agent Logs | LiveKit), and add additional logging to your entrypoint
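For the second point, a minimal sketch of turning up verbosity with Python’s standard logging module (the logger name livekit.agents is taken from the JSON logs earlier in the thread; whether the SDK also exposes a dedicated CLI flag for this is something to verify in the docs linked above):

```python
import logging

# Raise verbosity for the whole process and for the "livekit.agents" logger
# specifically; run this before the worker starts so shutdown-time messages
# are emitted at DEBUG level.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("livekit.agents").setLevel(logging.DEBUG)
```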

The dev startup works correctly in Docker, but start (prod) fails. There are no useful logs on debug either. The second dispatch fails unless we wait 2–3 minutes before re-adding the agent.

I’m taking a closer look at RM_toziYGnUUkWe, since you said that’s where the issue was presenting itself.

If I look at the event log at the bottom of your session analytics (on the dashboard), I only see a single participant in the whole session: 81d03cc3-ae65-4fb4-b095-4dc8a4e3b2ea.

Regardless, though, the logs show that the agent itself is leaving the call (client_initiated), and I don’t see anything more revealing in the server logs for this session; it looks like the agent itself is leaving the room. Dev mode is designed for development and behaves slightly differently (Server startup modes | LiveKit Documentation), so perhaps running in dev mode is masking some underlying issue.

I don’t really understand why you are re-dispatching an agent back to the same room. Why not just keep the agent in that room? That would avoid the issue entirely :slight_smile:

Those are development branches that I create locally to match the production use case. It was just designed to be a one-shot agent.

My understanding of a “one-shot agent” would be to have a separate room for each interaction with the agent, not to have the participants and agent try to join and reuse the same room.