LiveKit server takes a long time to cool down

As part of testing, I ran 25 agents in parallel. After the sessions completed, the server was still using 2 cores for 10 minutes.

@ABHIRAM_SAI_GANESH_VARREY, that's too thin to diagnose, but there are two known causes of “post-session CPU stays high” on the agents worker:

  • MCP cleanup leak (livekit/agents issue #5212, “AgentSession.aclose() leaves the event loop hot when adding MCPs”). Fixed by PR #5223, merged ~2026-03-25. If you’re using MCP tools on an older livekit-agents, upgrade first. With num_idle_processes=0 the leak is masked because the child process is torn down entirely.
  • Prewarmed workers holding model state. Silero VAD and the turn-detector model stay resident in idle prewarm processes for fast job pickup. Those models multiplied across 25 prewarmed processes is real load even between sessions. num_idle_processes (the default keeps several processes warm) is the lever.

It would help to share: your livekit-agents version, whether MCPs are in play, and a py-spy dump --pid <pid> of one of the busy processes. That distinguishes the two causes quickly.
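To pick the <pid> for py-spy, it helps to first see which processes are actually burning CPU. A minimal Python sketch, assuming a Linux or macOS host where ps -eo pid,pcpu,args is available; the helper names and the "agent.py" match string are hypothetical, so substitute whatever appears in your own command line:

```python
import subprocess


def parse_ps(output):
    """Parse `ps -eo pid,pcpu,args` output into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # first line is the header
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, args = parts
        try:
            rows.append((int(pid), float(pcpu), args))
        except ValueError:
            continue  # skip rows that do not parse as numbers
    return rows


def busiest(rows, needle, limit=5):
    """Return up to `limit` rows whose command contains `needle`, highest CPU first."""
    matches = [row for row in rows if needle in row[2]]
    return sorted(matches, key=lambda row: row[1], reverse=True)[:limit]


def snapshot():
    """Run ps once and return its raw text output."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Example use (the needle is an assumption about how the worker was launched):
# for pid, cpu, cmd in busiest(parse_ps(snapshot()), "agent.py"):
#     print(pid, cpu, cmd)
```

Note that on Linux, ps reports %CPU averaged over the process lifetime, so a process that only recently became busy can be understated; top gives a more current figure. The same filter with "livekit-server" as the match string shows whether the load is on the server or on the worker.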

Hey, it’s the post-session CPU of the LiveKit server, not the worker.

@ABHIRAM_SAI_GANESH_VARREY, apologies, I completely misread “server” as the agents worker. The SFU side is a different surface. Three candidates for “2 cores for 10 minutes”:

  • empty_timeout on rooms. The default is typically 300s in self-hosted deployments (livekit.yaml), but if you’ve set it higher (or set it per room via RoomConfiguration.empty_timeout), the SFU keeps rooms alive until expiry. 25 rooms × N minutes = sustained bookkeeping.
  • Webhook delivery retries. If room_finished / participant_left webhooks are configured and your endpoint is slow or timing out, the SFU retries with backoff. 25 concurrent teardowns × retries = real CPU. Check your webhook endpoint latency.
  • Egress finalization spillover. If any session used egress, the egress process may still be finalizing the MP4 and uploading it after the session ends. Check lk egress list for active jobs in that window.
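On the webhook candidate, the usual way to keep endpoint latency low is to acknowledge the delivery right away and do any slow work afterwards. A minimal stdlib-only Python sketch of that pattern; the handler, queue, and function names are all hypothetical, and it deliberately skips verifying the signed Authorization header that a real LiveKit webhook receiver should check:

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

events = queue.Queue()   # raw webhook bodies waiting to be processed
processed = []           # event names seen so far (for illustration only)


def handle_event(event):
    """Placeholder for slow work (DB writes, billing, notifications)."""
    processed.append(event.get("event"))


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        events.put(body)           # hand off; no slow work on the request path
        self.send_response(200)    # acknowledge immediately
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def process_forever():
    """Drain the queue in the background so the HTTP handler never blocks."""
    while True:
        body = events.get()
        try:
            handle_event(json.loads(body))
        except (ValueError, AttributeError):
            pass  # malformed or non-object payload; a real service would log this
        finally:
            events.task_done()


def start(port=0):
    """Start the receiver on 127.0.0.1; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), WebhookHandler)
    threading.Thread(target=process_forever, daemon=True).start()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

If your endpoint already responds in a few milliseconds, this candidate is unlikely and the other two are worth checking first.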

A pprof CPU profile of livekit-server (/debug/pprof/profile) would pinpoint which subsystem is hot.
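If the pprof endpoint is reachable in your deployment, capturing a profile during the 10-minute tail could look like the sketch below. It relies on the standard Go net/http/pprof path and its seconds query parameter; the address http://127.0.0.1:6060 is purely a placeholder (where, or whether, your livekit-server exposes pprof depends on your config), and the function names are hypothetical:

```python
import urllib.parse
import urllib.request


def pprof_profile_url(base_url, seconds=30):
    """Build the CPU-profile URL used by Go's net/http/pprof handler."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    query = urllib.parse.urlencode({"seconds": seconds})
    return f"{base_url.rstrip('/')}/debug/pprof/profile?{query}"


def fetch_cpu_profile(base_url, out_path, seconds=30):
    """Download a CPU profile to out_path and return the number of bytes written."""
    url = pprof_profile_url(base_url, seconds)
    # The server samples for roughly `seconds` before it responds,
    # so the client timeout has to be longer than the sampling window.
    with urllib.request.urlopen(url, timeout=seconds + 15) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)


# Example use (placeholder address, assumed rather than taken from LiveKit docs):
# fetch_cpu_profile("http://127.0.0.1:6060", "livekit-cpu.pprof", seconds=30)
```

The saved file can then be inspected with go tool pprof -top livekit-cpu.pprof to see which functions dominate.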