Increase the maximum number of concurrent calls

Muhammad_Ravi · March 26, 2026, 5:10am

What is causing the spike in my call processing?

I’m self-hosting LiveKit for development, and currently I can only handle around 15 concurrent calls. However, my CPU usage is only about 40%, and memory usage is still low.

The issue is that telephony stops dispatching new calls when there’s a spike in load.

I’ve already moved my model initialization into a prewarm phase, so I don’t think cold starts are the problem.

I’m trying to understand what is causing this spike and limiting my concurrency despite low resource usage.

I've attached a performance screenshot and a code snippet below:

Performance

Code Snippet

from typing import Any

import psutil


def compute_load(agent_server: Any) -> float:
    """Return a load value in [0, 1]; higher means less capacity for new jobs."""
    # ---- HARD MEMORY LIMIT (SAFETY ONLY) ----
    mem_used_gb = psutil.virtual_memory().used / (1024**3)
    if mem_used_gb >= AGENT_MAX_MEMORY_GB:
        return 0.95

    # ---- PRIMARY: CCR (active jobs / configured maximum) ----
    ccr = len(agent_server.active_jobs) / AGENT_MAX_CONCURRENT_CALLS
    pressures: list[float] = [min(ccr, 1.0)]

    # ---- CPU (LiveKit default moving average) ----
    try:
        cpu_percent = _DefaultLoadCalc.get_load(agent_server)
    except Exception:
        # CPU reading unavailable: fall back to CCR only.
        cpu_percent = None

    if cpu_percent is not None:
        if cpu_percent >= AGENT_MAX_CPU_PERCENT:
            return 0.9
        pressures.append(min(cpu_percent / AGENT_MAX_CPU_PERCENT, 1.0))

    # Overall load is the pressure of the most constrained resource.
    return max(pressures)

def prewarm(proc: JobProcess) -> None:
    otel_manager = OTELManager()
    otel_manager.initialize()

    rest = RestClient()
    gql = GraphQLClient()
    intent_clf = joblib.load(intent_model_path)
    proc.userdata["vad"] = silero.VAD.load()
    proc.userdata["nc_models"] = {
        "telephony": nc.BVCTelephony(),
        "bvc": nc.BVC(),
        "nc": nc.NC(),
    }
    setup_grpc()

CWilson · March 26, 2026, 11:53am

Are you self-hosting only the agent, or are you self-hosting the server too?

If you are using LiveKit Cloud, it would be helpful to get some session IDs so I can look at the backend logs and see whether anything went wrong there.

If the server is also self-hosted, check your server logs and see if you can find what may have happened.
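
One thing that may be worth checking alongside the logs is how the CCR term in compute_load from the original post interacts with the load threshold the worker is configured with. The sketch below is only the arithmetic, under stated assumptions: that the worker stops accepting jobs once the reported load is greater than or equal to a threshold, and that the limit of 20 and the threshold of 0.75 are illustrative values, not the poster's actual settings. The function name calls_until_unavailable is hypothetical.

```python
def calls_until_unavailable(max_concurrent_calls: int, load_threshold: float) -> int:
    """Smallest number of active jobs at which the CCR term reaches the threshold.

    Assumption: the worker is treated as unavailable once load >= load_threshold.
    """
    for active in range(max_concurrent_calls + 1):
        # Same CCR formula as compute_load: active jobs / configured maximum.
        ccr = min(active / max_concurrent_calls, 1.0)
        if ccr >= load_threshold:
            return active
    # A threshold above 1.0 is never reached by the CCR term alone.
    return max_concurrent_calls


if __name__ == "__main__":
    # Illustrative values only (assumptions, not taken from the thread).
    print(calls_until_unavailable(max_concurrent_calls=20, load_threshold=0.75))
```

If the configured threshold is below 1.0, the CCR term alone can mark the worker as unavailable well before AGENT_MAX_CONCURRENT_CALLS is reached, even when CPU and memory are low. The early returns of 0.9 and 0.95 in compute_load would have the same effect for any threshold at or below those values.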
