Increase the maximum number of concurrent calls

What is causing the spike in my call processing?

I’m self-hosting LiveKit for development, and currently I can only handle around 15 concurrent calls. However, my CPU usage is only about 40%, and memory usage is still low.

The issue is that telephony stops dispatching new calls when there’s a spike in load.

I’ve already moved my model initialization into a prewarm phase, so I don’t think cold starts are the problem.

I’m trying to understand what is causing this spike and limiting my concurrency despite low resource usage.

I've attached my performance metrics and a code snippet:

Performance

Code Snippet

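import psutil
from typing import Any

# AGENT_MAX_MEMORY_GB, AGENT_MAX_CONCURRENT_CALLS, AGENT_MAX_CPU_PERCENT and
# _DefaultLoadCalc are defined or imported elsewhere in the module.

def compute_load(agent_server: Any) -> float:
    # ---- HARD MEMORY LIMIT (SAFETY ONLY) ----
    mem_used_gb = psutil.virtual_memory().used / (1024**3)
    if mem_used_gb >= AGENT_MAX_MEMORY_GB:
        return 0.95

    pressures: list[float] = []

    # ---- PRIMARY: CCR (active jobs as a fraction of the configured cap) ----
    ccr = len(agent_server.active_jobs) / AGENT_MAX_CONCURRENT_CALLS
    pressures.append(min(ccr, 1.0))

    # ---- CPU (LiveKit default moving average) ----
    try:
        cpu_percent = _DefaultLoadCalc.get_load(agent_server)
    except Exception:
        cpu_percent = None  # CPU reading unavailable; fall back to CCR only

    if cpu_percent is not None:
        if cpu_percent >= AGENT_MAX_CPU_PERCENT:
            return 0.9
        # Assumes cpu_percent and AGENT_MAX_CPU_PERCENT use the same scale
        pressures.append(min(cpu_percent / AGENT_MAX_CPU_PERCENT, 1.0))

    # Overall load is the highest individual pressure
    return max(pressures)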

def prewarm(proc: JobProcess) -> None:
    # Telemetry setup
    otel_manager = OTELManager()
    otel_manager.initialize()

    # API clients and the intent classifier
    rest = RestClient()
    gql = GraphQLClient()
    intent_clf = joblib.load(intent_model_path)

    # Models shared with jobs through proc.userdata
    proc.userdata["vad"] = silero.VAD.load()
    proc.userdata["nc_models"] = {
        "telephony": nc.BVCTelephony(),
        "bvc": nc.BVC(),
        "nc": nc.NC(),
    }
    setup_grpc()

Are you self-hosting only the agent, or the LiveKit server as well?

If you are using LiveKit Cloud, it would help to share some session IDs so I can check the backend logs for anything that went wrong there.

If the server is self-hosted as well, check your server logs to see what happened around the time dispatching stopped.
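
While you check the logs, it may also help to record which term of your load function is dominating. Below is a minimal sketch, assuming the same "maximum of pressures" logic as your compute_load. The names LoadBreakdown and breakdown are hypothetical helpers, and every number in the example (a cap of 15 calls, 40% CPU, an 80% CPU limit) is an illustrative assumption rather than a value taken from your setup.

```python
from dataclasses import dataclass


@dataclass
class LoadBreakdown:
    """Per-term view of a max-of-pressures load value (hypothetical helper)."""

    ccr_pressure: float
    cpu_pressure: float

    @property
    def load(self) -> float:
        # Same rule as compute_load: the highest pressure wins.
        return max(self.ccr_pressure, self.cpu_pressure)

    @property
    def dominant(self) -> str:
        # Name of the term that determines the reported load.
        return "ccr" if self.ccr_pressure >= self.cpu_pressure else "cpu"


def breakdown(active_jobs: int, max_calls: int, cpu: float, max_cpu: float) -> LoadBreakdown:
    """Compute each pressure separately, clamped to 1.0 as in compute_load."""
    return LoadBreakdown(
        ccr_pressure=min(active_jobs / max_calls, 1.0),
        cpu_pressure=min(cpu / max_cpu, 1.0),
    )


# Assumed values for illustration only: 15 active calls against a cap of 15,
# with CPU at 40% of an 80% limit.
b = breakdown(active_jobs=15, max_calls=15, cpu=40.0, max_cpu=80.0)
print(f"load={b.load:.2f} dominant={b.dominant}")  # load=1.00 dominant=ccr
```

If the call-count term reaches 1.0 at around 15 jobs in your logs, the worker would report itself as fully loaded regardless of CPU or memory, which would be consistent with dispatch stopping at 40% CPU. In that case the configured call cap, rather than the hardware, would be the first thing to look at.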