How Does LiveKit Route Agent Jobs Across Multiple EC2 Instances and Support Autoscaling?

Hi everyone,

I currently have a LiveKit Agent running on a single AWS EC2 instance. I’m planning to perform load testing to understand how the system behaves when a large number of users connect simultaneously.

My goal is to horizontally scale the agent service by running multiple EC2 instances (for example, 3–4 instances) with the same LiveKit configuration and credentials.

I have a few questions:

  1. If multiple agent instances are running, how are incoming requests or agent jobs distributed among them?

  2. Does LiveKit automatically route jobs to an available/free agent instance, or is additional configuration required?

  3. What is the recommended architecture for autoscaling LiveKit Agents on AWS based on load (CPU, memory, active sessions, etc.)?

Any guidance or best practices for load testing and autoscaling LiveKit Agents would be greatly appreciated.

Thanks!

Hi, the relevant page in the docs is Self-hosted deployments | LiveKit Documentation, specifically the load balancing part and the sections below it. The docs explain load balancing better than I can, but the most important bit is to define load_fnc and load_threshold correctly for your environment.
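To make the load_fnc / load_threshold point a bit more concrete, here is a rough sketch of a session-based load function. As I understand it, load_fnc should return a number between 0 and 1, and a worker whose load goes past load_threshold stops being offered new jobs. SessionTracker, max_sessions and accepts_new_jobs are names I made up for illustration, the value of 20 sessions per instance is a placeholder you would replace with what your load test shows, and the exact comparison at the threshold is my assumption, so please check it against the docs for your version.

```python
from dataclasses import dataclass


@dataclass
class SessionTracker:
    """Hypothetical helper that counts the sessions this worker is handling."""

    max_sessions: int  # assumed per-instance capacity; measure it with load testing
    active: int = 0

    def load(self) -> float:
        """Return the current load as a fraction clamped to [0.0, 1.0]."""
        if self.max_sessions <= 0:
            return 1.0  # treat a misconfigured capacity as "full"
        return min(1.0, max(0.0, self.active / self.max_sessions))


def accepts_new_jobs(load: float, load_threshold: float) -> bool:
    """Assumed availability rule: the worker takes jobs while below the threshold."""
    return load < load_threshold


# Placeholder capacity of 20 concurrent sessions per EC2 instance.
tracker = SessionTracker(max_sessions=20)

if __name__ == "__main__":
    tracker.active = 5
    print(tracker.load(), accepts_new_jobs(tracker.load(), 0.7))

# Wiring it into the agent would look roughly like this (not run here,
# it needs the livekit-agents package and your own entrypoint function):
#
#   from livekit.agents import WorkerOptions, cli
#   cli.run_app(WorkerOptions(
#       entrypoint_fnc=entrypoint,
#       load_fnc=lambda *_: tracker.load(),
#       load_threshold=0.7,
#   ))
```

You would increment tracker.active when a job starts and decrement it when the session ends. If CPU is the real bottleneck on your instances, a CPU-based load function may be a better fit than counting sessions; the load test should tell you which one saturates first.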