Best way to scale LiveKit Egress for recordings (private meetings + livestream platform)?

Hi everyone,

I’m building a live streaming + private meeting platform and looking for some architecture advice around scaling LiveKit egress.

Current stack

  • Angular frontend

  • Flutter Mobile

  • .NET backend

  • Self-hosted LiveKit server running on an Ubuntu EC2 instance

  • Redis for coordination

  • AWS infrastructure (EC2 / containerized services)

Recording use cases

The platform supports two types of sessions:

  1. Private meetings → using RoomComposite Egress

  2. Livestream classes → using Participant Egress (record only instructor)

Recording is optional and triggered by the instructor, so demand can vary a lot. For example, multiple instructors might start recordings at the same time.

The problem I’m trying to solve

Right now I haven’t implemented autoscaling yet, and I’m trying to design the right architecture before moving forward.

My concern is handling situations where many recordings start at once. Since egress workers handle recording jobs, I want to avoid requests failing or timing out due to lack of capacity.

What I’m trying to achieve

Ideally the system should:

  • Scale egress workers automatically when recording demand increases

  • Scale down when idle to save infrastructure cost

  • Handle bursts where many recordings start simultaneously

  • Support both RoomComposite and Participant egress jobs efficiently

Questions

For anyone running LiveKit in production:

  1. What is the recommended way to scale LiveKit egress workers?

  2. Should scaling be based on:

    • CPU usage

    • number of active recordings

    • pending egress jobs

    • pipelines per worker

  3. Has anyone implemented autoscaling egress workers successfully on AWS (ECS / EC2 / Kubernetes)?

  4. If LiveKit server load increases (many rooms), how do you typically scale the LiveKit media servers alongside egress workers?

I’m still in the architecture design stage, so any suggestions, reference architectures, or lessons learned would be really helpful.

Thanks!

Hi, we have implemented autoscaling for the entire stack. LiveKit has good instrumentation and architecture for this.

First, size the servers and determine the peak workload for the hardware you are using. Then use that metric to autoscale the cluster, keeping some headroom depending on the concurrency you expect. Egress exposes `livekit_egress_requests`, which you can use to determine how many active egress jobs are running on an instance.
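The scaling math above can be sketched as a small pure function. This is illustrative only: the function name, the `headroom` parameter, and the idea of a per-worker job limit are assumptions you'd calibrate from your own load tests, with `active_egresses` coming from however you aggregate `livekit_egress_requests` (e.g. a Prometheus query).

```python
import math

def desired_workers(active_egresses: int, max_per_worker: int, headroom: int = 1) -> int:
    """Compute how many egress workers to run for the current load.

    active_egresses: total egress jobs currently running across the cluster
    max_per_worker:  peak concurrent jobs one worker handled acceptably in load tests
    headroom:        extra idle workers kept warm to absorb sudden bursts
    """
    if max_per_worker <= 0:
        raise ValueError("max_per_worker must be positive")
    needed = math.ceil(active_egresses / max_per_worker)
    return needed + headroom

# e.g. 7 active recordings at 3 jobs/worker with 1 spare -> 4 workers
print(desired_workers(7, 3, headroom=1))
```

The headroom term matters because egress workers take time to boot; keeping a warm spare is what lets simultaneous "start recording" clicks succeed while the autoscaler catches up.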

To make it even more resilient, add retry logic on the client: wait and keep retrying until the egress request is accepted.
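A minimal sketch of that retry loop, assuming `start_fn` is whatever your backend uses to issue the egress request (a hypothetical placeholder here, not a LiveKit SDK call) and that it raises on failure:

```python
import random
import time

def start_egress_with_retry(start_fn, max_attempts: int = 5, base_delay: float = 1.0):
    """Call start_fn() until it succeeds, with exponential backoff and jitter.

    start_fn:     zero-argument callable that issues the egress request and
                  raises if the request is rejected (e.g. no worker capacity)
    max_attempts: give up and re-raise after this many failures
    base_delay:   first backoff interval in seconds; doubles each attempt
    """
    for attempt in range(max_attempts):
        try:
            return start_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the failure to the caller
            # Jitter spreads out retries so bursts don't all hit at once.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

The backoff window is what buys the autoscaler time: a request rejected during a burst succeeds a few seconds later once a new worker comes up, instead of surfacing an error to the instructor.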

Do the same for the LiveKit media servers: determine how many rooms each can accommodate, and autoscale accordingly.