Autoscaling Strategy for Self-Hosted LiveKit Egress Workers in a Real-Time Streaming Platform

Hi Team,

We are building a real-time streaming and transcription platform using a self-hosted LiveKit stack. Our architecture currently includes:

  • LiveKit Server (self-hosted)

  • Egress service for recording/streaming outputs

  • Transcription agent for real-time speech-to-text

  • Redis for coordination

  • Containerized deployment (ECS/Fargate or a Kubernetes-style cluster)

Current Problem

In our current setup:

  • Each egress request is assigned to a single egress worker.

  • If no worker is available, the request stays pending and eventually times out.

  • During high traffic (many rooms starting recordings simultaneously), we see egress capacity bottlenecks.

So effectively:

Room Recording Request
      ↓
LiveKit Server
      ↓
Egress Service
      ↓
Available Worker ?
   YES → Start recording
   NO  → Request timeout

Goal

We want to design a reliable autoscaling strategy so that:

  1. Egress workers scale automatically based on demand.

  2. Recording requests do not time out during bursts.

  3. Workers scale down when idle to save cost.

Questions

  1. What is the recommended autoscaling strategy for self-hosted LiveKit egress clusters?

  2. Should autoscaling be based on one of the following?

    • CPU usage

    • Memory usage

    • Number of pending egress jobs

    • Active room recordings

  3. Is there a way to queue egress jobs when workers are unavailable instead of failing immediately?

  4. Has anyone implemented horizontal autoscaling for egress workers successfully (Kubernetes / ECS)?

  5. Any recommended metrics to monitor for egress scaling (e.g., active pipelines, ffmpeg processes, Redis state)?

Use the metrics exposed by egress to identify how many jobs are running (it's exposed as livekit_egress_requests), and autoscale on that metric. On the client side, you can add retry logic that waits and keeps retrying until the egress request is accepted.

You should benchmark how many egress requests a single instance can handle, and use that as the autoscaling threshold.
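
Something like this, as a rough sketch (PROM_URL, ASG_NAME, and JOBS_PER_INSTANCE are placeholders; plug in your own Prometheus endpoint, ASG name, and benchmarked per-instance capacity):

```python
# Poll the livekit_egress_requests metric from Prometheus and set the
# desired capacity of the EC2 Auto Scaling Group running egress workers.
import math

import boto3
import requests

PROM_URL = "http://prometheus:9090"  # placeholder: your Prometheus endpoint
ASG_NAME = "livekit-egress-asg"      # placeholder: your ASG name
JOBS_PER_INSTANCE = 2                # from benchmarking a single instance
MIN_INSTANCES, MAX_INSTANCES = 1, 20

def active_egress_jobs() -> float:
    """Sum in-flight egress jobs across all workers."""
    resp = requests.get(
        f"{PROM_URL}/api/v1/query",
        params={"query": "sum(livekit_egress_requests)"},
        timeout=5,
    )
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return float(result[0]["value"][1]) if result else 0.0

def scale() -> None:
    desired = math.ceil(active_egress_jobs() / JOBS_PER_INSTANCE)
    desired = max(MIN_INSTANCES, min(MAX_INSTANCES, desired))
    boto3.client("autoscaling").set_desired_capacity(
        AutoScalingGroupName=ASG_NAME,
        DesiredCapacity=desired,
        HonorCooldown=True,  # respect the ASG cooldown to avoid thrashing
    )

if __name__ == "__main__":
    scale()
```

On Kubernetes, the equivalent would be an HPA on a custom metric or a KEDA scaler driven by the same Prometheus query.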

Hi @Raghu_Udiyar,

Is there any SDK-level support in LiveKit to implement retry-based logic for starting egress until a worker becomes available?

Currently our setup is:

  • LiveKit server (self-hosted)

  • LiveKit Egress running on AWS ECS (EC2 launch type)

  • EC2 instances behind an Auto Scaling Group

The issue we are seeing is:

When a new egress request arrives and no egress worker is immediately available, the request eventually times out. If the Auto Scaling Group then launches a new EC2 instance to run an egress worker, startup takes significant time (instance boot plus container startup), so the original request fails before the new capacity comes online.

What we are trying to understand:

  1. Does the LiveKit SDK provide built-in retry logic for egress start requests until a worker becomes available?

  2. Is there any recommended pattern for handling this scenario in production?

  3. Apart from keeping idle standby workers, are there any other strategies used by the community to handle burst egress workloads?

We are currently considering application-level retry with backoff, but wanted to check whether there is a recommended LiveKit-native solution.

Any suggestions or production patterns would be greatly appreciated.

Thanks!

I don't think the SDK provides it, but you should be able to wrap the call and retry, i.e. the application-level retry you mentioned. Otherwise, keeping sufficient headroom and autoscaling on the egress metrics is the way to go; that's what we use for scaling egress. There may also be room to optimise EC2 boot time (for example with smaller AMIs or an Auto Scaling warm pool).
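
For reference, a minimal sketch of that wrapper in Python. `start_egress` here is a stand-in for whichever server-SDK call you use to start the egress (e.g. a room composite start), and the retry budget is illustrative, not a recommendation:

```python
# Application-level retry with exponential backoff and jitter around the
# egress start call, so burst requests survive until new capacity is up.
import random
import time

def start_egress_with_retry(start_egress, max_attempts=6,
                            base_delay=2.0, max_delay=60.0):
    """Call start_egress() until it succeeds or the retry budget runs out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return start_egress()  # stand-in for your SDK's start-egress call
        except Exception:  # in practice, retry only "no worker available" errors
            if attempt == max_attempts:
                raise
            # Exponential backoff with jitter so simultaneous retries spread out
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(delay * random.uniform(0.5, 1.0))
```

The main sizing consideration is that the total retry window should cover your worst-case scale-up time (instance boot plus container startup), so a request that arrives during a burst is still pending when the new worker comes online.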