This question originally came up in our Slack community and the thread has been consolidated here for long-term reference.
I’m self-hosting LiveKit agents on AWS ECS Fargate. Everything works, except that when the service scales in and tasks are stopped, some active sessions get disrupted.
What’s the recommended workaround for this? Should I switch to ECS with EC2 launch type?
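Fargate is not recommended for LiveKit agents because it caps a task's stop timeout (stopTimeout) at 120 seconds. When ECS stops a task during scale-in, it sends SIGTERM and then force-kills the container once that timeout expires, so any session still running 2 minutes after the stop signal is terminated abruptly.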
Solution: Use ECS with EC2 launch type instead.
For EC2 launch type, configure the drain timeout:
- Set drain_timeout in your worker configuration (see "Server options" in the LiveKit documentation).
- Set stopTimeout in your ECS task definition to a value larger than drain_timeout.
This allows active sessions to complete before the instance terminates.
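The relationship between the two timeouts can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names, the container name, the image, and the 60-second margin are all hypothetical, and the 1800-second drain timeout is just an example value. It builds the relevant fragment of a container definition and checks it against Fargate's 120-second cap.

```python
import json

# Fargate caps stopTimeout at 120 seconds (the limit cited above).
FARGATE_MAX_STOP_TIMEOUT_S = 120


def build_container_definition(name, image, drain_timeout_s, margin_s=60):
    """Return a container-definition fragment with stopTimeout > drain_timeout.

    margin_s is an assumed safety buffer chosen for this sketch,
    not a LiveKit or AWS recommendation.
    """
    if drain_timeout_s <= 0 or margin_s <= 0:
        raise ValueError("drain_timeout_s and margin_s must be positive")
    return {
        "name": name,
        "image": image,
        # ECS waits this many seconds after SIGTERM before sending SIGKILL.
        "stopTimeout": drain_timeout_s + margin_s,
    }


def fits_on_fargate(container_definition):
    """True if the requested stopTimeout is within Fargate's cap."""
    return container_definition["stopTimeout"] <= FARGATE_MAX_STOP_TIMEOUT_S


if __name__ == "__main__":
    # Hypothetical values: a worker configured with drain_timeout = 1800 seconds.
    definition = build_container_definition(
        "livekit-agent", "example/agent:latest", drain_timeout_s=1800
    )
    print(json.dumps(definition, indent=2))
    print("fits on Fargate:", fits_on_fargate(definition))
```

With a 30-minute drain timeout the resulting stopTimeout is far above what Fargate accepts, which is why the EC2 launch type is suggested for long-running sessions.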
See this KB article for additional tips on AWS deployments: