Docker image size too large with Python LiveKit Agents (~2.7 GB) - is this normal for production?

I’ve been using LiveKit Agents for a while to build voice agents using:

  • Deepgram STT, Google LLM, Cartesia TTS, VAD, Turn Detector

I’m not using LiveKit Inference right now.

I build Docker images with the agent server registered and running, but my final Docker image size always ends up around 2.7 GB.

I’m trying to understand:

  • Is this normal for production-grade voice agents?
  • Are others also shipping containers this large in production?
  • Or am I doing something fundamentally wrong in my Docker setup?

I’d really appreciate guidance on:

  • Best practices for reducing Docker image size
  • Ideal/max recommended image size for production
  • Recommended architecture for handling 100+ calls daily.
  • Whether models/dependencies should be separated into different services
  • Any optimization tips for Python LiveKit agents

Right now, 2.7 GB feels too heavy to confidently deploy in production, so I’d love to hear how others are handling this in real-world deployments.

Thanks in advance for any help or recommendations.

Hey @darryncampbell, can you help?

I am not sure what the expected size is, but that sounds large to me for the image. Are you using a slim base image?

Have you already reviewed these docs?

@Apple_Intelligence 2.7 GB is normal-ish for a Python voice agent bundling Silero VAD + turn detector. The bloat is mostly PyTorch + torchaudio pulled in by those models (1-2 GB alone, even CPU-only), not your code.

Concrete cuts in impact order:

  • python:3.12-slim base. Drops ~900 MB versus full python:3.12.
  • Multi-stage build. Install in a builder stage, copy only the resolved venv into runtime. Strips pip cache, build tools, __pycache__.
  • CPU-only torch wheels. Install from PyTorch’s CPU index so you don’t pull CUDA libs you can’t use.
  • Prewarm models in the builder. Trigger silero.VAD.load() and the turn detector download once during build so they’re baked into the image.

Realistic target: 800 MB to 1.2 GB. Below that needs Alpine or stripping torchaudio.

Yeah, I used the base image from the docs, but it’s still close to 2 GB. I’m including the turn detector and Silero models in it, in the cache. I’ll review the docs again and go through it. Thanks.

Yeah, I tried it now with the Python 3.12 slim image, and it reduced the size to around 2 GB, but it’s still quite large. I’m also bundling the turn detector and Silero models in there so they’re cached and don’t need to be downloaded every time.

If possible, could you share any reference Dockerfile where you were able to achieve around a 1 GB image size? That would be really helpful.

Also, torch alone is taking up 322 MB:

Component   Size
torch       322 MB

Can I optimize this?

As a point of comparison, I tried building the agent starter (livekit-examples/agent-starter-python on GitHub), and it came out at around 1.25 GB. I’m not an expert here, but I suggest comparing your agent to that baseline and seeing where the additional size is coming from.
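One way to do that comparison is to list the largest top-level entries in site-packages inside each image. The script below is a minimal sketch using only the standard library; the helper names (`dir_size`, `largest_entries`) are made up for illustration, and it assumes the interpreter you run it with is the one your agent uses (for example, the venv's Python inside the container).

```python
import os
import sysconfig


def dir_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def largest_entries(site_packages, top=15):
    """Return (name, size_in_bytes) for the largest top-level entries."""
    sizes = []
    for entry in os.scandir(site_packages):
        if entry.is_dir(follow_symlinks=False):
            size = dir_size(entry.path)
        elif entry.is_file(follow_symlinks=False):
            size = entry.stat().st_size
        else:
            continue
        sizes.append((entry.name, size))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]


if __name__ == "__main__":
    # "purelib" is the site-packages directory of the running interpreter.
    site_packages = sysconfig.get_paths()["purelib"]
    for name, size in largest_entries(site_packages):
        print(f"{size / 1_000_000:8.1f} MB  {name}")
```

Running it in both the starter image and your own image and comparing the two lists should show which packages account for the difference.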

@Apple_Intelligence, the starter darryn linked lands at ~1.25GB because of the multi-stage split: build tools and pip/uv cache stay in the builder; production copies only the resolved venv. Going from 2.7GB to 2GB on slim alone means you’re still missing that split. Reproduce the starter’s pattern and you should land near 1.25GB.

Correcting my earlier reply on torch: I rechecked main. livekit-plugins-silero declares onnxruntime only; livekit-plugins-turn-detector declares onnxruntime + transformers. Both use ONNX backends, not torch. So the 322MB of torch in your image is coming from something else in your dep tree. Run pip show torch or pipdeptree -r --packages torch in the built image to find the requirer. Dropping that dependency is your real path to sub-1GB.
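If pipdeptree isn't installed in the image, roughly the same lookup can be sketched with the standard library's importlib.metadata. This is only an illustration: `requirement_name` and `find_requirers` are hypothetical helper names, and the scan only sees direct requirements declared in each installed package's metadata (including ones gated behind extras, which show up with their marker).

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        return None
    # PEP 503 normalization: lowercase, runs of "-", "_", "." are equivalent.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def find_requirers(target):
    """Return sorted (distribution, requirement) pairs that declare target."""
    wanted = requirement_name(target)
    hits = []
    for dist in metadata.distributions():
        # dist.requires is None when a package declares no dependencies.
        for req in dist.requires or []:
            if requirement_name(req) == wanted:
                name = dist.metadata.get("Name") or "unknown"
                hits.append((name, req))
    return sorted(hits)


if __name__ == "__main__":
    for name, req in find_requirers("torch"):
        print(f"{name} requires: {req}")
```

Run it with the same interpreter the agent uses. An empty result would suggest torch was installed directly (for example, listed in your own requirements) rather than pulled in by another package.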

For 100+ calls/day, tune num_idle_processes and worker concurrency before splitting services.
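To get a feel for why tuning usually comes before splitting services at this volume, here is a back-of-envelope estimate using Little's law (average concurrency = arrival rate × duration). The call length and busy-hours figures are purely illustrative assumptions, not numbers from this thread, and the function name is made up.

```python
def average_concurrency(calls_per_day, avg_call_minutes, busy_hours_per_day):
    """Little's law: average concurrent calls = arrival rate x call duration."""
    calls_per_minute = calls_per_day / (busy_hours_per_day * 60)
    return calls_per_minute * avg_call_minutes


if __name__ == "__main__":
    # All inputs except calls_per_day are illustrative assumptions.
    estimate = average_concurrency(
        calls_per_day=100, avg_call_minutes=5, busy_hours_per_day=8
    )
    print(f"average concurrent calls: {estimate:.2f}")
```

With those assumed inputs the average works out to about one call in progress at a time. Peaks will be higher than the average, so plug in your own traffic pattern before sizing anything.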

@CWilson, @darryncampbell & @Muhammad_Usman_Bashir Based on your help and guidance, I was able to reduce the Docker image size to 1.2 GB. I tested it and it’s working fine. Thanks for the help, I really appreciate it.