Lowest latency STT/TTS/LLM stack for German - what's your experience?

Hey everyone,

We have a voice AI pipeline running on LiveKit Agents that works incredibly well for English:

  • STT: Deepgram Nova-3
  • TTS: Cartesia Sonic Turbo
  • LLM: GPT-4o Mini

Latency is fantastic in English. Now we need to support German and we’re not sure how well this exact stack holds up for German specifically.

A few things we’d love to hear about:

  • Anyone running Deepgram Nova-3 for German? How’s the accuracy and latency compared to English? Any issues with compound words, umlauts, or mixed language input?
  • How’s Cartesia Sonic Turbo’s German pronunciation quality? Natural sounding or robotic?
  • For the LLM side, is GPT-4o Mini solid enough for German conversations or did you find a better alternative?
  • Has anyone tried a completely different stack for German that gave better results? (Soniox, ElevenLabs, different models, etc.)
  • What E2E latency numbers are you seeing with German?

Our top priority is minimum latency. Any real-world experience or benchmarks would be super helpful. Thanks!


I would suggest keeping the same stack and testing it with real calls. The stack is not the only variable for latency; it depends on many factors. Good observability and iterative tuning are what I would recommend. We have a German agent deployed in production that uses Azure STT, gpt-4.1-mini, and Cartesia TTS.
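
For the observability part, here is a rough sketch of per-stage latency tracking in plain Python. It is not tied to LiveKit Agents or any provider SDK. The `LatencyTracker` class and the stage names (`stt_final`, `llm_first_token`, `tts_first_audio`) are made up for illustration, so you would need to wire them into whatever hooks or events your own pipeline exposes.

```python
import math
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import median


class LatencyTracker:
    """Collects per-stage timings in milliseconds across conversation turns.

    Hypothetical helper, not part of any SDK.
    """

    def __init__(self):
        self.samples = defaultdict(list)

    def record(self, stage, ms):
        # Store one measurement for a stage, e.g. "stt_final".
        self.samples[stage].append(ms)

    @contextmanager
    def measure(self, stage):
        # Time the wrapped block with a monotonic clock and record it.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(stage, (time.perf_counter() - start) * 1000.0)

    def summary(self):
        # Report count, median, and nearest-rank p95 for each stage.
        out = {}
        for stage, values in self.samples.items():
            ordered = sorted(values)
            idx = max(0, math.ceil(0.95 * len(ordered)) - 1)
            out[stage] = {
                "n": len(ordered),
                "p50": median(ordered),
                "p95": ordered[idx],
            }
        return out


if __name__ == "__main__":
    tracker = LatencyTracker()
    # Assumed stage name; wrap the real call in your own pipeline here.
    with tracker.measure("llm_first_token"):
        time.sleep(0.01)  # stand-in for waiting on the first LLM token
    print(tracker.summary())
```

Comparing p50 and p95 per stage for English and German calls separately should show whether STT, the LLM, or TTS is where the German path gets slower, before swapping out any provider.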