Latency issue: how to fix this?

I am using Twilio, LiveKit SIP, and Sarvam for a voice agent. The agent pauses for about 3 seconds before it speaks. How can I fix this?

I am working on a comprehensive guide to identifying and resolving agent latency - unfortunately the first draft was too comprehensive so I’m still working on it.

In the meantime, the best advice is here: Frequently Asked Questions (FAQ) - #13 by darryncampbell

Latency tuning for voice pipelines is a system dynamics problem; the “optimal” configuration can only be found through systematic measurement and iteration. There are too many variables for a one-size-fits-all answer: each stage needs tuning based on the provider, model, and parameters you use, and on the context of your use case, e.g., whether it’s outbound or inbound, the language nuances, code-switching requirements, background noise variance, RAG workflow, shorter or longer responses, etc.

You have to systematically tune, observe, and evaluate. Per-turn metrics are a must. At a minimum you need to measure: VAD, EOU, STT, LLM (TTFT and total), and TTS (TTFB and total).
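As a rough illustration of the per-turn measurements listed above, here is a minimal Python sketch. The `TurnMetrics` class and its field names are hypothetical and not tied to any particular SDK; the sketch assumes you can capture each timing yourself, and it approximates the perceived response latency as end-of-utterance delay plus LLM TTFT plus TTS TTFB.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TurnMetrics:
    """Timings for one conversation turn, in seconds (hypothetical field names)."""
    eou_delay: float   # end of user speech -> turn committed (VAD/EOU wait)
    stt_delay: float   # end of user speech -> final transcript available
    llm_ttft: float    # LLM request sent -> first token received
    llm_total: float   # LLM request sent -> last token received
    tts_ttfb: float    # TTS request sent -> first audio byte received
    tts_total: float   # TTS request sent -> last audio byte received

    def response_latency(self) -> float:
        """Approximate time from end of user speech to first agent audio."""
        return self.eou_delay + self.llm_ttft + self.tts_ttfb


def summarize(turns: list[TurnMetrics]) -> dict[str, float]:
    """Median and worst-case response latency across a set of turns."""
    latencies = sorted(t.response_latency() for t in turns)
    return {"median": median(latencies), "worst": latencies[-1]}
```

Logging one such record per turn, and looking at the median and the worst case rather than a single call, makes it easier to tell whether a configuration change actually helped.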

I have a post planned on our learnings specifically tuning latency for Voice Agents - I’ll link it here once I publish it.

@Raghu_Udiyar I would love to see a preview :slight_smile:

@Raghu_Udiyar I would also be very interested :eyes: :slight_smile:

We’re still in the process of drafting it - we have a series of seven posts planned covering every aspect, and the first two are live here: What It Takes to Build Production-Grade Voice AI Agents | ByondLabs

Hope this helps - would love any feedback!

Thanks for sharing, I always enjoy reading through the experiences of companies who have deployed voice agents at scale in production.

I enjoyed reading this. Thanks for sharing knowledge with the world. Is there a way to subscribe for new posts in the series? If not (or either way), could you post here when a new one is posted? :slight_smile:

Thank you! The next post on Latency is here: Voice Agent Latency: The Sub-Second Tuning Playbook | ByondLabs - you can also subscribe to future posts.

Great post on latency. We also turn off noise cancellation to help with latency.

I am stuck at 3 seconds of latency on each turn. I tried different OpenAI models and also tested the Groq models for the LLM. For STT I tested Cartesia and Deepgram, and for TTS I tested OpenAI, Cartesia, and ElevenLabs, but I am still stuck at a 3-second delay on each conversation turn. Noise cancellation and the turn detector are toggled off.

I forgot about this thread. I did post my blog last month: Understand and Improve Agent Latency | LiveKit. I am currently working on a video on the same topic.

@Naveed_Ur_Rehman please take a look and see if anything there is helpful. Definitely start by using Agent Observability to understand which part of your pipeline is introducing the e2e latency.
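One way to act on this is to break a turn's end-to-end latency into its stages and see which one dominates. Below is a minimal sketch, assuming you have already collected per-stage timings in seconds (for example from your observability tooling). The stage names and the `slowest_stage` helper are made up for illustration, and the sample values are not real measurements.

```python
def slowest_stage(stages: dict[str, float]) -> tuple[str, float]:
    """Return the slowest stage and its share of the summed latency."""
    total = sum(stages.values())
    if total <= 0:
        raise ValueError("stage timings must sum to a positive value")
    name = max(stages, key=stages.get)
    return name, stages[name] / total


# Illustrative values only, not real measurements.
turn = {"eou_delay": 0.6, "llm_ttft": 0.5, "tts_ttfb": 1.9}
name, share = slowest_stage(turn)
print(f"{name} accounts for {share:.0%} of the turn latency")
```

If one stage consistently takes most of the turn, swapping providers or models for the other stages is unlikely to change the overall delay much.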

Thank you so much for the reply, Darryn. I will definitely look into it.

Thanks. When I switched to Rime TTS, and also ElevenLabs TTS, it worked: I got about 1 second of latency, which is fine. I had removed noise cancellation and the turn detector, and it still works fine now that the turn detector has been added back. The OpenAI TTS model, however, still has 3 seconds of latency.
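For anyone comparing TTS providers in a similar way, time to first audio chunk is the number to compare. Here is a minimal sketch, assuming each provider can be consumed as an async stream of audio bytes. `fake_tts` is a stand-in with made-up delays, not a real provider client, and `time_to_first_chunk` is a hypothetical helper.

```python
import asyncio
import time
from typing import AsyncIterator


async def time_to_first_chunk(stream: AsyncIterator[bytes]) -> float:
    """Seconds from starting to consume the stream until its first chunk arrives."""
    start = time.perf_counter()
    async for _chunk in stream:
        return time.perf_counter() - start
    raise RuntimeError("stream ended without producing audio")


async def fake_tts(delay: float) -> AsyncIterator[bytes]:
    """Stand-in for a real TTS stream: waits `delay` seconds, then yields audio."""
    await asyncio.sleep(delay)
    yield b"\x00" * 320
    yield b"\x00" * 320


async def main() -> dict[str, float]:
    # Hypothetical providers with made-up delays, measured one after another.
    providers = {"provider_a": 0.05, "provider_b": 0.20}
    return {name: await time_to_first_chunk(fake_tts(d)) for name, d in providers.items()}


if __name__ == "__main__":
    for name, ttfb in asyncio.run(main()).items():
        print(f"{name}: first audio after {ttfb:.2f}s")
```

Replacing `fake_tts` with a real streaming call and repeating the measurement several times per provider gives a fairer comparison than a single run.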