High latency (5-8 seconds) with Google Gemini Realtime plugin over SIP

LiveKit-Community · January 21, 2026, 3:52pm

This question originally came up in our Slack community and the thread has been consolidated here for long-term reference.

I’m working on a voice assistant using LiveKit SIP (with Twilio) and the Google Gemini Realtime plugin. Calls connect successfully and conversation flows, but I’m facing 5-8 second latency for agent replies.

I’ve tested both self-hosted (Docker Compose) and LiveKit Cloud - latency is the same on both.

My stack:

Model: gemini-2.5-flash-native-audio-preview via google.realtime
SIP: Twilio Elastic SIP Trunk

Optimizations tried:

Added silero.VAD for faster speech-end detection
Added turn_detector.EnglishModel() for turn-taking
Reduced min_silence_duration to 0.4

Has anyone else experienced this delay with the Gemini Realtime plugin?

LiveKit-Community · January 21, 2026, 3:52pm

Others have experienced latency with Gemini Realtime as well. Some observations:

Timeout errors: Check your logs. The Gemini Realtime plugin can hit timeout errors frequently, and each one adds to the lag. This is more common on the free tier of the Gemini API.

Tool calls: If you’re using tool calls, they might be slowing things down.

Alternative models: For comparison, GPT-4o and GPT-4o-mini can achieve sub-1000ms latency. While Gemini is good, it’s currently not as reliable or consistent as OpenAI’s models in terms of latency and response quality.

VAD tuning: Your approach of reducing min_silence_duration is correct. You might also experiment with other VAD parameters.
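
To put the VAD tuning in perspective, here is a rough back-of-the-envelope sketch using the figures from this thread. It assumes the VAD silence window is the only delay that VAD tuning can remove, which is a simplification, and the function name `unexplained_latency` is just an illustrative helper, not part of any LiveKit API.

```python
def unexplained_latency(observed_s: float, min_silence_duration_s: float) -> float:
    """Return the part of the observed reply delay that the VAD
    silence window alone cannot account for (never negative)."""
    if observed_s < 0 or min_silence_duration_s < 0:
        raise ValueError("durations must be non-negative")
    return max(observed_s - min_silence_duration_s, 0.0)


# Figures quoted in this thread: 5-8 s observed delay,
# min_silence_duration of 0.4 s (first post) and 0.25 s (later post).
for observed in (5.0, 8.0):
    for silence in (0.4, 0.25):
        rest = unexplained_latency(observed, silence)
        print(f"observed={observed:.1f}s silence={silence:.2f}s unexplained={rest:.2f}s")
```

Even with the silence window removed entirely, more than 4.5 seconds of the reported delay would remain, so the bulk of it most likely sits elsewhere (model response time, timeouts, tool calls).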

krishna.kanjani · March 9, 2026, 9:31am

Hello, I have been facing the same issue for a long time now. Can you please suggest what other parameters can be used to fix this? Or is this expected behavior?

My stack:

Model: gemini-2.5-flash-native-audio-preview-12-2025 via google.realtime
SIP: Twilio Elastic SIP Trunk and Plivo SIP Trunk

Optimizations tried:

Added silero.VAD for faster speech-end detection with min_speech_duration=0.1, min_silence_duration=0.25, activation_threshold=0.5, deactivation_threshold=0.40, prefix_padding_duration=0.0

darryncampbell · April 8, 2026, 12:59pm

This question has a lot of views, so I will update the answer.

The best general resource for understanding and improving agent latency is this blog:

To address some of the specifics in the question:

The original question, under ‘optimizations tried’, implies the OP is using LiveKit’s turn detection. Gemini, like other Realtime models, has its own built-in turn detection which should be used unless there is a good reason you need a separate turn detection model: Gemini Live API plugin | LiveKit Documentation
I have seen reports that the provider tools in Gemini Live can add latency, so it is worth testing without those.
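
As a minimal sketch of the first point, this is roughly what a session looks like when it relies on Gemini's built-in turn detection, i.e. with no separate VAD or turn detector model passed in. The model name comes from this thread; the voice name and the `build_session` helper are assumptions for illustration, and the plugin expects Google credentials (for example `GOOGLE_API_KEY`) in the environment. Check the plugin documentation for the options available in your version.

```python
# Model name taken from this thread; the voice is an illustrative assumption.
MODEL_NAME = "gemini-2.5-flash-native-audio-preview-12-2025"
VOICE = "Puck"


def build_session():
    """Build an AgentSession that leaves turn detection to the realtime model."""
    # Imported inside the function so this file can be loaded and inspected
    # on a machine where the LiveKit plugins are not installed.
    from livekit.agents import AgentSession
    from livekit.plugins import google

    # No vad= or turn_detection= arguments are passed, so no separate
    # LiveKit turn detector model is layered on top of Gemini here.
    return AgentSession(
        llm=google.realtime.RealtimeModel(model=MODEL_NAME, voice=VOICE),
    )
```

Comparing reply times between this setup and the one with silero.VAD plus turn_detector.EnglishModel() should show whether the extra turn detection is contributing to the delay.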
