How to fix high latency (~3 secs) with Gemini 3.1 Flash Live

If you use the Gemini + LiveKit plugin directly, you’ll likely experience high latency. In my case, it was around 3-3.5 seconds.

After deeper investigation, I found that TTFT (time to first token) was the main issue. It was sitting at ~2 to 2.5 seconds, which is extremely high.

Digging further, the root cause turned out to be enabled tools, specifically the Google Search tool. These tools significantly slow down the agent.

The fix is simple once you identify it:

  • Remove search tools

The documentation currently has no clear mention that these tools can cause such high latency, which makes this issue harder to diagnose.
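If you are not sure which tool is responsible, one way to narrow it down is to disable tools one at a time and compare time to first token. Here is a minimal sketch of that idea. It is not LiveKit or Gemini API code: the `measure` callable is hypothetical and stands in for however you start a session with a given set of tools and time the first token, and the tool names are only labels.

```python
from statistics import median
from typing import Callable, Iterable

# Hypothetical signature: takes the set of enabled tool names and
# returns one time-to-first-token sample in seconds.
MeasureFn = Callable[[frozenset], float]


def ablate_tools(tools: Iterable[str], measure: MeasureFn, runs: int = 5) -> dict:
    """Return the median seconds saved by disabling each tool on its own."""
    all_tools = frozenset(tools)

    def median_ttft(enabled: frozenset) -> float:
        # Take several samples, since single measurements are noisy.
        return median(measure(enabled) for _ in range(runs))

    baseline = median_ttft(all_tools)
    # A positive value means the first token arrived sooner without that tool.
    return {
        tool: baseline - median_ttft(all_tools - {tool})
        for tool in sorted(all_tools)
    }
```

A tool whose removal saves a couple of seconds while the others save roughly nothing is the likely culprit.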

Additionally, I’ve seen some discussions suggesting that disabling noise cancellation may help. I haven’t fully verified this, but removing search tools definitely has a major impact.

After removing them:

  • Time to first token dropped from ~2–2.5 seconds to ~100 ms (overall latency is now 1 to 1.5 seconds)

The improvement is this large because the model is speech-to-speech and can start generating output tokens before the speaker finishes speaking. In some cases, this can even result in negative latency values (when latency is measured from the end of the user's speech), which is ideal for real-time systems.
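To make the negative latency point concrete, here is a small sketch of how the number can be computed, assuming latency is measured from the end of the user's speech to the first model output. The `Turn` type and its field names are made up for illustration and do not come from LiveKit or Gemini.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    # Timestamps in seconds on a shared clock (illustrative field names).
    user_speech_end: float
    first_output: float


def response_latency(turn: Turn) -> float:
    """Seconds from the end of user speech to the first model output.

    The result is negative when the model started producing output
    before the user finished speaking.
    """
    return turn.first_output - turn.user_speech_end
```

For example, a turn where the user stops at 10.0 s and output starts at 10.1 s gives about 0.1 s, while output starting at 9.8 s gives a negative value.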

Conclusion

To significantly reduce latency:

  • Remove search tools

  • Optionally test disabling noise cancellation

This alone can dramatically improve responsiveness.

Search tools being the cause makes sense (see Google Gemini LLM | LiveKit Documentation). I can imagine scenarios where the tool is overly aggressive and performs too many calls, or unnecessary ones. I haven’t tried it myself, and they are still marked as experimental.

I don’t think noise cancellation would be a contributor though. We actually state in the docs (for Krisp) that there is ‘…negligible impact on audio latency…’. You do say this is anecdotal though, so I’ll keep an eye out for other reports.