LiveKit inference for Gemini 3.1 Flash Lite: when?

Gemini 3.1 Flash Lite is 2.5x faster than Gemini 2.5 Flash and also scores higher on intelligence benchmarks. It would be nice to have 3.1 Flash Lite available for its time-to-first-token response and overall speed. Colocating it with the voice agent would make the STT → LLM → TTS → telephony pipeline even lower latency.

https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-lite/

You can access https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-lite-preview through the Google plugin today:

from livekit.plugins import google

llm = google.LLM(
    model="gemini-3.1-flash-lite-preview",
)

It is not yet available through LiveKit inference, but it should be soon.

We are waiting for that.

It’s live btw :slight_smile: Though it seems like it might be adding a bit of latency rather than decreasing it. I’m not sure yet; need to do more testing.
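One way to make that testing concrete is to time how long the first token takes to arrive from each backend. Here's a minimal, backend-agnostic sketch (the helper name and the fake stream are made up for illustration; plug in a real LLM token stream to compare):

```python
import time
from typing import Iterable, Iterator, Tuple

def time_to_first_token(stream: Iterable[str]) -> Tuple[float, Iterator[str]]:
    """Return (seconds until the first chunk arrives, the full stream replayed)."""
    it = iter(stream)
    start = time.perf_counter()
    first = next(it)  # blocks until the source emits its first token
    elapsed = time.perf_counter() - start

    def replay() -> Iterator[str]:
        yield first
        yield from it

    return elapsed, replay()

# Fake stream that delays its first token, standing in for a real LLM response:
def fake_stream() -> Iterator[str]:
    time.sleep(0.05)
    yield "Hello"
    yield " world"

ttft, tokens = time_to_first_token(fake_stream())
text = "".join(tokens)
print(f"TTFT: {ttft:.3f}s, text: {text!r}")
```

Running the same measurement against the plugin path and the inference path (same prompt, a few dozen trials each) should show whether inference is actually adding latency.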