Inconsistent transcript language when using Gemini realtime model (gemini-live-2.5-flash-native-audio)

When using the Gemini realtime native audio model, the transcript occasionally returns text in non-English languages even though:

The user is speaking English

The model is configured with language="en-US"

This happens intermittently within the same session and breaks downstream processing that expects a consistently English transcript.

Realtime model configuration:

from livekit.plugins import google

google.realtime.RealtimeModel(
    model="gemini-live-2.5-flash-native-audio",
    voice="Puck",
    temperature=0.3,
    language="en-US",
    location="us-central1",
    vertexai=True,
)

I remember this issue came up previously, but for OpenAI realtime. At the time, the solution was to provide instructions saying to “use English”, and OpenAI had documented that as a recommendation for realtime prompting (link - under the ‘Language Constraint’ section). I can’t find any equivalent advice for Gemini, but it’s probably worth doing.
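
Here is a minimal sketch of that suggestion, assuming the language constraint is simply appended to whatever system instructions you already use. The helper name `with_language_constraint`, the `LANGUAGE_NAMES` mapping, and the exact wording of the constraint are all hypothetical; none of this comes from Gemini or LiveKit documentation, and I have not confirmed that it changes the transcript language for the native audio model.

```python
# Hypothetical helper: the names and the constraint wording are illustrative,
# not taken from Gemini or LiveKit documentation.
LANGUAGE_NAMES = {"en-US": "English"}


def with_language_constraint(base_instructions: str, language_code: str = "en-US") -> str:
    """Append an explicit language constraint to the system instructions."""
    # Fall back to the raw language code if no friendly name is known.
    language = LANGUAGE_NAMES.get(language_code, language_code)
    constraint = (
        f"Use {language} only. "
        f"Do not respond in any other language, even if the user's speech is unclear."
    )
    return f"{base_instructions.strip()}\n\n{constraint}"


instructions = with_language_constraint("You are a helpful voice assistant.")
print(instructions)

# Assumed usage (check against your LiveKit Agents version):
# pass `instructions` to your agent, e.g. Agent(instructions=instructions),
# alongside the RealtimeModel configuration from the original post.
```

Keeping the constraint in one place makes it easy to adjust the wording if the first version does not help.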

Okay, thank you. Will try this.

This is a model issue and is not related to the framework or other LiveKit functionality.

https://github.com/livekit/livekit/issues/4338