Gpt-realtime-2 set reasoning_effort to none or very low

schedawg74 · May 8, 2026, 12:59pm

Now that there’s support for thinking realtime models like gpt-realtime-2 and gpt-realtime-1.5 (thanks!), we would like to use them.

In our tests, however, the thinking models always add seconds of latency for reasoning that the task doesn’t need. So we’d like to set reasoning_effort to 0 / none, as we already do with Gemini native audio via thinking_budget.

Is there a similar option? I can’t seem to find it. If not, is this something on the roadmap?

Thanks!

Muhammad_Usman_Bashir · May 8, 2026, 1:12pm

No equivalent today. livekit-plugins-openai’s RealtimeModel.__init__ on main (v1.5.8) doesn’t expose reasoning_effort, and the session-update payload sent to OpenAI is built without a hook for arbitrary fields.

This would make a reasonable enhancement request. The closest open issue is livekit/agents issue #5684 on GitHub (Support gpt-realtime-2, filed today). It’s worth either commenting there to ask for reasoning_effort exposure specifically, or opening a separate enhancement issue.

Subclassing RealtimeModel and overriding the session-init payload would work as a stopgap, but it’s brittle and wouldn’t ship long-term.
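
To make the stopgap idea concrete, here is a rough sketch of the subclass-and-override pattern. It deliberately does not import the plugin: StandInRealtimeModel and build_session_update are hypothetical stand-ins, because the real RealtimeModel does not expose a payload hook like this. The reasoning_effort field name and its accepted values are also assumptions taken from the question, not something confirmed against the OpenAI API.

```python
from typing import Any


class StandInRealtimeModel:
    """Hypothetical stand-in for the plugin's RealtimeModel (not the real class)."""

    def __init__(self, model: str) -> None:
        self.model = model

    def build_session_update(self) -> dict[str, Any]:
        # Hypothetical hook: the real plugin builds this payload internally
        # and offers no public method to override.
        return {"type": "session.update", "session": {"model": self.model}}


class LowEffortRealtimeModel(StandInRealtimeModel):
    """Subclass that injects one extra field into the session payload."""

    def __init__(self, model: str, reasoning_effort: str) -> None:
        super().__init__(model)
        self.reasoning_effort = reasoning_effort

    def build_session_update(self) -> dict[str, Any]:
        payload = super().build_session_update()
        # Assumed field name and value, unverified against the OpenAI API.
        payload["session"]["reasoning_effort"] = self.reasoning_effort
        return payload


payload = LowEffortRealtimeModel("gpt-realtime-2", "low").build_session_update()
```

In the real plugin the override would have to target whatever internal method assembles the session update, which is exactly why this approach can break on any release.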
