Now that there’s support for thinking realtime models like gpt-realtime-2 and gpt-realtime-1.5 (thanks!), we would like to use them.
In our tests, however, the thinking models always add seconds of latency by reasoning on tasks that don’t need it. So we’d like to set reasoning_effort to 0 / none, like we already do with Gemini native audio via thinking_budget.
Is there a similar option? I can’t seem to find one. If not, is this something on the roadmap?
Thanks!
No equivalent today. livekit-plugins-openai’s RealtimeModel.__init__ on main (v1.5.8) doesn’t expose reasoning_effort, and the session-update payload sent to OpenAI is built without a hook for arbitrary fields.
This would make a reasonable enhancement request. The closest open issue is “Support gpt-realtime-2” (Issue #5684 in livekit/agents on GitHub, filed today). It’s worth either commenting there to ask specifically for reasoning_effort exposure, or opening a separate enhancement issue.
Subclassing RealtimeModel and overriding the session-init payload would work as a stopgap, but it’s brittle and isn’t a long-term solution.
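To make the stopgap idea concrete, here is a rough, self-contained sketch of the pattern. It does not use the real plugin: BaseRealtimeModel and build_session_update are hypothetical stand-ins (the actual RealtimeModel builds its payload internally and the method names will differ), and the reasoning_effort field name and its placement in the session object are assumptions that would need checking against what the OpenAI Realtime API actually accepts.

```python
from typing import Any, Dict


class BaseRealtimeModel:
    """Hypothetical stand-in for the plugin's RealtimeModel; not the real class."""

    def __init__(self, model: str) -> None:
        self.model = model

    def build_session_update(self) -> Dict[str, Any]:
        # Hypothetical method name: the real plugin assembles this payload internally.
        return {"type": "session.update", "session": {"model": self.model}}


class LowReasoningRealtimeModel(BaseRealtimeModel):
    """Subclass that injects one extra field into the session-init payload."""

    def __init__(self, model: str, reasoning_effort: str) -> None:
        super().__init__(model)
        self.reasoning_effort = reasoning_effort

    def build_session_update(self) -> Dict[str, Any]:
        payload = super().build_session_update()
        # Assumed field name and location; verify against the API before relying on it.
        payload["session"]["reasoning_effort"] = self.reasoning_effort
        return payload


payload = LowReasoningRealtimeModel("gpt-realtime-2", "none").build_session_update()
print(payload)
```

The brittleness comes from the override point: if the plugin changes how it builds the payload, a subclass like this can silently stop applying the field.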