Bad LiveKit Inference TTFT for gpt-4.1

Douglas_Rocha · April 14, 2026, 8:11pm

Hi, I’m getting an average TTFT of ~1.12 s on gpt-4.1 calls using LiveKit Inference, but the benchmarks claim ~670 ms. Thank you in advance for your help!

darryncampbell · April 15, 2026, 7:33am

Hi,

I’m looking at our internal LiveKit Inference dashboards for the P50 TTFT of gpt-4.1, and the current figures are in the range of the benchmarks.

Can you please share more about when and how you were testing? Ideally, let us know whether this was a spike or whether you are seeing these figures sustained.
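
One thing that may help when comparing numbers: an average and a P50 can differ a lot if a few requests are slow. Below is a minimal sketch in plain Python, not tied to any LiveKit API, that summarizes a list of TTFT samples. The function name `summarize_ttft` and the sample values are hypothetical and only for illustration; substitute the per-request TTFT values you actually recorded.

```python
import math
import statistics


def summarize_ttft(samples_s):
    """Return the mean, median (P50), and nearest-rank P95 of TTFT samples in seconds."""
    if not samples_s:
        raise ValueError("need at least one TTFT sample")
    ordered = sorted(samples_s)
    # Nearest-rank P95: the smallest sample with at least 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "mean": statistics.mean(ordered),
        "p50": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


# Hypothetical samples (seconds): eight typical requests plus two slow outliers.
example_samples = [0.62, 0.65, 0.66, 0.68, 0.70, 0.71, 0.69, 0.64, 3.10, 2.75]
summary = summarize_ttft(example_samples)
print({name: round(value, 3) for name, value in summary.items()})
```

With these made-up samples, the mean comes out near 1.12 s while the P50 stays near 0.69 s, so two slow requests out of ten are enough to explain that kind of gap. If the P50 of your own samples is also well above the benchmark, that would point to a sustained difference rather than a spike.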
