Hi, I’m getting an average TTFT of ~1.12 s on gpt-4.1 calls using LiveKit Inference, but the benchmarks claim ~670 ms. Thank you in advance for your help!
Hi,
I’m looking at our internal LiveKit Inference dashboards for P50 TTFT on gpt-4.1, and the current figures are in line with the benchmarks.
Can you please share more about when and how you were testing? Ideally, let us know whether this was a one-off spike or whether you are seeing these figures sustained over time.
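One thing that may be worth checking on your side: you mention an average, while the dashboard figure above is a P50 (median), and the two can diverge when a few calls are slow. Below is a minimal sketch of how you could compare them from your own measurements. The function name `summarize_ttft` and the sample values are hypothetical and only for illustration; they are not taken from your logs or from our dashboards.

```python
import statistics


def summarize_ttft(samples_s):
    """Summarize time-to-first-token samples, given in seconds."""
    if not samples_s:
        raise ValueError("need at least one TTFT sample")
    return {
        "mean": statistics.fmean(samples_s),   # sensitive to slow outliers
        "p50": statistics.median(samples_s),   # what a P50 dashboard reports
        "max": max(samples_s),                 # helps spot a one-off spike
    }


# Made-up samples (seconds): four typical calls and one slow one.
example_samples = [0.6, 0.7, 0.7, 0.8, 2.8]
summary = summarize_ttft(example_samples)
print(summary)
```

With samples like these, a single slow call pulls the mean well above the median, so sharing both numbers (plus the time window) would help tell a spike apart from a sustained regression.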