High Latency (~2–3s) in LiveKit Voice Agent with Plivo

Hi team,

I’m using LiveKit 1.5 with Plivo for outbound voice calls and am seeing roughly 2–3 seconds of latency before the agent starts speaking.

I’ve already optimized the prompt size and reduced tokens significantly, but I’m still seeing a noticeable delay in the first response.

Is this expected with LiveKit + Plivo, or are there recommended optimizations (pipeline config, streaming, prefetch, etc.) to reduce initial response latency further?

Would appreciate any guidance or best practices 🙏

A 2–3s first-response delay is not inherently expected, but it’s common in STT → LLM → TTS pipelines over telephony due to accumulated latency (STT finalization + turn detection + LLM + first TTS chunk + SIP media path).
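To make the accumulation concrete, here is a rough latency-budget sketch in Python. The stage names mirror the list above, but every number is a hypothetical placeholder rather than a measurement from LiveKit or Plivo, so substitute values from your own traces.

```python
# Hypothetical per-stage latencies in milliseconds.
# These are illustrative placeholders, NOT measured LiveKit/Plivo values.
STAGE_LATENCY_MS = {
    "stt_finalization": 300,
    "turn_detection": 600,
    "llm_first_token": 700,
    "tts_first_chunk": 300,
    "sip_media_path": 200,
}


def summarize_budget(stages: dict[str, int]) -> tuple[int, str]:
    """Return the total latency and the name of the largest contributor."""
    total = sum(stages.values())
    slowest = max(stages, key=stages.get)
    return total, slowest


total_ms, slowest_stage = summarize_budget(STAGE_LATENCY_MS)
print(f"total: {total_ms} ms, largest contributor: {slowest_stage}")
```

With placeholder values like these, no single stage looks alarming on its own, yet the sum lands in the 2-second range, which is why per-stage measurement matters more than tuning any one component blindly.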

To reduce perceived latency:

  • Enable preemptive generation so LLM/TTS start before end-of-turn detection completes: see the Performance optimization section of Agent sessions and Preemptive speech generation.

  • If the delay occurs on initial connect, use instant connect (pre-connect audio) to buffer speech while the connection is being established: Instant connect.

  • Use Agent observability to measure where time is spent (STT vs LLM vs TTS).
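As a rough illustration of why preemptive generation helps, the sketch below compares a sequential pipeline with one where generation overlaps end-of-turn detection. It is a simplified model, not LiveKit's implementation: the function name `first_audio_delay_ms` and all timings are hypothetical, and it assumes generation can start as soon as the final transcript is available.

```python
def first_audio_delay_ms(
    turn_detection: int,
    llm_first_token: int,
    tts_first_chunk: int,
    preemptive: bool,
) -> int:
    """Delay from final transcript to first agent audio (simplified model)."""
    generation = llm_first_token + tts_first_chunk
    if preemptive:
        # Generation runs in parallel with end-of-turn detection,
        # so the slower of the two decides when audio can start.
        return max(turn_detection, generation)
    # Otherwise generation starts only after the turn is confirmed.
    return turn_detection + generation


# Hypothetical timings in milliseconds (placeholders, not measurements).
sequential = first_audio_delay_ms(600, 700, 300, preemptive=False)
overlapped = first_audio_delay_ms(600, 700, 300, preemptive=True)
print(f"sequential: {sequential} ms, overlapped: {overlapped} ms")
```

Under this model the saving is bounded by the end-of-turn detection time, so the observability data is still needed to see whether turn detection or generation dominates in your setup.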

Are you using the STT–LLM–TTS pipeline or a realtime speech-to-speech model, and which STT/TTS providers are configured?