Hi team,
I’m using LiveKit 1.5 with Plivo for outbound voice calls and seeing about 2–3 seconds of latency before the agent starts speaking.
I’ve already optimized the prompt and reduced its token count significantly, but there is still a noticeable delay before the first response.
Is this expected with LiveKit + Plivo, or are there recommended optimizations (pipeline config, streaming, prefetch, etc.) to reduce initial response latency further?
I’d appreciate any guidance or best practices.
A 2–3s first-response delay is not inherently expected, but it’s common in STT → LLM → TTS pipelines over telephony due to accumulated latency (STT finalization + turn detection + LLM + first TTS chunk + SIP media path).
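To make the accumulation concrete, here is a minimal sketch of a latency budget that adds up the stages listed above. The per-stage numbers are hypothetical placeholders, not measurements from LiveKit or Plivo; swap in your own figures.

```python
# Hypothetical per-stage latencies in milliseconds (placeholders, not measured).
STAGES_MS = {
    "stt_finalization": 300,
    "turn_detection": 500,
    "llm_first_token": 600,
    "tts_first_chunk": 300,
    "sip_media_path": 200,
}


def total_latency_ms(stages):
    """Sum sequential stage latencies into one first-response delay."""
    return sum(stages.values())


def share_by_stage(stages):
    """Return each stage's fraction of the total delay."""
    total = total_latency_ms(stages)
    return {name: ms / total for name, ms in stages.items()}


if __name__ == "__main__":
    print("total ms:", total_latency_ms(STAGES_MS))
    for name, share in share_by_stage(STAGES_MS).items():
        print(f"{name}: {share:.0%}")
```

Even with each stage well under a second, the sum in this example lands close to two seconds, which is why trimming the prompt alone often does not move the total much.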
To reduce perceived latency:
- Enable preemptive generation so LLM/TTS start before end-of-turn detection completes: see the Performance optimization section of Agent sessions and Preemptive speech generation.
- If this is on initial connect, use instant connect (pre-connect audio) to buffer speech while connecting: Instant connect.
- Use Agent observability to measure where time is spent (STT vs LLM vs TTS).
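As a rough illustration of why preemptive generation helps, the sketch below compares a fully sequential pipeline with one where the LLM and TTS run while end-of-turn detection is still deciding. This is a simplified model with hypothetical numbers and function names; it does not call any LiveKit API, and real savings depend on your providers and on how often a preemptive result has to be discarded.

```python
def sequential_ms(stt, turn, llm, tts):
    """Each stage waits for the previous one to finish."""
    return stt + turn + llm + tts


def preemptive_ms(stt, turn, llm, tts):
    """LLM and TTS overlap with turn detection (simplified model).

    The reply can start only when both turn detection and the
    LLM + first TTS chunk are done, so the slower branch decides.
    """
    return stt + max(turn, llm + tts)


def slowest_stage(stages):
    """Return the name of the stage with the largest measured time."""
    return max(stages, key=stages.get)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds; replace with measured values.
    timings = {"stt": 300, "turn": 500, "llm": 600, "tts": 300}
    print("sequential:", sequential_ms(**timings))
    print("preemptive:", preemptive_ms(**timings))
    print("slowest stage:", slowest_stage(timings))
```

Feeding in the per-stage numbers you get from observability shows which stage to tune first and roughly how much the overlap can save.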
Are you using the STT–LLM–TTS pipeline or a realtime speech-to-speech model, and which STT/TTS providers are configured?