Our LLM TTFT + TTS TTFB usually averages around 1.2 seconds, but the e2e latency is still much higher. I suspect this is because of the end-of-turn delay, which is mostly around 500 ms on average.
I tried reducing it with the following settings:
turn_handling=TurnHandlingOptions(
    turn_detection=MultilingualModel(),
    endpointing={
        "mode": "fixed",
        "min_delay": 0.2,
        "max_delay": 1.2,
    },
    interruption={
        "enabled": True,
        "resume_false_interruption": True,
        "mode": "adaptive",
        "min_words": 2,
    },
    preemptive_generation={"enabled": True},
),
I expected the end-of-turn delay to drop by 200-300 ms, but it is still around 500 ms.
These settings are passed to the agent session, and I am using Silero VAD with min_silence_duration = 0.2.
Chat metrics data for one user turn:
"end_of_turn_delay": 0.5877835750579834,
"started_speaking_at": 1781695592.438063,
"stopped_speaking_at": 1781695593.151138,
"transcription_delay": 0.5335736274719238,
"on_user_turn_completed_delay": 0.00003577399979803886
Chat metrics data for the AI turn right after it:
"e2e_latency": 1.832134485244751,
"llm_node_ttft": 0.8538708429998678,
"tts_node_ttfb": 0.38110785300000316,
"playback_latency": 0.00008678436279296875,
"started_speaking_at": 1781695594.9832726,
"stopped_speaking_at": 1781695596.3034112
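As a sanity check on the numbers above, here is a rough breakdown in plain Python. It is only a sketch: it assumes that e2e_latency is approximately end_of_turn_delay + llm_node_ttft + tts_node_ttfb, and that it corresponds to the wall-clock gap between the user's stopped_speaking_at and the agent's started_speaking_at. The breakdown helper and its key names are made up for illustration and are not part of any library.

```python
# Values copied from the two metrics dumps above.
user_turn = {
    "end_of_turn_delay": 0.5877835750579834,
    "transcription_delay": 0.5335736274719238,
    "stopped_speaking_at": 1781695593.151138,
}
agent_turn = {
    "e2e_latency": 1.832134485244751,
    "llm_node_ttft": 0.8538708429998678,
    "tts_node_ttfb": 0.38110785300000316,
    "started_speaking_at": 1781695594.9832726,
}


def breakdown(user, agent):
    """Hypothetical helper: split e2e latency into its (assumed) stages."""
    summed = (
        user["end_of_turn_delay"]
        + agent["llm_node_ttft"]
        + agent["tts_node_ttfb"]
    )
    return {
        # Time from the user going silent to the agent starting to speak.
        "wall_clock_gap": agent["started_speaking_at"] - user["stopped_speaking_at"],
        # Sum of the three stages assumed to make up e2e latency.
        "sum_of_stages": summed,
        # Whatever the three stages do not explain.
        "unaccounted": agent["e2e_latency"] - summed,
        # Portion of the end-of-turn delay not covered by transcription delay.
        "eot_beyond_transcript": user["end_of_turn_delay"] - user["transcription_delay"],
    }


result = breakdown(user_turn, agent_turn)
for name, seconds in result.items():
    print(f"{name}: {seconds * 1000:.0f} ms")
```

With these values the three stages sum to about 1.82 s against the reported 1.83 s, and end_of_turn_delay exceeds transcription_delay by only about 54 ms. If I am reading the metrics right, that would suggest the end-of-turn delay here is dominated by waiting for the transcript rather than by min_delay, but that is an interpretation and not something I have confirmed.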
I am using Deepgram Nova for STT.
Am I missing anything, or what should I do to make the e2e latency as low as possible? From what I see, reducing the end-of-turn delay would significantly reduce e2e latency, but I am not sure why the changes above have no effect.
I am using livekit-agents with the turn detector, version 1.6.0.