I’m troubleshooting a voice quality/input issue on inbound phone calls routed through Telnyx to a LiveKit voice agent (French language use case).
Symptoms:
Caller audio is often too low for reliable STT pickup unless the caller speaks loudly.
This is much more noticeable on real smartphone/PSTN calls than on desktop/local mic tests.
Current stack:
Telnyx SIP trunk → LiveKit → Deepgram STT
STT: deepgram nova-3 (fr)
VAD/turn detection: Silero VAD + turn_detection="vad"
What we already tested:
Removing noise cancellation on the LiveKit side improved local/desktop behavior.
Issue remains on telephony calls.
Are there recommended Telnyx settings to improve inbound speech level/clarity for AI voice agents?
Any known best-practice codec strategy for AI STT reliability on mobile/PSTN callers?
I assume you do not have noise cancellation enabled on the trunk, right?
There is only so much that can be done once we receive the audio. If there are too many PADs in the loop, it can be hard to recover an intelligible signal. You may want to address this with your trunk provider instead of trying to fix it after the loss.
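To check whether the level is already low before it reaches the agent, it can help to put a number on it. Below is a minimal sketch that computes the RMS level in dBFS of a chunk of 16-bit mono PCM. It assumes you have already captured and decoded the inbound leg to raw little-endian 16-bit samples (for example from a call recording); `rms_dbfs` is a hypothetical helper name, not part of any SDK.

```python
import math
import struct


def rms_dbfs(pcm16: bytes) -> float:
    """RMS level of little-endian 16-bit mono PCM, in dB relative to full scale."""
    count = len(pcm16) // 2
    if count == 0:
        return float("-inf")
    # Unpack the bytes into signed 16-bit samples, ignoring a trailing odd byte.
    samples = struct.unpack(f"<{count}h", pcm16[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count)
    if rms == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(rms / 32768.0)


# Example: a square wave at half of full scale sits about 6 dB below full scale.
half_scale = struct.pack("<4h", 16384, -16384, 16384, -16384)
print(round(rms_dbfs(half_scale), 2))
```

Comparing this figure for the same caller over different trunk settings gives a rough indication of whether the loss happens before the audio arrives.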
I ran more tests with Telnyx, and audio quality is noticeably better when I force G.711 only on the SIP connection.
So you were right, the issue seems to be happening upstream, likely during codec negotiation and/or transcoding, rather than something LiveKit can fully recover afterward.
I’m now investigating the trunk configuration first.
@Christophe_Chapiteau, your G.711 finding lines up with the standard recommendation for telephony AI agents. For French/EU PSTN, prefer PCMA (G.711 A-law) on the Telnyx trunk, not PCMU (G.711 μ-law). PCMU is the US default and on EU calls usually adds an extra transcode hop. Forcing PCMA-only in the Telnyx SIP Connection codec preferences keeps the path PSTN → Telnyx → LiveKit with no codec conversion, which is the cleanest signal Deepgram will see.
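To confirm which codec a call actually negotiated, you can look at the SDP in the SIP INVITE / 200 OK. Here is a small sketch that lists the audio codecs offered in an SDP body, in preference order. The static payload type numbers (0 = PCMU, 8 = PCMA) come from RFC 3551; the sample SDP and the `audio_codecs` helper are made up for illustration, and how you capture the SDP (SIP trace, pcap, provider debug tools) is up to your setup.

```python
# Static RTP payload types defined in RFC 3551 (subset relevant to telephony).
STATIC_PAYLOAD_TYPES = {0: "PCMU", 8: "PCMA", 9: "G722", 18: "G729"}


def audio_codecs(sdp: str) -> list[str]:
    """Return codec names from the first m=audio line, in preference order."""
    rtpmap: dict[int, str] = {}
    payloads: list[int] = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m=audio") and not payloads:
            # Format: m=audio <port> <proto> <payload types...>
            payloads = [int(pt) for pt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            # Format: a=rtpmap:<payload type> <encoding>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            rtpmap[int(pt)] = encoding.split("/")[0]
    return [
        rtpmap.get(pt) or STATIC_PAYLOAD_TYPES.get(pt, f"unknown({pt})")
        for pt in payloads
    ]


# Hypothetical SDP answer from a PCMA-only connection.
SAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
c=IN IP4 192.0.2.1
t=0 0
m=audio 40000 RTP/AVP 8 101
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
"""

print(audio_codecs(SAMPLE_SDP))  # ['PCMA', 'telephone-event']
```

If PCMU or another codec shows up first on the leg toward LiveKit, the trunk is probably still transcoding somewhere.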
On the VAD question: Silero is solid for telephony as a VAD. If you want stronger turn detection than VAD-only, switch turn_detection from "vad" to the MultilingualModel from livekit-plugins-turn-detector. It's a semantic, multilingual turn detector that handles French, and it should give fewer false interruptions on noisy lines than VAD-only.
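Roughly, the change looks like the sketch below. It assumes a recent livekit-agents 1.x install with the deepgram, silero and turn-detector plugins, and that the turn detector's model files have already been downloaded. `build_session` is just an illustrative wrapper, and the LLM/TTS arguments are left out because the thread does not say which ones are in use.

```python
from livekit.agents import AgentSession
from livekit.plugins import deepgram, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel


def build_session() -> AgentSession:
    """Build a session that keeps Silero VAD and adds the semantic turn detector."""
    return AgentSession(
        # Same STT as in the original setup: Deepgram nova-3, French.
        stt=deepgram.STT(model="nova-3", language="fr"),
        # Silero stays in place as the VAD.
        vad=silero.VAD.load(),
        # Replaces turn_detection="vad" with the multilingual turn detector.
        turn_detection=MultilingualModel(),
        # llm=... and tts=... omitted; keep whatever your agent already uses.
    )
```

Check the parameter names against the version you have installed, since the plugin APIs do change between releases.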