Hi all,
I’m seeing an intermittent but severe failure in a Python LiveKit Agents deployment and wanted to check whether anyone else has run into this.
Setup:
- LiveKit Agents Python SDK 1.5.2
- Python worker
- ai-coustics noise cancellation enabled with the Quail model
What happens:
- A session is running normally
- The last normal log is typically something like the user beginning to speak
- After that, there is a long gap with no STT events, no transcription events, and no downstream activity
- Eventually the worker is killed by the supervisor with logs like:
  process is unresponsive, killing process
  exit code -10
What makes this concerning is that the failure appears to happen before STT is activated, so it looks like the audio input pipeline is being blocked upstream, rather than an STT/LLM/TTS component raising an ordinary exception. There is no useful traceback when it happens.
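To try to capture a traceback the next time this happens, I'm planning to add something like the following. It is only a minimal sketch using the standard library's faulthandler module. The name stall_watchdog and the 5 second threshold are my own choices, and I'm assuming it can be started as a background task from the job's async entrypoint. As long as the event loop keeps running, the timer is re-armed and nothing is printed. If the loop stalls for longer than the threshold, faulthandler's own watchdog thread dumps the stack of every Python thread.

```python
import asyncio
import faulthandler
import sys


async def stall_watchdog(timeout_s: float = 5.0, file=sys.stderr) -> None:
    """Dump all thread stacks if the event loop does not run for timeout_s.

    Hypothetical helper, not part of LiveKit Agents or ai-coustics.
    """
    try:
        while True:
            # Re-arming replaces the previous timer, so the dump only fires
            # when this coroutine has not been scheduled for timeout_s.
            faulthandler.dump_traceback_later(timeout_s, file=file)
            await asyncio.sleep(timeout_s / 4)
    finally:
        # Disarm the timer when the task is cancelled or the job ends.
        faulthandler.cancel_dump_traceback_later()
```

Usage would be roughly `asyncio.create_task(stall_watchdog())` at the start of the session. The dump shows Python frames only, so if the stall is inside native code it should at least show which Python call entered it.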
Current hypothesis:
I suspect the issue may be related to ai-coustics Quail in the input pipeline, possibly blocking or stalling audio processing under some conditions. I’m removing ai-coustics for now to see whether the issue disappears.
Questions:
- Has anyone seen ai-coustics / Quail cause worker hangs or audio pipeline stalls in Python agents?
- Are there recommended timeout / watchdog / fallback patterns for the audio input path?
- If this is not likely ai-coustics, are there other parts of the pre-STT pipeline I should inspect first?
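To make the timeout / fallback question concrete, this is roughly the pattern I have in mind. It is a sketch under assumptions, not a tested fix: GuardedProcessor and the process callable are hypothetical stand-ins for whatever synchronous call performs the noise cancellation, and none of this is an ai-coustics or LiveKit API. The idea is to run the blocking step in a thread with a deadline, pass the raw frame through on timeout, and bypass the step entirely after several consecutive timeouts.

```python
import asyncio
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class GuardedProcessor(Generic[T]):
    """Run a blocking per-frame function off the event loop with a deadline."""

    def __init__(
        self,
        process: Callable[[T], T],  # hypothetical blocking enhancement call
        timeout_s: float = 0.1,
        max_timeouts: int = 3,
    ) -> None:
        self._process = process
        self._timeout_s = timeout_s
        self._max_timeouts = max_timeouts
        self._timeouts = 0  # consecutive timeouts seen so far
        self.bypassed = False

    async def __call__(self, frame: T) -> T:
        if self.bypassed:
            return frame  # fallback: pass raw audio through untouched
        try:
            out = await asyncio.wait_for(
                asyncio.to_thread(self._process, frame), self._timeout_s
            )
        except asyncio.TimeoutError:
            # The worker thread cannot be cancelled and may still be running;
            # we only stop waiting for it.
            self._timeouts += 1
            if self._timeouts >= self._max_timeouts:
                self.bypassed = True
            return frame
        self._timeouts = 0
        return out
```

I realize this would not help if the stall happens inside native code that holds the GIL, or inside plugin code that I cannot wrap, which is part of why I'm asking whether there is a recommended pattern.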
Any guidance would be really appreciated. This is intermittent but production-impacting because it causes the whole worker handling the session to die.