TTS/STT Inference fails due to APIConnectionError with no clear error message

Since yesterday evening, without any code changes, TTS/STT inference has stopped working and now throws an unclear APIConnectionError.

I am using a basic AgentSession implementation:

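import { voice } from '@livekit/agents';
import * as livekit from '@livekit/agents-plugin-livekit';

// vad and sessionUserData are defined elsewhere in my agent code
const session = new voice.AgentSession({
  vad,
  stt: "assemblyai/universal-streaming:en",
  llm: "", // no setting, due to custom llm-node implementation
  tts: "inworld/inworld-tts-1.5-max:Ashley",
  turnDetection: new livekit.turnDetector.MultilingualModel(),
  userData: sessionUserData,
});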

I did some more testing yesterday evening, starting many short sessions one after another. I wonder whether I hit a rate limit or used up the basic inference credits. However, I have no way to see how many inference credits are left (I am on the free plan), and I do not get a clear rate-limit error message. I would appreciate any help.

Here is the error message for reference:

{“level”:20,“time”:1772209121989,“pid”:50728,“hostname”:“Joshuas-Macbook-Pro-14.local”,“name”:“lk-rtc”,“msg”:“Connect callback received”}
│[08:18:41.992] INFO (50728): Creating speech handle
│ speech_id: “speech_aad7be7c-a31”
│[08:18:42.000] INFO (50728): participantValue.trackPublications
│ participantValue: “user_389mw1IiueRgcTpfiFQA7dTLGr1”
│ trackPublications: [
│ {
│ “info”: {
│ “sid”: “TR_AMmAYFoJiV2jsU”,
│ “name”: “”,
│ “kind”: “KIND_AUDIO”,
│ “source”: “SOURCE_MICROPHONE”,
│ “simulcasted”: false,
│ “width”: 0,
│ “height”: 0,
│ “mimeType”: “audio/red”,
│ “muted”: false,
│ “remote”: true,
│ “encryptionType”: “NONE”,
│ “audioFeatures”:
│ },
│ “ffiHandle”: {},
│ “subscribed”: false
│ }
│ ]
│ lengthOfTrackPublications: 1
│[08:18:42.244] WARN (50728): failed to synthesize speech, retrying in 0.1s
│ tts: “inference.TTS”
│ attempt: 1
│ error: {
│ “body”: null,
│ “retryable”: true,
│ “name”: “APIConnectionError”
│ }
│[08:18:42.376] WARN (50728): failed to synthesize speech, retrying in 2000s
│ tts: “inference.TTS”
│ attempt: 2
│ error: {
│ “body”: null,
│ “retryable”: true,
│ “name”: “APIConnectionError”
│ }
│AI SDK Warning System: To turn off warning logging, set the AI_SDK_LOG_WARNINGS global to false.
│AI SDK Warning (google.generative-ai / gemini-2.5-flash): The feature “specificationVersion” is used in a compatibility mode. Using v2 specification compatibility mode. Some features may not be available.
│[08:18:43.565] WARN (50728): failed to recognize speech, retrying in 2000s
│ tts: “inference.STT”
│ attempt: 3
│ error: {
│ “body”: null,
│ “retryable”: true,
│ “name”: “APIConnectionError”
│ }
│[08:18:44.511] WARN (50728): failed to synthesize speech, retrying in 2000s
│ tts: “inference.TTS”
│ attempt: 3
│ error: {
│ “body”: null,
│ “retryable”: true,
│ “name”: “APIConnectionError”
│ }
│[08:18:45.701] ERROR (50728): AgentSession is closing due to unrecoverable error
│ type: “stt_error”
│ label: “inference.STT”
│ recoverable: false
│ error: {
│ “body”: null,
│ “retryable”: true,
│ “name”: “APIConnectionError”
│ }

Update: I found a more specific error message:
{“nestedErrorName”:“APIConnectionError”,“nestedErrorMessage”:“Error connecting to LiveKit WebSocket”},
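For anyone else trying to get at the nested message: the following is only a rough sketch of how an error object and its nested errors could be flattened into something loggable. The helper name `summarizeErrorChain` is hypothetical and not part of any LiveKit package, and the nesting fields it follows (`cause` and `error`) are assumptions about how the underlying error might be attached.

```typescript
// Hypothetical helper, not a LiveKit API: flattens an error and its nested
// errors into a list of plain summaries that can be logged as JSON.
interface ErrorSummary {
  name: string;
  message: string;
  retryable?: boolean;
}

function summarizeErrorChain(err: unknown, maxDepth = 5): ErrorSummary[] {
  const chain: ErrorSummary[] = [];
  let current: unknown = err;

  // maxDepth guards against circular references between nested errors.
  while (current !== null && typeof current === "object" && chain.length < maxDepth) {
    const e = current as Record<string, unknown>;
    chain.push({
      name: typeof e.name === "string" ? e.name : "UnknownError",
      message: typeof e.message === "string" ? e.message : "",
      // Only include the flag when the error actually carries one.
      ...(typeof e.retryable === "boolean" ? { retryable: e.retryable } : {}),
    });
    // Assumption: the underlying error hangs off `cause` or `error`.
    current = e.cause ?? e.error;
  }
  return chain;
}
```

Calling `console.log(JSON.stringify(summarizeErrorChain(error)))` wherever the error is caught should then print every level of the chain instead of only the outer name.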

I'm having the same issue with TTS as well; it's just not working.

This came up last week, and it turned out that the user had exceeded their inference credits.

@Joshua_Kraft looking at your project id p_xxxggazf, I can see that it has exceeded its concurrent STT quota (you can check this under project settings by scrolling to the bottom).

The error message is poor, and the team is working on improving it. Apologies for the inconvenience.
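If the quota is hit while starting many short test sessions back to back, one possible workaround is to cap how many sessions run at once on the client side. The following is only a minimal sketch under that assumption: `ConcurrencyLimiter` is a hypothetical helper, not a LiveKit API, and the right limit depends on the quota shown in your project settings.

```typescript
// Hypothetical helper: lets at most `limit` async tasks run at the same time.
class ConcurrencyLimiter {
  private readonly limit: number;
  private active = 0;
  private readonly waiting: Array<() => void> = [];

  constructor(limit: number) {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new RangeError("limit must be a positive integer");
    }
    this.limit = limit;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // All slots are taken: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // pass the slot directly to the next waiter
      } else {
        this.active--; // nobody is waiting, so free the slot
      }
    }
  }
}
```

Each test session would then be wrapped as `limiter.run(() => runOneSession())`, where `runOneSession` stands in for whatever function starts and closes a session in your test script.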

@aniket I can see that you have reached the limit on your concurrent TTS usage, so I imagine yours has the same root cause. You should also have received emails (possibly in spam) warning you that you were close to your limits.

Just to follow up: the team has implemented changes so that 'quota exceeded' errors should be more obvious going forward. Thanks again for reporting.