Trying to understand how the cloud limits on STT and TTS reset

Kiyado_Labs · May 13, 2026, 8:20am

In my LiveKit project, I had my agent workers running locally, and I may not have cleaned up socket connections properly. I shut down all development VPS instances and local agent workers to eliminate WebSocket leaks, but unfortunately my project still shows all five of the five available as used. I’m just starting the development and testing phases, and while I could switch to a different plan, I’d like to stay on the build plan during this phase.

Could anyone help me with this situation?

Thanks.

darryncampbell · May 13, 2026, 9:53am

I believe you are referring to the LiveKit Inference concurrency limit listed on the LiveKit pricing page. This allows for 5 concurrent connections to model providers through LiveKit Inference, so if you had STT, TTS and LLM all provided through LK Inference, that would be 3 concurrent connections. If you launched another session at the same time, this would max out your connections on the free tier.

After your agent leaves the room (or your participant leaves the room, causing the room to shut down and the agent to leave), these concurrent connections are freed automatically. You should not need to worry about any WebSocket connections yourself, as that is handled by LiveKit’s Agent framework.
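As a rough sketch of the arithmetic in the reply above: it assumes every component routed through LiveKit Inference holds one connection per session and that all of them count against a single limit of 5. The function name and this counting model are illustrative assumptions, not LiveKit's actual accounting.

```python
def sessions_that_fit(limit: int, connections_per_session: int) -> int:
    """Return how many sessions fit entirely under a concurrency limit.

    Assumption: each session holds a fixed number of connections and
    every connection counts equally against the same limit.
    """
    if connections_per_session <= 0:
        raise ValueError("connections_per_session must be positive")
    return limit // connections_per_session


# STT + TTS + LLM through LiveKit Inference = 3 connections per session.
print(sessions_that_fit(limit=5, connections_per_session=3))  # 1
```

Under these assumptions a second simultaneous session would need 6 connections in total, which is why it runs into the limit.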

Kiyado_Labs · May 13, 2026, 10:26am

Thanks for the quick reply. However, the LK Inference limits don’t seem to be freed automatically, unless that only happens after a specific time frame. It’s been 12+ hours and the limit still shows as used.

Type	Limit	Peak usage (past 7 days)
Concurrent participants	100	2
Concurrent Egress requests	2	0
Concurrent Ingress requests	2	0
Concurrent agent sessions	5	0
Agents deployed on LiveKit Cloud	1	1
Concurrent STT	5	5
Concurrent TTS	5	7
LLM requests per minute	100	31
LLM tokens per minute	600,000	39,863

This is the project ID, p_3ugvi54jk02, if that helps.

darryncampbell · May 13, 2026, 10:31am

That table shows peak usage over the past 7 days; it’s not realtime.

The best place to monitor current usage is the sessions dashboard, https://cloud.livekit.io/projects/p_/sessions (it shows all your sessions as closed).

You can also use the LiveKit CLI to list the current sessions (rooms):

lk room list

Kiyado_Labs · May 13, 2026, 12:02pm

It seems that the gateway at wss://agent-gateway.livekit.cloud/v1/stt rejects the connection with 429 (no inference credits). Is there no way to top up these credits without switching to another plan? Can you please advise on this?

Thank you

Muhammad_Usman_Bashir · May 13, 2026, 2:48pm

Cc: @Kiyado_Labs

@darryncampbell or @CWilson should weigh in on whether Build credits can be topped up without changing plans (commercial question).

For the technical alternative: you can skip LiveKit Inference entirely and plug provider SDKs directly with your own API keys (OpenAI, Deepgram, Cartesia, ElevenLabs, etc.). The agent framework supports all major providers natively; you pay the provider directly, no agent-gateway.livekit.cloud credit gate. Useful for dev/testing where the included Inference credits drain quickly.
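When switching to direct provider keys, a small preflight check before starting the agent can catch a missing key early. The sketch below is only an illustration: the helper name is hypothetical, and the environment variable names are assumptions that should be checked against each provider plugin's documentation.

```python
import os
from typing import Iterable, Mapping

# Assumed environment variable names, one per provider.
# Verify these against the plugin documentation before relying on them.
PROVIDER_KEYS = {
    "deepgram": "DEEPGRAM_API_KEY",
    "openai": "OPENAI_API_KEY",
    "cartesia": "CARTESIA_API_KEY",
}


def missing_provider_keys(
    providers: Iterable[str], env: Mapping[str, str] = os.environ
) -> list[str]:
    """Return the env var names that are unset or empty for the given providers."""
    return [PROVIDER_KEYS[p] for p in providers if not env.get(PROVIDER_KEYS[p])]


# Example with an explicit mapping instead of the real environment.
example_env = {"DEEPGRAM_API_KEY": "dg-example"}
print(missing_provider_keys(["deepgram", "openai"], example_env))
# ['OPENAI_API_KEY']
```

Passing the environment as a parameter keeps the check easy to test without touching real credentials.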

darryncampbell · May 13, 2026, 3:01pm

Understood.

No, the thinking behind the build plan is to make it as easy for developers to get started as possible, and that includes not requiring you to enter a credit card.

The amount of free Inference credits we provide on build is in line with the free allowance you would get if you went to the model providers directly.

Users on the ship plan or higher, where we have a credit card on file, can go over the free limit and are charged accordingly, but there is no option to add a credit card to the build tier.

Kiyado_Labs · May 15, 2026, 11:10am

Yep, I have been using them as a fallback. Thanks for letting me know.

Kiyado_Labs · May 15, 2026, 11:13am

I’m not sure that’s really good developer UX. Free credits can be exhausted pretty quickly after just a couple of test runs. Most developers wouldn’t mind topping up inference credits, and being able to do so while still on the build plan would be a much better UX. Yes, I understand we can always fall back to the native adapters.

Kiyado_Labs · May 15, 2026, 11:26am

At least allow the option to add a debit or credit card to the build plan so inference continues to work after the $2.50 of free credits runs out. For a developer who wants to go through the test, break, test cycle, the build plan is the best option in my opinion. Having to pay $50 per month for the ship plan, with its team collaboration features, isn’t worth it, especially since the only additional free inference credits are $5.00, if I’m not mistaken. Restricting billing to the ship and scale plans isn’t the best option for someone who wants to test the features without upgrading to a different plan.

Muhammad_Usman_Bashir · May 15, 2026, 6:29pm

@Kiyado_Labs, one hybrid worth trying if you still want some LK Inference exposure on Build: move STT and TTS to direct provider keys (Deepgram, Cartesia, ElevenLabs, etc.) and keep LLM on LK Inference.

STT and TTS hold persistent WebSocket connections for the whole session, so they dominate your inference concurrency. LLM calls are bursty HTTP turns. Pushing the chatty streams off LK Inference onto provider free tiers gives the most headroom per Build credit while still letting you test the LK routing layer.
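To make the trade-off concrete, here is a minimal sketch that counts how many persistent streams a session would hold on LiveKit Inference under a given routing plan. The `Route` enum, the plan dictionaries, and the helper are hypothetical names for illustration, and the model assumes (as described above) that only STT and TTS hold a connection for the whole session.

```python
from enum import Enum


class Route(Enum):
    INFERENCE = "livekit-inference"
    DIRECT = "direct-provider"


# Assumption from the post above: STT and TTS keep a WebSocket open for
# the whole session, while LLM calls are short-lived HTTP turns.
STREAMING_COMPONENTS = {"stt", "tts"}


def persistent_inference_streams(plan: dict[str, Route]) -> int:
    """Count session-long streams that a plan routes through LiveKit Inference."""
    return sum(
        1
        for component, route in plan.items()
        if component in STREAMING_COMPONENTS and route is Route.INFERENCE
    )


all_inference = {"stt": Route.INFERENCE, "llm": Route.INFERENCE, "tts": Route.INFERENCE}
hybrid = {"stt": Route.DIRECT, "llm": Route.INFERENCE, "tts": Route.DIRECT}

print(persistent_inference_streams(all_inference))  # 2
print(persistent_inference_streams(hybrid))  # 0
```

In this simplified model the hybrid plan leaves no session-long streams on LiveKit Inference, so only the bursty LLM requests draw on the Build allowance.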

darryncampbell · May 27, 2026, 7:09am

Hi @Kiyado_Labs, sorry I missed your replies; I was on PTO. I hear what you are saying, and it makes total sense. Although there are no immediate plans to change our billing structure, I’ll advocate for this option when the discussion does come up, since I’m sure others will be in your position.
