Adaptive interruption for self-hosted deployments

For self-hosted deployments, LiveKit documents the following:

  • Adaptive interruption uses LiveKit Cloud inference.

  • Self-hosted usage includes 40,000 inference requests/month.

  • For higher volume, they ask you to contact sales.

The docs do not currently say what happens once the quota is exhausted: is it a hard error or a silent fallback? Can someone confirm what happens when the quota is reached? Does the agent switch to normal VAD-based interruption handling?

If an agent exceeds the included limit, it will no longer have adaptive interruption handling. The agent will gracefully fall back to non-adaptive interruption handling, which is VAD-based.

Good callout on the doc gap. We will add a clarification to the docs.
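
For anyone who wants to estimate how soon a self-hosted agent would hit the limit, here is a minimal sketch in Python that models the behavior described above. It is not a LiveKit API: `InterruptionQuota` and `days_until_exhausted` are hypothetical names, the counter is purely local bookkeeping, and what exactly counts as one "inference request" is an assumption, since the thread does not define it. The only figure taken from the thread is the 40,000 requests/month allowance.

```python
from dataclasses import dataclass

# Monthly allowance quoted in this thread for self-hosted usage.
INCLUDED_REQUESTS_PER_MONTH = 40_000


@dataclass
class InterruptionQuota:
    """Hypothetical local counter; not part of any LiveKit SDK."""

    limit: int = INCLUDED_REQUESTS_PER_MONTH
    used: int = 0

    @property
    def remaining(self) -> int:
        """Requests left before the modeled fallback kicks in."""
        return self.limit - self.used

    def record_request(self) -> str:
        """Count one request and return the mode that would apply to it.

        Returns "adaptive" while quota remains and "vad" afterwards,
        mirroring the graceful fallback described in the reply above.
        """
        if self.used < self.limit:
            self.used += 1
            return "adaptive"
        return "vad"


def days_until_exhausted(
    requests_per_day: float, limit: int = INCLUDED_REQUESTS_PER_MONTH
) -> float:
    """Estimate how many days the allowance lasts at a steady request rate."""
    if requests_per_day <= 0:
        raise ValueError("requests_per_day must be positive")
    return limit / requests_per_day


if __name__ == "__main__":
    # Assumed traffic figure, for illustration only.
    print(days_until_exhausted(2_000))
```

At an assumed 2,000 requests per day, the allowance would last 20 days, after which the agent would run on VAD-based interruption handling for the rest of the month.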