This question originally came up in our Slack community and the thread has been consolidated here for long-term reference.
I’m looking for a real-time speech-to-text (STT) provider that can auto-detect the spoken language on the fly (no language hint at init) for a voice app.
Ideally, it should handle mid-utterance code-switching (e.g., Spanish ↔ English) with low latency.
What providers/models are people using with LiveKit today? I’m currently using Groq Whisper Large.