Best STT Alternative to OpenAI whisper-1 for Japanese in LiveKit

Hi,

I want to ask about best practices for STT in LiveKit. We currently use OpenAI whisper-1 (realtime) as the STT model in our LiveKit agent to transcribe Japanese utterances, but we sometimes experience delays. Because of that, we’re planning to replace whisper-1 with another model. What would be the best choice? Does anyone have experience with this?

I don’t have experience with Japanese, but I would definitely try Soniox because, in my opinion, it’s the best STT out there for non-English languages.

You can try Deepgram Nova-3 and Nova-2. You can also try NVIDIA Riva.
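
Since the original problem is occasional delay, it may help to time each candidate on your own Japanese recordings before switching. Below is a minimal sketch of such a comparison. It does not call any real SDK: `Transcriber`, `measure_latency`, and `compare` are hypothetical names, and each candidate is assumed to be wrapped in a function that takes audio bytes and returns a transcript.

```python
import statistics
import time
from typing import Callable, Dict, List

# Hypothetical signature: a wrapper you write around each provider's client.
# It takes raw audio bytes and returns the transcript text.
Transcriber = Callable[[bytes], str]


def measure_latency(transcribe: Transcriber, clips: List[bytes]) -> Dict[str, float]:
    """Time one call per clip and return median and worst-case latency in seconds."""
    timings = []
    for clip in clips:
        start = time.perf_counter()
        transcribe(clip)
        timings.append(time.perf_counter() - start)
    return {"median": statistics.median(timings), "max": max(timings)}


def compare(candidates: Dict[str, Transcriber], clips: List[bytes]) -> List[str]:
    """Return candidate names ordered from lowest to highest median latency."""
    results = {name: measure_latency(fn, clips) for name, fn in candidates.items()}
    return sorted(results, key=lambda name: results[name]["median"])
```

The `max` value is worth checking alongside the median, because delays that only happen sometimes show up in the worst case rather than the typical case. This only measures speed, so transcript accuracy on Japanese still needs to be checked separately.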