Realtime model with Azure Whisper STT

The Realtime model natively supports server-side OpenAI Whisper transcription, so I am a little unclear why you would want to use an external Azure Whisper. Does it perform significantly better than OpenAI Whisper?

We are also interested in using an external STT with the Realtime model for transcripts, given that more recent STT engines exist with higher transcription performance than Whisper, but ideally the STT would be an independent layer that does not affect the model itself. I think that is possible with LiveKit if we disable the Realtime model's server-side transcription. One concern we have with that idea, though, is that this blog post, Developer notes on the Realtime API, suggests the service relies on server-side transcription for cost efficiency:

The GA service will automatically drop some audio tokens when a transcript is available to save tokens.

So I think we would need both server-side Realtime Whisper transcription (for the model input and cost efficiency) and an external STT run by LiveKit. However, I don't think LiveKit is really built around the idea of two simultaneous STT engines like that, so I am currently thinking it just might not be possible in LiveKit as-is, as @darryncampbell wrote.
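
To make the "two simultaneous consumers" idea concrete, here is a rough sketch of fanning the same audio frames out to two independent sinks, one standing in for the Realtime model (with its own server-side transcription left on) and one standing in for an external STT used only for transcripts. This is plain asyncio, not LiveKit or OpenAI API code: fan_out, make_collector and run_demo are hypothetical names, and whether LiveKit exposes a hook where something like this could be attached is exactly the open question.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Optional

# Hypothetical types for illustration only; not LiveKit or OpenAI types.
AudioFrame = bytes
Consumer = Callable[[AudioFrame], Awaitable[None]]


async def fan_out(frames: Iterable[AudioFrame], consumers: List[Consumer]) -> None:
    """Deliver every frame to every consumer, each through its own queue,
    so a slow consumer does not block delivery to the others."""
    queues: List[asyncio.Queue] = [asyncio.Queue() for _ in consumers]

    async def pump(queue: asyncio.Queue, consumer: Consumer) -> None:
        while True:
            frame: Optional[AudioFrame] = await queue.get()
            if frame is None:  # sentinel: no more audio
                break
            await consumer(frame)

    workers = [
        asyncio.create_task(pump(q, c)) for q, c in zip(queues, consumers)
    ]
    for frame in frames:
        for q in queues:
            await q.put(frame)
    for q in queues:
        await q.put(None)
    await asyncio.gather(*workers)


def make_collector(sink: List[AudioFrame]) -> Consumer:
    """Stand-in consumer that just records the frames it receives."""
    async def consume(frame: AudioFrame) -> None:
        sink.append(frame)
    return consume


def run_demo():
    realtime_in: List[AudioFrame] = []  # stand-in for the Realtime model input
    external_in: List[AudioFrame] = []  # stand-in for the external STT input
    frames = [b"frame-1", b"frame-2", b"frame-3"]
    asyncio.run(
        fan_out(frames, [make_collector(realtime_in), make_collector(external_in)])
    )
    return realtime_in, external_in


if __name__ == "__main__":
    print(run_demo())
```

In a real setup the two collectors would be replaced by whatever pushes audio to the Realtime session and to the external STT stream, and the external transcript would only be logged or stored, never fed back to the model.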