How to run a custom speech detection model on LiveKit Cloud

This question originally came up in our Slack community and the thread has been consolidated here for long-term reference.

We’re interested in developing a device-directed speech detection model and running it on our LiveKit Cloud server, possibly with similar performance requirements to the turn detection model. Is it possible to bring our own model to run in LiveKit Cloud?

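You can use a custom speech detection model by overriding the stt_node method on your agent. The node receives the incoming audio frames before speech-to-text runs, so it is a natural place to run your own detector.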
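As a rough illustration of the idea, here is a minimal sketch of the gating logic that such an stt_node override could apply. It deliberately uses stand-in types so it runs without the SDK: `Frame`, `is_device_directed`, and `gate_audio` are hypothetical names invented for this example, not part of LiveKit, and the amplitude threshold is only a placeholder for a real model.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterable, AsyncIterator, Callable, List


@dataclass
class Frame:
    """Stand-in for an audio frame (hypothetical; not a LiveKit type)."""
    samples: List[float]


def is_device_directed(frame: Frame) -> bool:
    """Placeholder detector (assumption): replace with your own model.

    Here a frame counts as device-directed when its mean absolute
    amplitude exceeds 0.1.
    """
    if not frame.samples:
        return False
    return sum(abs(s) for s in frame.samples) / len(frame.samples) > 0.1


async def gate_audio(
    audio: AsyncIterable[Frame],
    detector: Callable[[Frame], bool],
) -> AsyncIterator[Frame]:
    """Forward only the frames that the detector accepts."""
    async for frame in audio:
        if detector(frame):
            yield frame


async def _demo() -> List[Frame]:
    async def source() -> AsyncIterator[Frame]:
        # Two quiet frames and two louder ones.
        for samples in ([0.0, 0.01], [0.5, -0.4], [0.02, 0.0], [0.3, 0.3]):
            yield Frame(samples)

    return [f async for f in gate_audio(source(), is_device_directed)]


kept = asyncio.run(_demo())
print(len(kept))  # number of frames that passed the gate
```

In a real agent, one option would be to wrap the audio stream passed to stt_node with a filter like `gate_audio` and then hand the filtered stream to the default STT node (in the Python Agents SDK this is exposed as `Agent.default.stt_node`), so that only device-directed speech is transcribed. Check the exact signature against the SDK version you are using.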

Overriding stt_node lets you integrate your own speech detection pipeline while still leveraging LiveKit’s infrastructure.