How to dynamically switch TTS language based on LLM response

This question originally came up in our Slack community and the thread has been consolidated here for long-term reference.

I’m building a multilingual voice agent using LiveKit agents with LangChain + ElevenLabs TTS. The agent needs to dynamically switch TTS voice/language based on what language the LLM responds in (e.g., user speaks Spanish → LLM replies in Spanish → TTS should use Spanish voice).

My current approach is polling the LangGraph state every 0.5s for new AI messages, running language detection on them, and calling tts.update_options(). But this creates a race condition: TTS often starts speaking before the monitor detects the language change.

Is there a cleaner way to hook into the TTS pipeline before audio generation starts?

Override the tts_node method in your agent. The tts_node receives an AsyncIterable[str] of text chunks from the LLM. You can process this stream to detect language before forwarding to TTS:

# Assumes langdetect for detection (pip install langdetect); swap in your
# preferred detector. update_options() is provided by the ElevenLabs TTS plugin.
from langdetect import detect, LangDetectException

async def tts_node(self, text: AsyncIterable[str], model_settings: ModelSettings):
    async def process_text():
        accumulated_text = ""
        language_checked = False
        async for chunk in text:
            accumulated_text += chunk

            # Detect language once per response, as soon as there is
            # enough text for detection to be reliable
            if not language_checked and len(accumulated_text.strip()) >= 20:  # threshold
                language_checked = True
                try:
                    detected_lang = detect(accumulated_text)  # e.g. "es"
                except LangDetectException:
                    detected_lang = self.current_language  # too ambiguous; keep current voice
                if (detected_lang != self.current_language
                        and detected_lang in LANGUAGE_VOICE_MAPPING):
                    self.tts.update_options(
                        language=detected_lang,
                        voice_id=LANGUAGE_VOICE_MAPPING[detected_lang],
                    )
                    self.current_language = detected_lang

            yield chunk

    return Agent.default.tts_node(self, process_text(), model_settings)
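Factoring the buffer-and-switch decision out of tts_node makes it easy to unit-test without a live pipeline. A minimal sketch, assuming a pluggable detect_fn (langdetect, lingua, or anything else); LanguageSwitcher and the voice IDs in LANGUAGE_VOICE_MAPPING are illustrative placeholders, not part of the LiveKit or ElevenLabs APIs:

```python
from typing import Callable, Optional

# Placeholder voice IDs; substitute your real ElevenLabs voice IDs.
LANGUAGE_VOICE_MAPPING = {
    "en": "voice-en-001",
    "es": "voice-es-001",
    "fr": "voice-fr-001",
}

class LanguageSwitcher:
    """Accumulates streamed text and decides, once per response,
    whether the TTS voice should switch."""

    def __init__(self, detect_fn: Callable[[str], str],
                 initial_language: str = "en", threshold: int = 20):
        self.detect_fn = detect_fn          # any str -> language-code function
        self.current_language = initial_language
        self.threshold = threshold          # min chars before detecting
        self._buffer = ""
        self._decided = False

    def feed(self, chunk: str) -> Optional[str]:
        """Feed one text chunk; return a voice_id if a switch is needed."""
        if self._decided:
            return None
        self._buffer += chunk
        if len(self._buffer.strip()) < self.threshold:
            return None                     # not enough text yet
        self._decided = True                # decide only once per response
        lang = self.detect_fn(self._buffer)
        if lang != self.current_language and lang in LANGUAGE_VOICE_MAPPING:
            self.current_language = lang
            return LANGUAGE_VOICE_MAPPING[lang]
        return None
```

In tts_node you would call feed() on each chunk and invoke tts.update_options() whenever it returns a voice ID; the once-per-response flag is what prevents repeated update_options calls mid-utterance.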
