I’ve noticed that the LLM only sometimes says something before or while it makes a tool call, and when it does, the speech seems to run in parallel with the call. Is this true, and is it the default behavior?
I would think that this should be a setting in the Agent class, something like “always say something as you make a tool call” (← as a bool toggle).
Is this only something that you can put in the system prompt, and just hope that your LLM conforms?
Some LLMs are better at tool calling than others, and will follow your instructions better.
Rather than modifying the prompt to instruct your LLM to “say something”, it would be more reliable to instruct the LLM to “say nothing” in your tool description, then add a session.say() at the beginning of the tool body so you can guarantee something will be said.
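A minimal sketch of that pattern, so the ordering is concrete. StubSession and StubContext are hypothetical stand-ins (not LiveKit classes) so the snippet runs without the SDK; in a real agent the tool would presumably be registered as a function tool and receive the RunContext, with context.session.say() going through TTS. The tool name lookup_order and the spoken phrase are also made up for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class StubSession:
    """Hypothetical stand-in for the agent session; records what would be spoken."""
    spoken: list = field(default_factory=list)

    def say(self, text: str) -> None:
        # A real session.say() would queue speech through TTS; here we only record it.
        self.spoken.append(text)


@dataclass
class StubContext:
    """Hypothetical stand-in for the context object passed to a tool."""
    session: StubSession


async def lookup_order(context: StubContext, order_id: str) -> str:
    """Look up an order. Do not say anything before calling this tool."""
    # Speak first, so the user hears something no matter what the LLM emitted.
    context.session.say("One moment while I look that up.")
    await asyncio.sleep(0)  # placeholder for the real (slow) lookup
    return f"Order {order_id} has shipped."


session = StubSession()
result = asyncio.run(lookup_order(StubContext(session), "A123"))
```

The docstring doubles as the tool description here, which is where the “say nothing” instruction would go, so the model’s own pre-speech and the hard-coded line don’t overlap.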
Hmm, would it be feasible in the future to always force the LLM to generate pre-speech via a toggle on the LiveKit end, or would that just be a wrapper around generate_reply? Or is the answer that this is, and always will be, an LLM plugin-side thing?
Bumping this. I really think this should be a tool-level toggle so we don’t have to rely on prompt engineering. It’s especially important when the tool returns a large body of text that we can’t send to session.say().
By default, LiveKit does not enforce any “pre-speech before tool call” behavior. Tool calling is driven by the LLM via the llm_node, which can yield plain text, a tool call, or both in the same streamed turn. If the model emits text before (or alongside) a tool call, that text goes through TTS; if it emits only a tool call, nothing is spoken unless you explicitly say something. This is model behavior, not a LiveKit toggle. See Tool definition & use and Pipeline nodes & hooks.
There is currently no built-in tool-level boolean like “always speak before the tool.” The reliable pattern is to speak from inside the tool using session.say() or session.generate_reply() (available via RunContext) so you deterministically control pre-speech. Speech control primitives are documented in Agent speech and audio.
If you want this enforced globally, you can override llm_node() to intercept tool calls and inject session.say() before executing them.
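A rough sketch of the interception logic, written against a hypothetical Chunk type rather than LiveKit’s actual chunk classes so that it runs standalone. The idea is to pass the stream through unchanged, and only inject a filler line when a tool call arrives and no text came before it. In a real llm_node() override you would presumably wrap the stream from the default node in the same way, or call session.say() instead of yielding text; the chunk shape, the ensure_pre_speech name, and the filler phrase are all assumptions.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for one streamed LLM chunk."""
    text: Optional[str] = None
    tool_call: Optional[str] = None  # tool name, if this chunk carries a tool call


async def ensure_pre_speech(
    stream: AsyncIterator[Chunk], filler: str
) -> AsyncIterator[Chunk]:
    """Pass chunks through, injecting filler before a tool call that had no text before it."""
    spoke = False
    async for chunk in stream:
        if chunk.text:
            spoke = True
        if chunk.tool_call and not spoke:
            # The model went straight to a tool call, so speak on its behalf.
            yield Chunk(text=filler)
            spoke = True
        yield chunk


async def fake_llm(chunks):
    # Simulates a streamed LLM turn.
    for c in chunks:
        yield c


async def collect(chunks):
    return [c async for c in ensure_pre_speech(fake_llm(chunks), "Let me check that.")]


silent = asyncio.run(collect([Chunk(tool_call="lookup_order")]))
chatty = asyncio.run(collect([Chunk(text="Sure."), Chunk(tool_call="lookup_order")]))
```

In the silent case the filler is emitted ahead of the tool call; in the chatty case the stream is left alone, so the model’s own pre-speech isn’t duplicated.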