Agent didn't follow up after tool call

In one session, the agent didn’t follow up after calling a tool: it said it would do something, invoked the tool, then went silent. Any attempt to nudge it with another message produced the same behavior again: it said it would do something, called the tool, and then didn’t follow up.

I couldn’t reproduce this in other sessions, but I suspect it’s likely to happen again, even if rarely.

Any idea why this happened?

I’m using xAI Realtime with the Node SDK.

Agent id: A_9ZooAu8swT7M

Session id: RM_Z3H9LfYTWbzT

@royibernthal
This is likely the behavioral difference between xAI Realtime and OpenAI Realtime in how post-tool turns are triggered.

With xAI (and other non-OpenAI realtime models), LiveKit swaps tools for the reply then restores them, but that restoration doesn’t automatically trigger a follow-up turn the way OpenAI Realtime does. Grok just goes silent if it doesn’t self-initiate.

Two things to try: explicitly call generate_reply() (generateReply() in the Node SDK) after your tool returns, and check max_tool_steps. If the limit is hit and the final LLM call produces no audio with xAI, you’d see exactly this pattern.

The inconsistency across sessions points to model-side behavior rather than your code.

We won’t have any helpful information on our side, as we don’t store your agent’s logs.

Have a look at your agent logs to see what may have happened:

It may also be helpful to check Agent Insights, if you have that enabled.

@royibernthal, the agents framework handles the multi-step case explicitly: after tool execution, if num_steps >= max_tool_steps + 1, it forces tool_choice="none" on the follow-up call to “guarantee a final text response instead of silently stopping” [ agent_activity.py:2855-2914 ].
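
To make that rule concrete, here is a minimal TypeScript sketch of the step-limit check as described above. It is not the framework's source: the function name, the ToolChoice type, and the "auto" value for the non-forced case are assumptions made for illustration.

```typescript
// Illustrative only: mirrors the rule as described in this thread, not the
// framework's actual code. All names here are hypothetical.
type ToolChoice = "auto" | "none";

function followUpToolChoice(numSteps: number, maxToolSteps: number): ToolChoice {
  // Once the step count passes the limit, tools are disabled so the model
  // has to answer directly instead of calling another tool.
  return numSteps >= maxToolSteps + 1 ? "none" : "auto";
}

// With a hypothetical limit of 3, steps 1 to 3 may still call tools;
// from step 4 onward the follow-up call is forced to tool_choice="none".
for (let step = 1; step <= 5; step++) {
  console.log(`step ${step}: tool_choice=${followUpToolChoice(step, 3)}`);
}
```

If your tool chains regularly reach the forced step, that is the turn to inspect for missing audio.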

xAI Realtime under that tool_choice="none" constraint can return text without producing audio. That fits your symptom: tool fires, then silence. The intermittency would line up with whether the specific turn happens to hit max_steps_reached or draining state.

In practice, that suggests two things: raise max_tool_steps if your tool chains are deeper than the default, and add a defensive watchdog on the post-tool turn:

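  let nudged = false
  session.on(voice.AgentSessionEventTypes.MetricsCollected, (ev) => {
    // assumption: metrics carry a `type` tag; skip STT/TTS/VAD metrics
    if (ev.metrics.type !== 'realtime_model_metrics') return
    if (ev.metrics.outputTokens > 0) {
      nudged = false // a normal turn re-arms the watchdog
    } else if (!nudged) {
      // the post-tool turn produced no output: nudge once, not in a loop
      nudged = true
      session.generateReply()
    }
  })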

That same handler doubles as confirmation if you’re trying to reproduce: outputTokens=0 on the post-tool RealtimeModelMetrics confirms the empty-audio path.