Capturing Agent speech in on_enter of a Task in Tests

I’m writing E2E tests for a multi-task voice agent using AgentSession. My top-level agent sequentially awaits child tasks, each of which calls session.say() in its on_enter().

The test flow looks like:

initial_result = await session.start(agent, capture_run=True) # captures on_enter speech

turn1 = await session.run(user_input="…") # captures user input

After a task transition happens:

parent agent starts a new child task → on_enter() calls session.say("…")

turn2 = await session.run(user_input="…") # only captures user input, NOT the new task’s agent on_enter speech

The problem: When a task transition happens after session.run() completes, the new task’s on_enter() speech (via session.say()) is not captured in any RunResult. The RunResult from session.run() is already done before the transition, and the next session.run() only captures user input.

With session.start(agent, capture_run=True), the initial on_enter speech is captured. Is there an equivalent mechanism for task-to-task transitions that happen after session.run() completes?

There is nothing equivalent. The closest example I can point you to is our surf desk example, complex-agents/doheny-surf-desk/tests/test_agent.py in the livekit-examples/python-agents-examples repository on GitHub (main branch), but it doesn’t test E2E; it tests the tasks individually.

So I see you follow the same strategy by just capturing the user input in your tests:

# NameTask asks for name first
result1 = await session.run(user_input="My name is Alex Johnson")
result1.expect.next_event().is_function_call(name="record_name")

# Wait a bit for next task to start
await asyncio.sleep(0.1)

# PhoneTask asks for phone
result2 = await session.run(user_input="My phone is 949-555-1234")
result2.expect.next_event().is_function_call(name="record_phone")
result2.expect.skip_next_event_if(type="function_call_output")

# Confirm phone
await asyncio.sleep(0.1)
result2_confirm = await session.run(user_input="Yes, that's correct")
result2_confirm.expect.next_event().is_function_call(name="confirm_phone")

# AgeTask asks for age
await asyncio.sleep(0.1)
result3 = await session.run(user_input="I'm 25 years old")
result3.expect.next_event().is_function_call(name="record_age")
result3.expect.skip_next_event_if(type="function_call_output")

# Confirm age
await asyncio.sleep(0.1)
result3_confirm = await session.run(user_input="Yes, that's correct")
result3_confirm.expect.next_event().is_function_call(name="confirm_age")

We use session.say() at the beginning of each task. Is there a way to assert that it was called with a certain input?

Not with the evaluation framework; you would need an end-to-end audio test for that. There are many third-party frameworks available and we don’t recommend a specific one, but https://www.cekura.ai/ has been discussed a lot in the community (I have no experience with it myself).

Alternatively, outside of the evaluation framework, you could simply test that the string you pass as the greeting message contains the expected text:

def test_greeting_message():
    assert "hello" in get_greeting_message().lower()
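
If you want to go one step further and check that on_enter() actually passes that string to say(), one option outside the evaluation framework is a plain unit test with a mocked session. The sketch below is only an illustration under stated assumptions: NameTask and get_greeting_message are hypothetical stand-ins rather than LiveKit classes, the session is a unittest.mock.AsyncMock rather than a real AgentSession, and on_enter() is assumed to await say(). Whether your real task can be constructed with a mocked session depends on how it obtains the session, so treat this as a pattern to adapt and not as something the framework documents.

```python
import asyncio
from unittest.mock import AsyncMock


def get_greeting_message() -> str:
    # Hypothetical helper that returns the text the task speaks on entry.
    return "Hello! What is your name?"


class NameTask:
    """Hypothetical stand-in for a task; this is NOT a LiveKit class."""

    def __init__(self, session) -> None:
        self.session = session

    async def on_enter(self) -> None:
        # Assumption: the task awaits say() with the greeting text.
        await self.session.say(get_greeting_message())


def run_on_enter_with_mock_session() -> AsyncMock:
    # AsyncMock creates awaitable child attributes, so session.say can be awaited.
    session = AsyncMock()
    asyncio.run(NameTask(session).on_enter())
    return session


def test_on_enter_says_greeting() -> None:
    session = run_on_enter_with_mock_session()
    # Fails unless say() was awaited exactly once with exactly this text.
    session.say.assert_awaited_once_with(get_greeting_message())
```

If your on_enter() calls say() without awaiting it, a MagicMock session with assert_called_once_with would be the closer fit. Either way, this only verifies the text handed to say(), not that audio was actually produced.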