Agent Session Say method with local audio files

Antonio_Palomino · February 9, 2026, 2:58pm

Hi everyone,

I’m using a custom behaviour in my python server where I handle a text input from my client, then I make a request to another server (of my own) and this response I send it back to the client using AgentSession.say() to generate a video and audio response with LiveAvatar. My problem appear when I implement a way to use the say method with a local mp3 file (I use ffmpeg and pydub AudioSegment to convert from mp3 to PCM and rtc.AudioFrame), if I await by the next way it just never finish the task and block me from start a new task using say method (even when I implemented a AgentSession.interrupt before the say). This is my current implementation:

def mp3_to_audio_frame(path: str)-> tuple[bytes, int, int, int]:
    #MP3 to mono, 16-bit, PCM, 48kHz
    audio = AudioSegment.from_mp3(path)
    audio = audio.set_channels(1)
    audio = audio.set_sample_width(2)
    audio = audio.set_frame_rate(48000)

    raw_data = audio.raw_data
    num_channels = 1
    samples_per_channel = len(raw_data) // (2 * num_channels)
    return raw_data, 48000, 1, samples_per_channel

sync def single_shot_audio(audio_path: str):
    loop = asyncio.get_event_loop()
    raw, sample_rate, num_channels, samples_per_channel = await loop.run_in_executor(None, mp3_to_audio_frame, audio_path)

    try:
        yield rtc.AudioFrame(
            data=raw,
            sample_rate=sample_rate,
            num_channels=num_channels,
            samples_per_channel=samples_per_channel
        )
        await asyncio.sleep(1)
    except asyncio.CancelledError:
        return

async def play_audio_once(session: AgentSession,audio_path):
    path = os.path.join(main_audio_path, audio_path) 
    
    handle_speech = session.say(audio=single_shot_audio(path), text="¡Hola!, soy Sophia. ¿En qué puedo ayudarte?", allow_interruptions=True)
    


In RTC SESSION...
#after start the avatar and session
await play_audio_once(session,saludo_path)

Thank for your help in advance, I hope we can get a solution!

CWilson · February 9, 2026, 6:20pm

Maybe it is getting stuck because of a typo there?

Antonio_Palomino · February 11, 2026, 4:21pm

Thanks I figured out now and it was a weird thing, All I need was to declare this line:

audio = AudioSegment.from_mp3(path)
To this way:

audio: AudioSegment = AudioSegment.from_mp3(path)

Topic		Replies	Views
Want to play a music while executing a tool it should play parallelly with the execution of the api clal Getting Started agent-development	4	60	April 2, 2026
Agent Session Say Done Callback Server SDKs agent-development , python , tts , elevenlabs	8	83	March 9, 2026
Difference between context.wait_for_playout and speechhandle.wait_for_playout Getting Started	6	127	February 17, 2026
Issue with programmatically toggle STT/TTS on off Agents agent-development , python , stt , tts	6	90	February 24, 2026
Agent speaking audio_text tokens out loud Agents llm , openai	4	65	March 6, 2026

Agent Session Say method with local audio files

Related topics