Clarity on Avatar plugin flows - text/audio streaming

AI_Arjun · April 27, 2026, 7:10am

I want a deeper understanding of the internal flow when a session includes an avatar.

My understanding, with the points I'd like clarified:

The avatar is a third participant in the conversation: user, agent, and avatar.
User speaks → agent receives the input and generates a response → the TTS audio is sent to the avatar → the avatar syncs with the audio and publishes audio and video to the room. [Q1. The avatar publishes both the audio and the video, right? It's not that the agent publishes the audio and the avatar only the video?]
How does the text stream flow work here? My assumption is that whatever the LLM node generates is sent to the room's text stream without any intermediaries, so the frontend can listen for it and show it on screen. [Q2. Does this behavior change when an avatar is present? Does the avatar send out text streams itself? I'd like to know everything that introducing an avatar changes.]
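To make the assumptions above concrete, here is a minimal toy sketch in plain Python of the flow as I currently picture it. It does not use the LiveKit SDK; `Room`, `publish`, and `run_turn` are hypothetical names made up for illustration, and the behavior it encodes is exactly what Q1 and Q2 ask about, so it is unverified.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """Toy stand-in for a room; NOT a LiveKit class."""
    # Each entry is (publisher, kind), e.g. ("avatar", "audio").
    published: list = field(default_factory=list)

    def publish(self, publisher: str, kind: str) -> None:
        self.published.append((publisher, kind))


def run_turn(room: Room, avatar_present: bool) -> None:
    """Model one agent response under the assumptions in this post."""
    # Assumption (Q2): the agent sends LLM text to the room's text stream
    # directly, whether or not an avatar is present.
    room.publish("agent", "text")
    if avatar_present:
        # Assumption (Q1): TTS audio goes to the avatar, which then
        # publishes both the audio and the video to the room.
        room.publish("avatar", "audio")
        room.publish("avatar", "video")
    else:
        # Without an avatar, the agent publishes its own audio.
        room.publish("agent", "audio")


room = Room()
run_turn(room, avatar_present=True)
print(room.published)
```

If either assumption is wrong, the publisher recorded on the corresponding line is the part that would change.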

Please correct me or add more context. I'm doing this deep dive because I've been seeing the following issue: text streams get cut off in the frontend while the audio is still audible. The issue is most prominent when using the Tavus avatar. I'm also exploring the possibility of initializing avatars in parallel, so I need a deeper understanding of the audio plumbing.

Lindskog_Work · April 27, 2026, 9:35am

I found this very helpful: Virtual avatar models overview | LiveKit Documentation.

I have not verified this, but I believe the agent stays in the room and becomes hidden.
