Good Morning Everyone,
I’m Elijah, and I want to share an architectural pattern I recently implemented for orchestrating multiple AI agents within a single LiveKit room. When you move from a single agent to a crew of specialized agents in the same room, you quickly hit a synchronization wall: agents talking over each other, redundant processing of the same audio track, and no shared sense of “who is doing what right now.” Here’s how I’m using LiveKit’s room primitives and Data Channels to build a shared “brain state” so agents can stay aware of each other’s actions in real time.
Screenshot 1: three agents in the same LiveKit room, each subscribed to the user’s audio track, coordinated via Data Channel state.
Screenshot 2: example JSON payload showing agent_id, status, and action_lock used for synchronization.
Live view of a three‑agent round‑robin session in a single LiveKit room, where each agent takes a turn responding while coordination happens behind the scenes via Data Channel state messages.
The Architecture: How It’s Wired
1. The Room Topology
Instead of spinning up separate rooms, all agents connect to the same primary Room as independent participants.
- Every agent subscribes to the user’s audio track.
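For concreteness, here’s a minimal sketch of what that connection side can look like with the LiveKit Python SDK (livekit.rtc). The process_audio helper is a hypothetical stand‑in for whatever STT/intent pipeline each agent runs:

```python
import asyncio
from livekit import rtc


async def process_audio(stream: rtc.AudioStream):
    # Hypothetical stand-in: each agent runs its own STT/intent
    # pipeline over the shared user audio frames.
    async for frame_event in stream:
        ...


async def join_as_agent(url: str, token: str) -> rtc.Room:
    # Each agent joins the same primary room as an ordinary participant.
    room = rtc.Room()

    @room.on("track_subscribed")
    def on_track_subscribed(
        track: rtc.Track,
        publication: rtc.RemoteTrackPublication,
        participant: rtc.RemoteParticipant,
    ):
        # Every agent subscribes to the user's audio track and feeds
        # it into its own processing pipeline.
        if track.kind == rtc.TrackKind.KIND_AUDIO:
            asyncio.create_task(process_audio(rtc.AudioStream(track)))

    await room.connect(url, token)
    return room
```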
2. The Awareness Layer (LiveKit Data Channels)
This is the core of the synchronization pattern. Relying on transcript completion is too slow for real-time orchestration. Instead, I use LiveKit Data Channels as a high-speed, decentralized event bus.
- When Agent A detects an intent it needs to act on, it instantly broadcasts a JSON state payload via the Data Channel.
- Agent B (and any other agent in the room) receives this event in milliseconds.
- The payload includes fields like agent_id, status: “processing” | “idle”, and an action_lock boolean. When another agent sees action_lock: true for a given task, it temporarily mutes or pauses its own response pipeline until it receives a status: “resolved” event.
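The broadcast side is thin on top of publish_data. A minimal sketch, assuming the payload shape above; the “agent-state” topic name and the broadcast_state helper are my own conventions, not anything LiveKit prescribes:

```python
import json
from livekit import rtc


async def broadcast_state(
    room: rtc.Room, agent_id: str, status: str, action_lock: bool
) -> None:
    # Serialize this agent's execution state for every other participant.
    payload = json.dumps({
        "agent_id": agent_id,
        "status": status,          # "processing" | "idle" | "resolved"
        "action_lock": action_lock,
    })
    # Reliable delivery (ordered + retransmitted) suits state events;
    # the topic lets receivers filter coordination traffic cheaply.
    await room.local_participant.publish_data(
        payload, reliable=True, topic="agent-state"
    )
```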
3. State Reconciliation
By using Data Channels, the agents aren’t just reacting to audio; they are reacting to each other’s internal execution states. If an agent is executing a background task, the other agents know to either hold the user’s attention or stand by.
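And the receiving side of that reconciliation, assuming the room object from the connection sketch earlier; active_locks and can_respond are illustrative names for however you gate your own pipeline:

```python
import json
from livekit import rtc

# Which agents currently hold an action lock, keyed by agent_id.
active_locks: dict[str, bool] = {}


@room.on("data_received")
def on_data_received(packet: rtc.DataPacket):
    if packet.topic != "agent-state":
        return  # not coordination traffic
    state = json.loads(packet.data)
    if state.get("action_lock"):
        # Another agent claimed the task: pause our response pipeline.
        active_locks[state["agent_id"]] = True
    elif state.get("status") == "resolved":
        # Lock released: safe to speak or act again.
        active_locks.pop(state["agent_id"], None)


def can_respond() -> bool:
    # Gate the agent's TTS/action pipeline on the shared lock state.
    return not any(active_locks.values())
```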
The Takeaway
Using Data Channels for inter-agent communication completely eliminated audio collisions and allowed for seamless “handoffs” between specialized models. LiveKit handles the routing and latency so well that the agents feel like a single, cohesive intelligence.
I’m currently exploring better ways to share conversation context across agents so they don’t each burn tokens re‑parsing the same user prompt, while still keeping per‑agent prompts specialized.
Has anyone else experimented with multi-agent routing using this kind of pub/sub pattern over Data Channels? I’d love to see how others are handling the state locks or if there are cleaner ways to manage the track subscriptions dynamically!


