Audio Isolation/Volume control in LiveKit SIP Rooms

Hello,
We’re trying to prevent two SIP participants in the same room from hearing each other directly, while still letting the agent hear both.

  • Inbound staff leg (SIP → LiveKit)

    • Caller dials a Twilio number.

      • Twilio either:

        • uses a SIP trunk directly into LiveKit’s SIP dispatch rule, or

        • uses TwiML sip:lexi-inbound@… to reach LiveKit.

    • Result: staff is a SIP participant in a LiveKit room.

  • Outbound User leg (LiveKit → SIP → PSTN)

    • Our agent code calls CreateSIPParticipantRequest in outreach_handler.py to dial the patient via the same Twilio trunk.

    • Result: user is a second SIP participant in the same LiveKit room.

  • Agent

    • A livekit.agents AgentSession with a custom Interpreter (basically the standard voice quickstart + extra outreach logic) is attached to that room.

    • Agent subscribes to both SIP participants for STT and publishes TTS back into the room.

So the room ends up with 3 participants:
Caller (staff, SIP), User (patient, SIP), agent (WebRTC/voice agent)

We attempted to use RoomService.UpdateSubscriptions to prevent each SIP participant from subscribing to the other SIP participant’s audio tracks.

from livekit import api, rtc


async def _disable_direct_staff_user_audio(room: rtc.Room) -> None:
    # Ensure remote human participants in this room are NOT subscribed
    # to each other's audio tracks (staff ↔ patient).
    remote_parts = list(room.remote_participants.values())
    if len(remote_parts) < 2:
        return

    async with api.LiveKitAPI() as lkapi:
        for subscriber in remote_parts:
            block_sids: list[str] = []
            for other in remote_parts:
                if other.identity == subscriber.identity:
                    continue
                for pub in other.track_publications.values():
                    if pub.kind == rtc.TrackKind.KIND_AUDIO:
                        block_sids.append(pub.sid)

            if not block_sids:
                continue

            await lkapi.room.update_subscriptions(
                api.UpdateSubscriptionsRequest(
                    room=room.name,
                    identity=subscriber.identity,
                    track_sids=block_sids,
                    subscribe=False,
                )
            )
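
For anyone trying to reproduce this, the selection logic in the function above can be pulled out into a pure function and checked without a live room. This is only a sketch: ParticipantInfo and plan_unsubscriptions are hypothetical names used for illustration and are not part of the LiveKit SDK. It assumes each participant is described by its identity and the SIDs of its published audio tracks.

```python
from dataclasses import dataclass, field


@dataclass
class ParticipantInfo:
    """Hypothetical stand-in for a remote participant (not a LiveKit type)."""

    identity: str
    audio_track_sids: list[str] = field(default_factory=list)


def plan_unsubscriptions(participants: list[ParticipantInfo]) -> dict[str, list[str]]:
    """Map each subscriber identity to the audio track SIDs of all other participants.

    Subscribers with nothing to block are left out, mirroring the
    `if not block_sids: continue` branch in the original function.
    """
    plan: dict[str, list[str]] = {}
    for subscriber in participants:
        blocked = [
            sid
            for other in participants
            if other.identity != subscriber.identity
            for sid in other.audio_track_sids
        ]
        if blocked:
            plan[subscriber.identity] = blocked
    return plan
```

One thing this makes visible: if the function runs before one side has published its audio track, the plan only contains an entry for the other side. For example, with staff already publishing and the patient not yet publishing, only the patient gets an unsubscribe request, and nothing is ever requested for staff unless the function is run again later. Whether that timing explains the one-directional result here is an open question, not a confirmed diagnosis.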

We’ve verified on the Twilio side:

  • No TwiML conferences.

  • Call logs show separate legs:

    • Incoming PSTN → Twilio → LiveKit SIP.

    • LiveKit SIP → Twilio → PSTN (patient).

  • So the only place left to mix them appears to be inside LiveKit.

Issue: Caller does not hear User, which is what we want, but User can still hear Caller. Is there a way to block Caller's audio for User, or at least lower its volume?

Interesting use case. I fed the question into Devin and it gave a plausible reason, though I don’t agree with its alternative approaches: using separate rooms feels like a very large overhead, and muting the tracks would prevent the agent from hearing the caller. I think the correct approach would be to define custom track permissions for your participants (see "Camera & microphone" in the LiveKit documentation), but others in this forum may have a better suggestion.

Thanks @darryncampbell, I will try the approach you suggested.

Hi @darryncampbell, I tried defining custom track permissions, but User can still hear Caller at full volume.