One-Way Audio — agent audio not reaching caller when SIP participant joins room before agent

We hit a one-way audio issue on an inbound SIP call where the agent’s greeting was generated and published to the room but never reached the caller’s phone.

Same caller made 4 calls to the same trunk within 15 minutes. 3 worked fine, 1 had this issue.
The only difference found in these 4 calls is that on the broken call, the SIP participant was already in the room when the agent worker joined. On all 3 working calls, the agent joined first and the SIP participant connected after.

So our hypothesis is that when the SIP participant is already in the room and the agent joins and publishes a new audio track, the room-level subscription completes (unblocking capture_frame), but the SIP bridge’s RTP forwarding to the phone isn’t ready yet — so the initial audio is lost.

Has anyone run into a similar issue, and what would be a good fix for it?

So only the agent's greeting is missed, and after that the call proceeds normally?

You could call wait_for_participant() as shown in the "Workflow & setup" example in the LiveKit documentation, then check whether the participant has already published an audio track (participant.track_publications.values()) to test your idea, and listen for a track_published event if the audio track has not been published yet.
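Here is a rough, library-agnostic sketch of that check-then-wait pattern using plain asyncio. The names `wait_for_audio_track`, `has_audio_track`, and `track_published` are hypothetical stand-ins, not LiveKit APIs: in a real agent, `has_audio_track` would inspect participant.track_publications, and the event would be set from a track_published handler.

```python
import asyncio
from typing import Callable


async def wait_for_audio_track(
    has_audio_track: Callable[[], bool],
    track_published: asyncio.Event,
    timeout: float = 5.0,
) -> bool:
    """Return True once an audio track is present, False if the timeout expires.

    has_audio_track: hypothetical callback that inspects the participant's
        current track publications.
    track_published: hypothetical event that a track_published handler sets.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while not has_audio_track():
        remaining = deadline - loop.time()
        if remaining <= 0:
            return False
        # No await happens between the check above and this clear,
        # so a publication cannot slip through unnoticed.
        track_published.clear()
        try:
            await asyncio.wait_for(track_published.wait(), timeout=remaining)
        except asyncio.TimeoutError:
            return has_audio_track()
    return True
```

Holding the greeting until this returns True would only test the hypothesis. It does not prove the SIP bridge is ready to forward RTP.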

I am not sure whether your issue is originating on the Agent side or a SIP issue. For the agent issue, look in your agent logs.

If you suspect this is a SIP issue, I am working on a debugging series about this very thing right now, and I wish I could just share it because I talk about how to debug this exact issue.

Here is an excerpt from that section of the series. This is missing all the background, but I hope it is good enough until I can wrap up the full series release. I tried to scrub all the links that are not live yet, but please forgive if I missed some links that currently go nowhere.


Debugging Media Issues

One-way audio, no audio, codec mismatches – how to diagnose problems that happen after the call connects.

Diagnosing audio problems that happen after the SIP call connects: no audio, one-way audio, and quality issues.


The Call Connected, But Something’s Wrong

This is one of the most frustrating categories of SIP problems. The signaling worked perfectly – INVITE, 200 OK, ACK, everything looks fine. But when the caller speaks, nobody hears them. Or the agent talks, but the caller hears silence. Or the audio sounds like a garbled mess.

These are media issues that live in the RTP and SDP layers, not in SIP signaling. The SIP messages said, “call connected.” The problem is that the audio isn’t flowing correctly.

The Three Most Common Media Problems

The three most common media problems and their typical root causes.

flowchart TD
  Problem["Media Problem"] --> NoAudio["No Audio<br/>(silence both directions)"]
  Problem --> OneWay["One-Way Audio<br/>(only one side hears)"]
  Problem --> BadQuality["Bad Audio Quality<br/>(choppy, robotic, garbled)"]

  NoAudio --> NA1["SDP IP is 0.0.0.0"]
  NoAudio --> NA2["Firewall blocking RTP"]
  NoAudio --> NA3["NAT not traversed"]
  NoAudio --> NA4["Media port is 0"]

  OneWay --> OW1["NAT issue<br/>(one direction blocked)"]
  OneWay --> OW2["Asymmetric routing"]
  OneWay --> OW3["SDP direction mismatch"]

  BadQuality --> BQ1["Packet loss > 1%"]
  BadQuality --> BQ2["High jitter > 30ms"]
  BadQuality --> BQ3["Codec mismatch"]
  BadQuality --> BQ4["Clock rate mismatch"]

  style Problem fill:#ffd8a8,stroke:#f59e0b,color:#1e1e1e
  style NoAudio fill:#ffc9c9,stroke:#ef4444,color:#1e1e1e
  style OneWay fill:#ffc9c9,stroke:#ef4444,color:#1e1e1e
  style BadQuality fill:#ffc9c9,stroke:#ef4444,color:#1e1e1e
  style NA1 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style NA2 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style NA3 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style NA4 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style OW1 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style OW2 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style OW3 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style BQ1 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style BQ2 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style BQ3 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e
  style BQ4 fill:#e5dbff,stroke:#8b5cf6,color:#1e1e1e

Let’s tackle each one with the specific Wireshark techniques to diagnose them.

Step 1: Always Start with the SDP

Before you even look at RTP packets, check the SDP. The SDP offer (in the INVITE) and SDP answer (in the 200 OK) contain everything you need to know about how media was supposed to flow.

Use this Wireshark filter to see only the SDP-carrying messages:

sip.Method == "INVITE" || (sip.Status-Code == 200 && sip.CSeq.method == "INVITE")

For each side, extract these critical values:

| SDP Field | What to Check | What to Expect |
| --- | --- | --- |
| c= (Connection) | Media IP address | Should be a routable IP, NOT 0.0.0.0 |
| m= (Media) | Port and codec list | Port should be non-zero, and codecs should overlap |
| a=rtpmap: | Codec details | Both sides must agree on at least one codec |
| a=sendrecv / a=recvonly / a=sendonly | Direction | Must be compatible between the offer and the answer |

The SDP Comparison Table

Build this table for every media issue investigation:

| Property | SDP Offer (INVITE) | SDP Answer (200 OK) | Match? |
| --- | --- | --- | --- |
| Media IP | 198.51.100.10 | 203.0.113.20 | Both routable? |
| Media Port | 10000 | 20000 | Both non-zero? |
| Codecs offered | PCMU, PCMA, telephone-event | PCMU, telephone-event | Overlap exists? |
| Direction | sendrecv | sendrecv | Compatible? |

If anything in this table looks wrong, you’ve likely found your problem before even looking at RTP.
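If you would rather not build the table by hand, a small script can pull the same values out of SDP bodies copied from Wireshark. This is a minimal sketch, not a full SDP parser: `SdpSummary`, `summarize_sdp`, and `compare_sdp` are names made up for this example, it assumes a single audio m= line, and it only sees codecs that have an a=rtpmap line.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class SdpSummary:
    ip: Optional[str] = None
    port: Optional[int] = None
    codecs: Set[str] = field(default_factory=set)
    direction: str = "sendrecv"  # SDP default when no direction attribute is present


def summarize_sdp(sdp: str) -> SdpSummary:
    """Extract media IP, audio port, codec names, and direction from an SDP body."""
    s = SdpSummary()
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            s.ip = line.split()[-1]                      # c=IN IP4 198.51.100.10
        elif line.startswith("m=audio"):
            s.port = int(line.split()[1].split("/")[0])  # m=audio 10000 RTP/AVP 0 8
        elif line.startswith("a=rtpmap:"):
            s.codecs.add(line.split()[1].split("/")[0])  # a=rtpmap:0 PCMU/8000
        elif line in ("a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive"):
            s.direction = line[2:]
    return s


def compare_sdp(offer: SdpSummary, answer: SdpSummary) -> Dict[str, bool]:
    """Answer the 'Match?' column for IP, port, and codec overlap."""
    sides = (offer, answer)
    return {
        "media_ip_set": all(x.ip not in (None, "0.0.0.0") for x in sides),
        "port_nonzero": all(bool(x.port) for x in sides),
        # telephone-event carries DTMF, so it does not count as an audio codec.
        "codec_overlap": bool((offer.codecs & answer.codecs) - {"telephone-event"}),
    }
```

Any False value in the result corresponds to a row in the table that deserves a closer look.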

Diagnosing No Audio (Silence Both Directions)

Check 1: Is RTP Actually Flowing?

rtp

If you see zero RTP packets after the ACK, the media never started. This is almost always a network-level problem: firewall, NAT, or routing.

Check 2: Is the SDP Media IP Reachable?

Look at the c= line in both SDPs. If either one shows 0.0.0.0, that’s the problem:

c=IN IP4 0.0.0.0    ← This means "no media" (hold or error)

An IP of 0.0.0.0 in the SDP means the sender is explicitly saying “don’t send me audio.” This sometimes happens when:

  • A call is being put on hold intentionally
  • A misconfigured server defaults to 0.0.0.0 instead of its real IP
  • NAT traversal logic failed to determine the public IP

Check 3: Is the Media Port Zero?

m=audio 0 RTP/AVP 0    ← Port 0 means "stream rejected"

A port of 0 in the m= line means the media stream was explicitly rejected. The codec negotiation failed, or the answerer chose not to accept the audio stream.

Check 4: Firewall Blocking RTP?

RTP uses UDP on dynamic ports (often somewhere in the 10000-60000 range, depending on how each endpoint is configured). If the SDP shows valid IPs and ports but no RTP flows, a firewall is likely blocking UDP traffic on those ports.

Check with tshark:

# Show what ports the SDP promised
tshark -r capture.pcap -Y "sdp" -T fields \
  -e sdp.media.port

# Show what ports RTP actually uses
tshark -r capture.pcap -Y "rtp" -T fields \
  -e udp.srcport -e udp.dstport | sort -u

If the SDP says port 20000 but no RTP packets appear on port 20000, the packets are being dropped somewhere between the two endpoints.
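To compare the two outputs without eyeballing them, you could feed both into a short script. This sketch assumes you saved the text printed by the two tshark commands above. `ports_without_rtp` is a made-up helper name, and it simply treats every number in each output as a port.

```python
import re
from typing import Set


def ports_without_rtp(sdp_ports_output: str, rtp_ports_output: str) -> Set[int]:
    """Return the ports promised in SDP that never show up in any RTP packet."""
    # Port 0 means "stream rejected", so no RTP is expected on it.
    promised = {int(p) for p in re.findall(r"\d+", sdp_ports_output)} - {0}
    # Each RTP line holds a source and a destination port.
    seen = {int(p) for p in re.findall(r"\d+", rtp_ports_output)}
    return promised - seen
```

An empty result means every promised port carried at least some RTP. If a NAT device between the capture point and the endpoint rewrites ports, a promised port can be reported as missing even though audio is flowing.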

Diagnosing One-Way Audio

One-way audio is the classic NAT problem. The call connects, and one person can hear the other, but not vice versa.

Check 1: RTP Stream Count

Go to Telephony > RTP > RTP Streams. You should see two streams – one in each direction:

| Expected | What You See | Problem |
| --- | --- | --- |
| 2 streams (both directions) | 2 streams | Healthy |
| 2 streams | 1 stream only | One-way audio – one direction is blocked |
| 2 streams | 0 streams | No audio at all |

If you only see one stream, note its direction. The missing stream tells you which path is blocked.
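The same check can be scripted from (source IP, destination IP) pairs, for example pairs parsed from tshark's ip.src and ip.dst fields. This is a minimal sketch: `missing_directions` is a name invented for this example, and `a` and `b` are the two media IPs as seen at the capture point.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def missing_directions(packets: Iterable[Tuple[str, str]], a: str, b: str) -> List[str]:
    """List the directions between a and b that carried no RTP packets."""
    counts = Counter(packets)
    missing = []
    if counts[(a, b)] == 0:
        missing.append(f"{a} -> {b}")
    if counts[(b, a)] == 0:
        missing.append(f"{b} -> {a}")
    return missing
```

An empty list matches the "Healthy" row, one entry means one-way audio, and two entries mean no audio at all.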

Check 2: SDP Direction Attributes

a=sendrecv    ← Both sides send and receive (normal)
a=sendonly    ← This side only sends (it will not accept incoming audio)
a=recvonly    ← This side only receives (it will not send any audio)
a=inactive    ← No media in either direction

If the SDP answer says recvonly instead of sendrecv, LiveKit will receive audio but won’t send any back. This is a misconfiguration in the SDP negotiation.
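The offer/answer rules for these attributes (RFC 3264) are small enough to encode directly. The sketch below uses made-up names (`VALID_ANSWERS`, `audio_flow`) and reports whether an answer is allowed for a given offer and which way audio can flow.

```python
from typing import Dict, Set

# Direction attributes an answerer may return for each offered direction.
VALID_ANSWERS: Dict[str, Set[str]] = {
    "sendrecv": {"sendrecv", "sendonly", "recvonly", "inactive"},
    "sendonly": {"recvonly", "inactive"},
    "recvonly": {"sendonly", "inactive"},
    "inactive": {"inactive"},
}

SENDS = {"sendrecv", "sendonly"}
RECEIVES = {"sendrecv", "recvonly"}


def audio_flow(offer_dir: str, answer_dir: str) -> Dict[str, bool]:
    """Describe what an offer/answer direction pair allows."""
    return {
        "valid_answer": answer_dir in VALID_ANSWERS[offer_dir],
        "offerer_to_answerer": offer_dir in SENDS and answer_dir in RECEIVES,
        "answerer_to_offerer": answer_dir in SENDS and offer_dir in RECEIVES,
    }
```

For example, `audio_flow("sendrecv", "recvonly")` is a valid answer, but only the offerer-to-answerer path is open. That is the one-way case described above.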

Check 3: NAT Causing Asymmetric RTP

NAT causes one-way audio when the SDP advertises a private IP that the other side cannot reach.

The most common cause of one-way audio is NAT. Here’s the scenario:

sequenceDiagram
  participant Caller as Caller<br/>Private: 192.168.1.50<br/>Public: 198.51.100.10
  participant LK as LiveKit<br/>203.0.113.20

  Note over Caller: SDP says c=192.168.1.50<br/>(private IP!)

  Caller->>LK: INVITE (SDP: c=192.168.1.50)
  LK-->>Caller: 200 OK (SDP: c=203.0.113.20)
  Caller->>LK: ACK

  Caller->>LK: RTP (from 198.51.100.10:10000)
  Note over LK: RTP arrives from 198.51.100.10<br/>but SDP said 192.168.1.50
  LK--xCaller: RTP sent to 192.168.1.50:10000
  Note over LK: 192.168.1.50 is unreachable!<br/>One-way audio!

LiveKit receives the caller’s audio (because it comes from the public IP via NAT), but when LiveKit tries to send audio back to the private IP listed in the SDP, the packets can’t reach it.

How to spot this in Wireshark:

  1. Look at the c= line in the INVITE’s SDP – is it a private IP (192.168.x.x, 10.x.x.x, 172.16-31.x.x)?
  2. Look at the actual source IP of the RTP packets from the caller – is it different from the SDP c= line?
  3. If yes, that’s the NAT problem.
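Those three steps boil down to a check you can run on two values copied out of the capture. This is a minimal sketch with made-up helper names (`is_rfc1918`, `nat_suspected`). It tests the three private ranges listed above explicitly rather than using Python's broader `is_private` property, which covers more ranges than these three.

```python
import ipaddress

# The three private IPv4 ranges named in step 1.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip: str) -> bool:
    """True if the address falls inside one of the private IPv4 ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918_NETWORKS)


def nat_suspected(sdp_ip: str, rtp_source_ip: str) -> bool:
    """True if the SDP advertises a private IP while RTP arrives from another address."""
    return is_rfc1918(sdp_ip) and sdp_ip != rtp_source_ip
```

With the values from the diagram, `nat_suspected("192.168.1.50", "198.51.100.10")` returns True.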

The fix: The SIP provider needs to use proper NAT traversal (STUN/TURN/ICE) so the SDP contains the public IP, not the private one.

Diagnosing Bad Audio Quality

The call connects and audio flows in both directions, but it sounds terrible. Time to analyze the RTP streams.

RTP Stream Analysis

Go to Telephony > RTP > RTP Streams, select a stream, and click Analyze.

| Metric | Healthy | Degraded | Broken |
| --- | --- | --- | --- |
| Packet Loss | < 0.5% | 1-3% | > 5% |
| Max Jitter | < 5ms | 5-30ms | > 30ms |
| Mean Jitter | < 2ms | 2-10ms | > 10ms |
| Delta (packet spacing) | ~20ms consistent | Variable | Highly erratic |

Packet Loss

Filter for RTP and look at sequence numbers:

rtp.ssrc == 0x12345678

(Replace with the actual SSRC from the RTP stream list.)

In the packet list, check the sequence numbers. They should increment by 1 for every packet. Gaps mean lost packets, and the RTP Stream Analysis window flags them as sequence errors.

Effect on audio: 1-2% loss causes occasional clicks or dropouts. 5%+ makes speech unintelligible.
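If you export the sequence numbers for one SSRC (for example with tshark's rtp.seq field), the loss percentage can be estimated in a few lines. This is a rough sketch under stated assumptions: packets arrive in order with no duplicates, and `packet_loss_percent` is a name invented for this example. It handles the 16-bit sequence number wrapping from 65535 back to 0.

```python
from typing import List


def packet_loss_percent(seqs: List[int]) -> float:
    """Estimate loss from in-order RTP sequence numbers of a single stream."""
    if len(seqs) < 2:
        return 0.0
    expected = 1  # the first packet
    for prev, cur in zip(seqs, seqs[1:]):
        # The modulo keeps the step correct across the 65535 -> 0 wrap.
        expected += (cur - prev) % 65536
    lost = expected - len(seqs)
    return 100.0 * lost / expected
```

For instance, the sequence 1, 2, 3, 5 covers five expected packets with one missing, which works out to 20% loss.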

Jitter

Jitter is the variation in packet arrival timing. RTP packets should arrive every 20ms like clockwork. If they arrive at 20ms, 25ms, 15ms, 35ms, 10ms – that’s high jitter.

In the RTP stream analysis window, the jitter graph shows this visually. Spikes in jitter correspond to moments when the audio sounded bad.

Effect on audio: High jitter causes the jitter buffer to underrun or overflow, resulting in choppy audio, gaps, or robotic-sounding speech.
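For reference, here is a minimal sketch of the running jitter estimate defined in RFC 3550: each packet's deviation from its expected spacing is folded into the running value with a 1/16 gain. The function name and the millisecond units are choices made for this example, it assumes an 8000 Hz clock unless told otherwise, and it ignores RTP timestamp wraparound.

```python
from typing import List


def interarrival_jitter_ms(
    arrivals_ms: List[float],
    rtp_timestamps: List[int],
    clock_rate: int = 8000,
) -> float:
    """Running interarrival jitter (RFC 3550 style), expressed in milliseconds."""
    jitter = 0.0
    for i in range(1, len(arrivals_ms)):
        # Spacing the sender intended, from the RTP timestamps.
        sent_delta_ms = (rtp_timestamps[i] - rtp_timestamps[i - 1]) * 1000.0 / clock_rate
        # Spacing actually observed at the receiver.
        recv_delta_ms = arrivals_ms[i] - arrivals_ms[i - 1]
        deviation = abs(recv_delta_ms - sent_delta_ms)
        jitter += (deviation - jitter) / 16.0
    return jitter
```

Packets that arrive exactly 20ms apart yield a jitter of 0. The uneven spacing from the example above (20ms, 25ms, 15ms, 35ms, 10ms) pushes the estimate upward with every irregular packet.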

Codec Mismatch

If both sides agreed on a codec in the SDP but the RTP packets use a different payload type, you have a codec mismatch.

Check the RTP payload type:

tshark -r capture.pcap -Y "rtp" -T fields \
  -e rtp.p_type | sort -u

Compare the payload types in the RTP packets against what the SDP agreed on:

| SDP Agreement | RTP Payload Type | Status |
| --- | --- | --- |
| PCMU (0) | 0 | Correct |
| PCMU (0) | 8 | Mismatch – sending PCMA instead |
| PCMU (0) | 111 | Unknown payload type – possible dynamic codec issue |

Effect on audio: The receiving side tries to decode the audio with the wrong codec. The result is loud static noise or garbled sound, often described as a “fax machine” sound.
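The comparison in the table can be automated once you have the payload types from tshark. This is a minimal sketch: `classify_payload_types` is a made-up helper, `negotiated` is a mapping you fill in from the SDP answer, and the static table lists only a few well-known assignments from RFC 3551.

```python
from typing import Dict, Iterable

# A few static payload type assignments (RFC 3551).
STATIC_PAYLOAD_TYPES = {0: "PCMU", 8: "PCMA", 9: "G722", 18: "G729"}


def classify_payload_types(negotiated: Dict[int, str], observed: Iterable[int]) -> Dict[int, str]:
    """Label each observed RTP payload type against what the SDP agreed on."""
    result = {}
    for pt in sorted(set(observed)):
        if pt in negotiated:
            result[pt] = f"correct ({negotiated[pt]})"
        elif pt in STATIC_PAYLOAD_TYPES:
            result[pt] = f"mismatch (sending {STATIC_PAYLOAD_TYPES[pt]} instead)"
        elif 96 <= pt <= 127:
            result[pt] = "unknown dynamic payload type"
        else:
            result[pt] = "unknown payload type"
    return result
```

Anything other than "correct" in the result points to one of the mismatch rows in the table.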

Quick Diagnostic Commands

These tshark one-liners help you quickly assess media health from the command line:

# Count RTP packets per direction
tshark -r capture.pcap -Y "rtp" -T fields \
  -e ip.src -e ip.dst | sort | uniq -c

# Check for RTP payload type consistency
tshark -r capture.pcap -Y "rtp" -T fields \
  -e rtp.p_type | sort | uniq -c

# Check RTCP statistics (if available)
tshark -r capture.pcap -Y "rtcp" -T fields \
  -e rtcp.sender.packetcount \
  -e rtcp.roundtrip-delay \
  -e rtcp.ssrc.jitter

# Extract SDP media lines from all SIP messages
tshark -r capture.pcap -Y "sdp" -T fields \
  -e sdp.media \
  -e sdp.connection_info.address

# List sequence numbers and inter-packet deltas for the first 50 RTP packets
# (scan the output for gaps; add an rtp.ssrc filter to isolate one stream)
tshark -r capture.pcap -Y "rtp" -T fields \
  -e rtp.seq -e frame.time_delta_displayed | head -50

The Media Debugging Checklist

| # | Check | Wireshark Location | What You’re Looking For |
| --- | --- | --- | --- |
| 1 | SDP IPs routable? | INVITE/200 OK body > c= line | Not 0.0.0.0, not private IP |
| 2 | SDP ports non-zero? | INVITE/200 OK body > m= line | Port > 0 |
| 3 | Codecs agree? | Compare offer vs. answer a=rtpmap | At least one common codec |
| 4 | Direction compatible? | a=sendrecv / a=recvonly etc. | Both sides can send and receive |
| 5 | RTP flows both ways? | Telephony > RTP > RTP Streams | Two streams, one per direction |
| 6 | Packet loss low? | RTP Stream Analysis | < 1% |
| 7 | Jitter low? | RTP Stream Analysis | < 20ms |
| 8 | Payload type matches SDP? | RTP packet p_type field | Matches agreed codec number |

Key Takeaways

| Problem | Primary Cause | Where to Look |
| --- | --- | --- |
| No audio | SDP has 0.0.0.0 IP or port 0, or a firewall blocking UDP | SDP in INVITE/200 OK, then RTP filter |
| One-way audio | NAT – SDP contains a private IP that the other side can’t reach | c= line in SDP vs. actual RTP source IP |
| Bad audio quality | Packet loss, jitter, or network congestion | Telephony > RTP > RTP Streams > Analyze |
| Garbled audio | Codec mismatch – RTP payload doesn’t match SDP agreement | RTP p_type vs. SDP a=rtpmap |
