Hi,
I’m using the **LiveKit ESP32 SDK (v0.3.6)** `voice_agent` example on an **M5Stack CoreS3** with the **tempotian/codec_board** component. The board has:
- **Playback:** AW88298 (DAC) on shared I2S TX
- **Capture:** ES7210 (ADC) on shared I2S RX
- **Same I2S port** for both (full-duplex, one I2S_NUM_0 with tx_handle + rx_handle)
**Problem:** Microphone capture works (the Room connects, we send audio, and the agent responds). Subscribed audio is received, Opus is decoded, and PCM is written to the codec (we see `I2S_RENDER: write #N size=1280 max_abs=…` with normal levels). **But there is no sound from the speaker.**
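For reference, this is roughly how we sanity-check the render writes. It assumes `max_abs` means the peak absolute sample value in the buffer and that the render path runs at the settings shown in our log (2 channels, 16 kHz, 16-bit). The helper names `pcm_max_abs` and `pcm_write_ms` are hypothetical and not part of the SDK; this is only a sketch of the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Peak absolute value of a 16-bit PCM buffer, in the range 0..32768. */
static int32_t pcm_max_abs(const int16_t *samples, size_t count)
{
    int32_t peak = 0;
    for (size_t i = 0; i < count; i++) {
        int32_t v = samples[i];      /* widen first so -32768 can be negated */
        if (v < 0) {
            v = -v;
        }
        if (v > peak) {
            peak = v;
        }
    }
    return peak;
}

/* Duration in milliseconds covered by one write of `bytes` bytes. */
static uint32_t pcm_write_ms(size_t bytes, uint32_t sample_rate,
                             uint8_t channels, uint8_t bits)
{
    assert(sample_rate > 0 && channels > 0 && bits >= 8);
    size_t frame_bytes = (size_t)channels * (bits / 8u); /* one sample per channel */
    size_t frames = bytes / frame_bytes;
    return (uint32_t)(frames * 1000u / sample_rate);
}
```

With these settings a 1280-byte write is 320 frames, i.e. `pcm_write_ms(1280, 16000, 2, 16)` gives 20 ms per write, so the data rate into I2S looks plausible.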
**What we’ve tried:**
1. **Init order:** We opened the playback device early in `media_init` (before capture). Capture then opens when the Room connects and reconfigures the shared I2S for input (16 kHz, 4-channel TDM). When the first RTP frame arrived, `esp_codec_dev_open(play_handle, &fs)` returned early because the handle was already open, so the output path was never reconfigured after capture had changed the I2S configuration.
2. **Fix in `esp_codec_dev_open`:**
   - For a **shared IN_OUT handle**: if only one direction is open, we now open the other direction (`set_fmt`, `enable`, codec `set_fs`/`enable` for that direction).
   - For **OUT-only handle already open**: we re-apply output config (`set_fmt` OUT, `enable` OUT, codec `set_fs`, `_update_codec_setting`) so that after capture has opened and reconfigured the shared I2S, playback is configured again.
3. **No early open:** We removed the early open of the playback device in `media_init`. So capture opens first (when the pipeline starts), then playback is opened on the first received audio frame. We still get no speaker output.
4. **Mute/volume:** We do a mute toggle (mute true → false) and set volume to 100 when opening playback, similar to other working BSPs.
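To make the change in item 2 concrete, here is a simplified model of the decision we added. It is not the actual `esp_codec_dev` source: the types `dir_t`, `handle_model_t`, `open_action_t` and the function `decide_open` are hypothetical names used only to illustrate the logic, and the real patch performs the I2S and codec calls listed above instead of returning an enum.

```c
#include <assert.h>
#include <stddef.h>

/* Directions as a bit mask (hypothetical model, not the SDK's types). */
typedef enum { DIR_NONE = 0, DIR_IN = 1, DIR_OUT = 2, DIR_IN_OUT = 3 } dir_t;

typedef enum {
    ACT_FULL_OPEN,    /* nothing open yet: normal open path */
    ACT_OPEN_MISSING, /* IN_OUT handle, one direction open: open the other */
    ACT_REAPPLY_OUT,  /* OUT-only handle already open: re-apply output config */
    ACT_NOTHING       /* already open and nothing to redo */
} open_action_t;

typedef struct {
    dir_t caps;   /* directions the handle supports */
    dir_t opened; /* directions currently open */
} handle_model_t;

/* Decide what an open call should do for a handle in the given state. */
static open_action_t decide_open(const handle_model_t *h)
{
    assert(h != NULL);
    if (h->opened == DIR_NONE) {
        return ACT_FULL_OPEN;
    }
    if (h->caps == DIR_IN_OUT && h->opened != DIR_IN_OUT) {
        return ACT_OPEN_MISSING;
    }
    if (h->caps == DIR_OUT) {
        /* Capture may have reconfigured the shared I2S since the last open. */
        return ACT_REAPPLY_OUT;
    }
    return ACT_NOTHING;
}
```

Before the change, every "already open" case behaved like `ACT_NOTHING`, which is why the early-opened playback handle was never reconfigured after capture started.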
So far the issue persists: data is decoded and written to I2S with reasonable PCM levels, but the physical speaker stays silent. We suspect the **shared I2S + dual-codec (AW88298 + ES7210) init/order** or **AW88298-specific** behaviour.
**Questions:**
1. Is there a **recommended open order or pattern** for boards where capture and playback share one I2S (full-duplex) and use two separate codecs (e.g. AW88298 + ES7210)?
2. Has anyone successfully run the `voice_agent` example (or a minimal playback-only example) on **M5Stack CoreS3** with speaker output?
3. Any known quirks with **AW88298** in this SDK/codec_board (e.g. mute, format, or enable sequence)?
Happy to share more logs or code snippets if useful. Thanks.
---
## Logs
```
# Codec / I2S init (at boot)
I (728) CODEC_INIT: Init i2s 0 type: 3 mclk:0 bclk:34 ws:33 din:14 dout:13
I (735) CODEC_INIT: tx:0x3c28102c rx:0x3c2811e8
I (740) CODEC_INIT: output init std ret 0
W (743) i2s_tdm: the current mclk multiple is too small, adjust the mclk multiple to 384
I (751) CODEC_INIT: Input init tdm ret 0
I (806) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (806) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
I (811) Adev_Codec: Open codec device OK
# After Room connects — capture opens (record path)
I (6932) livekit_example: Room state changed: Connected
I (6935) I2S_IF: channel mode 2 bits:16/16 channel:4 mask:3
I (6946) I2S_IF: TDM Mode 0 bits:16/16 channel:4 sample_rate:16000 mask:3
I (6952) I2S_IF: channel mode 0 bits:16/32 channel:2 mask:3
I (6958) I2S_IF: STD Mode 1 bits:16/32 channel:2 sample_rate:16000 mask:3
I (6965) ES7210: Bits 16
I (6973) ES7210: Enable ES7210_INPUT_MIC1
…
I (6985) ES7210: Enable TDM mode
I (6991) ES7210: Unmuted
I (6991) Adev_Codec: Open codec device OK
I (6904) AUD_PIPE_NEGO: Negotiate return 0 src_format:541934416 sample_rate:16000
I (6914) AUD_PIPE_NEGO: Path mask 1 select sink:0 format 1398100047
I (6920) AUD_SRC: Get rate:16000, ch:1, bits:16
I (6931) I2S_IF: Mode 1 need extend bits 32 to 64
…
I (7092) ESP_GMF_AENC: Open, type:OPUS, acquire in frame: 640, out frame: 100
# When agent sends audio we also see (not in this snippet):
# I (xxxx) I2S_RENDER: open channel:2 sample_rate:16000 bits:16
# I (xxxx) I2S_RENDER: write #1 size=1280 max_abs=… peak=…
# (PCM levels look normal but speaker stays silent)
```