LiveKit SDK Version
v0.3.6
IDF Version
v5.4.1
Issue Description
Summary
I have a working ESP32-S3 voice bot firmware using WebSockets and am attempting to migrate to the
LiveKit ESP32 SDK (v0.3.6). My hardware uses two separate audio codecs — ES8311 for speaker output
and ES7243E for microphone input — both sharing the same I2C bus and I2S channels. I’ve run into
a fundamental incompatibility between the initialization sequence my hardware requires and the
sequence assumed by esp_capture / av_render as used in the SDK examples.
Hardware Configuration
- Board: ESP32-S3N16R8
- Speaker codec: ES8311 (DAC) via I2C address 0x18
- Microphone codec: ES7243E (ADC) via I2C address 0x10
- Shared I2C bus: SDA=GPIO18, SCL=GPIO17 (both codecs on I2C_NUM_0)
- Shared I2S: MCLK=GPIO16, BCLK=GPIO9, WS=GPIO7, DOUT=GPIO8, DIN=GPIO10
- Sample rate: 16kHz mono, 16-bit, Philips/I2S standard mode
- Power Amplifier: GPIO21
Working Init Sequence (Current Firmware)
In my working WebSocket-based firmware, the correct hardware initialization order is:
- i2c_bus_init() — Initialize I2C master bus
- es7243_init() — Configure ES7243E ADC mic via I2C (sets sample rate, gain, etc.)
- es8311_codec_init() — Configure ES8311 DAC via I2C
- i2s_init() — Initialize I2S channels (TX + RX, stereo, 16kHz)
- pa_enable(true) — Enable power amplifier
The critical point: both codecs must be configured over I2C before I2S is started, because
both codecs depend on the MCLK signal coming from the I2S master to lock their PLLs and internal
clocks. If I2S starts before codec init, the codecs receive clock signals they haven’t been
configured to handle yet. In practice, the ES7243E is particularly sensitive to this ordering.
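To make the ordering concrete, here is a minimal host-side sketch of the sequence above. The function names match my firmware, but the signatures and bodies are placeholders I made up for illustration (each stub only records that it ran), and board_audio_init() is a hypothetical wrapper, not an SDK call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* One entry per init step, in the order my hardware needs them. */
typedef enum {
    STEP_I2C_BUS,
    STEP_ES7243E,
    STEP_ES8311,
    STEP_I2S,
    STEP_PA,
    STEP_COUNT
} init_step_t;

static init_step_t g_order[STEP_COUNT]; /* steps in the order they actually ran */
static int g_steps_done = 0;

/* Placeholder body shared by all stubs: log the step and report success (0). */
static int record_step(init_step_t step, const char *name)
{
    assert(g_steps_done < STEP_COUNT);
    g_order[g_steps_done++] = step;
    printf("init: %s\n", name);
    return 0;
}

/* Stand-ins for the real drivers; the real ones talk to I2C/I2S hardware. */
static int i2c_bus_init(void)      { return record_step(STEP_I2C_BUS, "i2c_bus_init"); }
static int es7243_init(void)       { return record_step(STEP_ES7243E, "es7243_init"); }
static int es8311_codec_init(void) { return record_step(STEP_ES8311, "es8311_codec_init"); }
static int i2s_init(void)          { return record_step(STEP_I2S, "i2s_init"); }
static int pa_enable(bool on)      { (void)on; return record_step(STEP_PA, "pa_enable"); }

/* Hypothetical wrapper: both codecs are configured over I2C before I2S starts,
 * and the sequence stops at the first step that fails. */
int board_audio_init(void)
{
    int err;
    if ((err = i2c_bus_init()) != 0)      return err;
    if ((err = es7243_init()) != 0)       return err;
    if ((err = es8311_codec_init()) != 0) return err;
    if ((err = i2s_init()) != 0)          return err;
    return pa_enable(true);
}
```

On target, something like board_audio_init() would be called once from app_main() before any audio task starts. The point of the sketch is only that i2s_init() comes after both codec init calls.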
Issue with LiveKit SDK’s esp_capture / av_render Pattern
Looking at the SDK examples (media.c), the implied initialization pattern through esp_capture
and av_render appears to be:
- i2c init (inside codec driver, triggered by esp_capture setup)
- i2s init (inside esp_capture/av_render pipeline setup)
- codec init (after I2S is running)
This ordering does not work for my hardware. Specifically:
- Dual-codec topology is undocumented: There are no examples or guidance for boards where the mic ADC and speaker DAC are separate chips sharing a single I2S bus.
- Shared I2S bus conflict: esp_capture initializes an I2S RX channel and av_render initializes an I2S TX channel. On my board both TX and RX are on the same I2S peripheral (I2S_NUM_0, full-duplex). It is unclear whether these two components can share a single I2S peripheral or whether they each try to claim it independently, causing a double-init conflict.
- No hook for pre-I2S codec configuration: There is no apparent way to run custom I2C codec setup before esp_capture triggers I2S initialization internally.
Questions / Feature Requests
- Is there a supported way to use ES7243E (record-only) with esp_capture alongside ES8311 for playback? If not, can a custom capture source be provided that bypasses the codec driver layer but still feeds raw audio into the LiveKit pipeline?
- Can esp_capture and av_render share a single full-duplex I2S peripheral? If yes, how should this be configured?
- Is there a lifecycle hook to perform codec-specific I2C configuration before I2S channels are enabled? A pre_i2s_init callback in the capture/render config structs would solve this cleanly.
- What is the recommended migration path for custom-hardware boards that don’t match the reference board (e.g. ESP32-S3-BOX) and have non-standard dual-codec topologies?
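To illustrate the pre_i2s_init idea, here is a rough host-side sketch of the behaviour I have in mind. Everything in it is hypothetical: audio_pipeline_cfg_t, mock_pipeline_open(), mock_i2s_enable() and my_codec_setup() are names I invented for this example and are not part of esp_capture or av_render. It only shows the ordering contract: the hook runs first, and I2S is not enabled if the hook fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical config struct; NOT an existing esp_capture/av_render type. */
typedef struct {
    int (*pre_i2s_init)(void *ctx); /* proposed hook, may be NULL */
    void *ctx;                      /* user data passed back to the hook */
} audio_pipeline_cfg_t;

/* Small event log so the call order can be inspected. */
typedef enum { EV_CODEC_I2C_SETUP, EV_I2S_ENABLE } event_t;
static event_t g_events[4];
static int g_event_count = 0;

static void log_event(event_t ev, const char *name)
{
    if (g_event_count < 4) {
        g_events[g_event_count++] = ev;
    }
    printf("%s\n", name);
}

/* Stand-in for the point where the pipeline creates and enables I2S channels. */
static int mock_i2s_enable(void)
{
    log_event(EV_I2S_ENABLE, "i2s_enable");
    return 0;
}

/* Mock pipeline open: run the hook (if set) before touching I2S,
 * and skip I2S entirely when the hook reports an error. */
int mock_pipeline_open(const audio_pipeline_cfg_t *cfg)
{
    if (cfg != NULL && cfg->pre_i2s_init != NULL) {
        int err = cfg->pre_i2s_init(cfg->ctx);
        if (err != 0) {
            return err;
        }
    }
    return mock_i2s_enable();
}

/* Board-side hook: this is where my ES7243E and ES8311 I2C setup would go. */
int my_codec_setup(void *ctx)
{
    (void)ctx;
    log_event(EV_CODEC_I2C_SETUP, "codec_i2c_setup");
    return 0;
}
```

With a config such as { .pre_i2s_init = my_codec_setup, .ctx = NULL }, mock_pipeline_open() logs the codec setup before the I2S enable. A hook of this shape in the real config structs would let boards like mine keep their required order without patching the SDK.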
