Multi-device coordination using LiveKit SDKs: acoustic sync to video sync

In 2019, I built and patented a system for handling 100+ mobile devices in a single physical room with all microphones and speakers active. Mixed Android and iOS, multiple models and OS versions, running over WebRTC. Deployed at scale to replace hardware Cisco conference phones with mobile-device hubs in meeting rooms.


The hard part isn't just multi-mic feedback amplification. It's that even when one person speaks, their voice reaches each device's microphone at a slightly different time, because the distance from the person speaking to each device varies. At room scale that's tens of milliseconds of differential propagation. Transmit all N mic streams upstream without coordination and you get audible comb-filtering and ghosting. Play received audio on all N speakers without sync and the same effect runs in reverse.
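To put rough numbers on that, here is a small back-of-the-envelope sketch. It assumes sound travels at about 343 m/s (dry air near 20 degrees C) and treats the two paths as equal-level copies of the same signal. The 1 m and 8 m distances are made up for illustration, not measurements from the deployment.

```python
from typing import List

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees C


def arrival_delay_ms(distance_m: float) -> float:
    """Time for sound to travel distance_m, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_S * 1000.0


def comb_notches_hz(delay_ms: float, count: int = 3) -> List[float]:
    """First `count` cancellation frequencies when a signal is summed with
    an equal-level copy of itself delayed by delay_ms: f = (2k + 1) / (2 * tau)."""
    tau_s = delay_ms / 1000.0
    return [(2 * k + 1) / (2.0 * tau_s) for k in range(count)]


# Illustrative distances from the talker to the nearest and farthest phone.
near_m, far_m = 1.0, 8.0
skew_ms = arrival_delay_ms(far_m) - arrival_delay_ms(near_m)

print(round(skew_ms, 1))                                  # 20.4
print([round(f, 1) for f in comb_notches_hz(skew_ms)])    # [24.5, 73.5, 122.5]
```

With those assumed distances the far device hears the talker about 20 ms after the near one. Summing two such streams puts the first cancellation notch at 1/(2τ) and repeats it every 1/τ, which is the comb-filter pattern described above.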

The architecture is leader-client coordination per acoustic room. On the mic side, a selection algorithm picks the device with the best signal in real time and only that device transmits upstream. On the speaker side, the leader synchronizes playback timestamps across clients so the same audio frame plays at the same wall-clock moment on every device. US patent: "Method and System for Teleconferencing Using Coordinated Mobile Devices" (https://patents.justia.com/inventor/muhammad-usman-bashir).
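The selection algorithm itself is not spelled out here, so the following is only a minimal sketch of one common way to do it, not the patented method. It assumes each device reports a microphone level in dBFS to the leader, that the leader smooths those reports, and that the "active mic" role only changes hands when a challenger is louder by a margin, which avoids flapping between two phones at similar distance. `MicSelector`, `margin_db`, and `alpha` are hypothetical names and values.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MicSelector:
    """Leader-side picker: exactly one device is allowed to transmit."""
    margin_db: float = 3.0   # assumption: challenger must win by this much
    alpha: float = 0.2       # assumption: smoothing factor for level reports
    active: Optional[str] = None
    levels: Dict[str, float] = field(default_factory=dict)

    def report(self, device_id: str, level_db: float) -> None:
        """Fold a new level report into the device's smoothed level."""
        prev = self.levels.get(device_id, level_db)
        self.levels[device_id] = prev + self.alpha * (level_db - prev)

    def drop(self, device_id: str) -> None:
        """Forget a device that left the room."""
        self.levels.pop(device_id, None)

    def select(self) -> Optional[str]:
        """Return the device that should transmit upstream right now."""
        if not self.levels:
            self.active = None
            return None
        best = max(self.levels, key=self.levels.get)
        if self.active not in self.levels:
            self.active = best  # no current holder, take the loudest
        elif self.levels[best] >= self.levels[self.active] + self.margin_db:
            self.active = best  # challenger clearly louder, hand over
        return self.active


selector = MicSelector()
selector.report("phone-a", -30.0)
selector.report("phone-b", -32.0)
print(selector.select())            # phone-a

selector.report("phone-b", -29.0)   # brief bump, smoothed to -31.4
print(selector.select())            # phone-a (no switch)

for _ in range(20):                 # talker moves next to phone-b
    selector.report("phone-b", -20.0)
print(selector.select())            # phone-b
```

The margin and smoothing trade responsiveness for stability. A real deployment would likely also weigh signal-to-noise ratio or voice activity rather than raw level alone.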

Recently we at RTC LEAGUE extended the LiveKit Android and iOS client SDKs to add this coordination layer for a different use case: distributed media playback. The mic-selection-and-sync primitive generalizes naturally. Demo: a grid of mobile devices playing the same YouTube video frame-synced across all of them, effectively turning a stack of phones into a coordinated mobile-device billboard.

YouTube source used in the demo: https://www.youtube.com/watch?v=lgm3puP3tMA

Synced multi-device playback demo: https://drive.google.com/file/d/1NTtyiz3lLATnwbaYdFNXkE2UquxBbj_F/view?usp=sharing
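For anyone curious how "same frame at the same wall-clock moment" can work in principle, here is a hedged sketch of the general idea, not our SDK code and not a LiveKit API. It assumes each client estimates its clock offset to the leader with an NTP-style four-timestamp exchange (carried over whatever messaging channel is available), and that the leader announces the leader-clock time at which playback started. All function names and the 20 ms / 500 ms thresholds are hypothetical.

```python
def estimate_offset_ms(t0: float, t1: float, t2: float, t3: float) -> float:
    """NTP-style offset estimate (leader clock minus client clock).

    t0: client clock when the ping was sent
    t1: leader clock when the ping arrived
    t2: leader clock when the reply was sent
    t3: client clock when the reply arrived
    Assumes the network delay is roughly symmetric.
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def expected_position_ms(client_now_ms: float, offset_ms: float,
                         leader_start_ms: float, rate: float = 1.0) -> float:
    """Media position every client should be at, given the shared start time."""
    leader_now_ms = client_now_ms + offset_ms
    return max(0.0, (leader_now_ms - leader_start_ms) * rate)


def correction(actual_ms: float, expected_ms: float,
               tolerance_ms: float = 20.0,
               seek_threshold_ms: float = 500.0) -> str:
    """Decide how a client should react to its measured drift."""
    drift_ms = expected_ms - actual_ms
    if abs(drift_ms) <= tolerance_ms:
        return "hold"            # close enough, leave the player alone
    if abs(drift_ms) > seek_threshold_ms:
        return "seek"            # too far off, jump to the expected position
    return "speed_up" if drift_ms > 0 else "slow_down"


# Client clock runs 150 ms behind the leader; one-way delay is 20 ms.
offset = estimate_offset_ms(t0=1000, t1=1170, t2=1171, t3=1041)
print(offset)                                            # 150.0

target = expected_position_ms(client_now_ms=2850, offset_ms=offset,
                              leader_start_ms=1000)
print(target)                                            # 2000.0
print(correction(actual_ms=1900, expected_ms=target))    # speed_up
```

Small drifts are nudged with playback rate rather than seeks, since a seek is visible as a stutter on a wall of screens. The tolerance would need tuning against the frame interval of the content.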

The primitives generalize across acoustic feedback elimination, conference room mic selection, and distributed media sync. Happy to dig into any of it if it overlaps with what others are building.

Cc: @Flamingo_Live @LiveKit-Community