Try it live right now: https://www.beatsync.gg/
The idea is that with no additional hardware, you can turn any group of devices into a full surround sound system. MacBook speakers are particularly good.
Inspired by Network Time Protocol (NTP), I do clock synchronization over websockets and use the Web Audio API to keep audio latency under a few ms.
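For anyone curious what the NTP-style part looks like, here's a minimal sketch of the offset math. The function names are mine, not Beatsync's actual API; the four-timestamp formula itself is the standard NTP one.

```typescript
// NTP-style clock sync over a request/response channel (e.g. a WebSocket).
// t0: client send time, t1: server receive time,
// t2: server send time,  t3: client receive time.

/** Estimated offset of the server clock relative to the client clock (ms). */
function ntpOffset(t0: number, t1: number, t2: number, t3: number): number {
  return ((t1 - t0) + (t2 - t3)) / 2;
}

/** Round-trip network delay, excluding server processing time (ms). */
function roundTripDelay(t0: number, t1: number, t2: number, t3: number): number {
  return (t3 - t0) - (t2 - t1);
}

// In the browser you'd timestamp with performance.now(), send t0 over the
// WebSocket, have the server echo t1/t2 back, repeat a few times, and keep
// the sample with the smallest round-trip delay (least jitter) as the best
// offset estimate.
```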
You can also drag devices around a virtual grid to simulate spatial audio — it changes the volume of each device depending on its distance to a virtual listening source!
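The distance-to-volume mapping could be as simple as an inverse-distance rolloff, like Web Audio's PannerNode "inverse" model. This is just an illustrative sketch (function and parameter names are mine), with the result fed to a per-device GainNode in the browser.

```typescript
// Inverse-distance gain: full volume inside refDistance, 1/d falloff beyond.
// refDistance = 1 means "one grid unit away from the virtual listener".
function gainForDistance(distance: number, refDistance = 1): number {
  return refDistance / Math.max(distance, refDistance);
}

// In the browser: gainNode.gain.value = gainForDistance(dist) on each device,
// recomputed whenever a device icon is dragged on the virtual grid.
```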
I've been working on this project for the past couple of weeks. Would love to hear your thoughts and ideas!
There are a ton of directions I can imagine you taking this in.
The household application: this one is already pretty directly applicable. Have a bunch of wireless speakers and you should be able to make it sound really good from anywhere, yes? You would probably want support for static configurations, and there's a good chance each client isn't going to be able to run the full suite, but the server can probably still figure out what to send to each client based on timing data.
Relatedly, it would be nice to have a sense of "facing" for the point on the virtual grid and adjust 5.1 channels accordingly, automatically (especially left/right). [Oh, maybe this is already implicit in the grid - "up" is "forward"?]
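A "facing" direction could be hand-rolled as a constant-power pan between the listener's orientation and the direction to each source. All names here are hypothetical; note the Web Audio AudioListener/PannerNode pair already models listener orientation, so this is only worth doing by hand if the per-device grid math needs it.

```typescript
// Constant-power left/right gains from the angle between the listener's
// facing direction and the direction to a source, both in degrees.
function stereoGains(facingDeg: number, sourceDeg: number): { left: number; right: number } {
  // Signed angle of the source relative to facing, clamped to [-90, 90].
  let rel = ((sourceDeg - facingDeg + 540) % 360) - 180;
  rel = Math.max(-90, Math.min(90, rel));
  // Constant-power pan law: pan = -1 is hard left, +1 is hard right.
  const pan = rel / 90;
  const theta = ((pan + 1) / 2) * (Math.PI / 2);
  return { left: Math.cos(theta), right: Math.sin(theta) };
}
```

With "up is forward" on the grid, facingDeg would just be 0 for every listener.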
The party application: this would be a cool trick that would take a lot more work. What if each device could locate itself in actual space automatically and adjust its sync accordingly as it moved? This might not be possible purely in software, especially given the browser's limited access to the sensors needed for high-accuracy positioning (e.g., Wi-Fi-based location). However, it would be utterly magical to be able to install an app, join a host, and let your phone join a mob of other phones as individual speakers in everyone's pockets at a party and have positional audio "just work." The "wow" factor would be off the charts.
On a related note, it could be interesting to add a "jukebox" front-end - some way for clients to submit and negotiate tracks for the play queue.
Another idea - account for copper and optical cabling. The latency issue isn't restricted to the clocks that you can see. Adjusting audio timing for long audio cable runs matters a lot in large areas (say, a stadium or performance hall) but it can still matter in house-sized settings, too, depending on how speakers are wired. For a laptop speaker, there's no practical offset between the clock's time and the time as which sound plays, but if the audio output is connected to a cable run, it would be nice - and probably not very hard - to add some static timing offset for the physical layer associated with a particular output (or even channel). It might even be worth it to be able to calculate it for the user. (This speaker is 300 feet away from its output through X meters of copper; figure out my additional latency offset for me.)
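If I have the physics right, the electrical delay through copper is tiny (signals propagate at a large fraction of c), so for a distant speaker the term that actually dominates is acoustic: the time sound takes to cross the room or stadium to the listening area. A static per-output offset covers both; this sketch (my names, illustrative constants) shows the calculation:

```typescript
const SPEED_OF_SOUND_M_PER_S = 343;   // in air at roughly 20 °C
const COPPER_VELOCITY_FACTOR = 0.66;  // rough figure for typical cable
const C_M_PER_S = 299_792_458;

/** Static timing offset (ms) for a given cable run plus air path. */
function staticOffsetMs(cableMeters: number, acousticMeters: number): number {
  const electrical = cableMeters / (COPPER_VELOCITY_FACTOR * C_M_PER_S);
  const acoustic = acousticMeters / SPEED_OF_SOUND_M_PER_S;
  return (electrical + acoustic) * 1000;
}
```

For scale: 91 m (about 300 ft) of air is roughly 265 ms, while 100 m of copper is around 0.0005 ms, so the "figure out my offset for me" feature is mostly a speed-of-sound calculation.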
Someone brought up the idea of an internet radio, which I thought was cool: you could see a list of all the rooms people are in and tune in to exactly what they're jamming to.
Just to share a couple of similar/related projects in case they're useful for reference:
http://strobe.audio - multi-room audio in Elixir
https://www.panaudia.com - multi-user spatial audio mixing in Rust
Once that changes (at the very least, the macOS part), I can't wait to play with it!
When I was developing sync for Glicol (https://glicol.org/), the main challenge was network jitter. I had to give it up eventually.
Furthermore, have you factored in the synchronization as perceived by the listener?
Also, system-level differences in audio output latency across OSes and hardware would need to be considered: the inherent output latency of different systems (e.g., Mac vs. Windows, different audio hardware) can easily vary by more than 10 ms on its own.
Do you have any interesting insight into that question?
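For what it's worth, the Web Audio API does expose per-device latency estimates (AudioContext.baseLatency everywhere, AudioContext.outputLatency in some browsers) that could feed into per-device compensation. A sketch with my own helper name, keeping the math pure so it's visible:

```typescript
// Map an agreed wall-clock start time into this device's AudioContext
// timeline, pulling the start earlier by the reported hardware output
// latency so sound leaves the speaker at the agreed moment.
function localStartTime(
  targetEpochMs: number,    // agreed start time on the synced clock
  epochNowMs: number,       // this client's current synced-clock reading
  ctxNowSec: number,        // AudioContext.currentTime at the same instant
  outputLatencySec: number  // ctx.outputLatency ?? ctx.baseLatency
): number {
  const untilStartSec = (targetEpochMs - epochNowMs) / 1000;
  return ctxNowSec + untilStartSec - outputLatencySec;
}

// In the browser: source.start(localStartTime(...)) on every device.
```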
My vision: a web-based, VLC-style player (capable of VLC-level features) with support for distributing audio channels across connected devices.
Hear me out: - Mac as the display (movie screen) - iPad as the center channel - 4 iPhones as the L/R and rear channels (and something for the LFE).
Is it practical? Sounds cool in my head. What do you guys think?
(As an egregious example, AirPlay 2 has excellent audio sync but latency that is a good fraction of a second or even worse. A browser might be playing through AirPlay.)
Are you already doing latency compensation? You could measure the latency: if one host becomes the master, you could compensate by delaying the master's playback a little bit.
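One way to read that suggestion (illustrative helper, not the app's actual code): measure each client's end-to-end latency, then delay everyone relative to the slowest device, so the lowest-latency host waits the longest.

```typescript
// Given measured end-to-end latencies per client (ms), return the extra
// delay each client should add so that all outputs line up with the
// slowest one.
function extraDelaysMs(latenciesMs: Record<string, number>): Record<string, number> {
  const worst = Math.max(...Object.values(latenciesMs));
  const out: Record<string, number> = {};
  for (const [id, lat] of Object.entries(latenciesMs)) {
    out[id] = worst - lat;
  }
  return out;
}
```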
For anyone who's curious, Airfoil (a paid app) can play simultaneously from a Mac to a variety of devices.
Although I really know nothing about NTP or networking, I appreciate the use of Boring Old Tech for making this awesome software.
Have you thought about integrating support for timecode? Dante support might also bring your software to professional venues.
Last I heard, Safari was buggy and behind on Web Audio - did you run into any issues there?
Yes, it's a super annoying problem. You should change the CSS so that the URL bar is always visible, and add a separate full-screen button.