zeroexcuses
Erlang/Elixir sample code for real-time voice chat?
- I understand the argument that “Erlang has its roots in telephony, so of course Erlang/Elixir is a great fit for this.”
- I understand that voice chat is “basically just a router for UDP packets, and Elixir has great libraries for those.”
- For whatever reason, looking at old threads, I am not finding a good, simple example.
Question: Is there any example of using Erlang/Elixir to build a simple voice chat server? I don’t care whether the client is desktop software or browser/WebRTC; I’m just looking for something where Erlang/Elixir itself does the ‘heavy lifting’ rather than offloading the work to some other server.
Thanks!
Marked As Solved
mat-hek
It’s an SFU server
You can easily find a lot of information about WebRTC on the net, for example at https://webrtchacks.com/. In short, it goes like this:
- before starting the transmission, the session is negotiated (network addresses, certificates, tracks, codecs, etc.) using the SDP offer-answer model and Interactive Connectivity Establishment (ICE) candidates, usually over a WebSocket or a Phoenix channel
- media connection is established via ICE
- media encryption keys are exchanged through the media connection via DTLS
- then the browser gets the track (in this case audio samples) from the microphone, encodes it (usually with Opus), packs it into Real-time Transport Protocol (RTP) packets and encrypts them (so it becomes SRTP)
- the encrypted stream is sent through the ICE media connection (usually via UDP)
- if the connected peer is the Membrane server, it unpacks the audio (Opus) stream with Membrane.WebRTC.EndpointBin. That bin consists of the ICE endpoint with the DTLS handshake, and the RTP bin, which handles SRTP, SRTCP and some of their extensions.
- at that point, you can do anything with the received stream; the Membrane server just passes it to the Endpoint Bins of all other peers in the room
- each Endpoint Bin packs the stream into RTP, encrypts it to SRTP and sends it via the ICE sink
- each peer’s browser gets the stream, decrypts it, unpacks it, decodes it to raw audio and plays it out
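To make the forwarding step above concrete, here is a minimal sketch of the fan-out idea in plain Elixir. It is not Membrane code and not how the Membrane SFU is implemented: `NaiveRelay` and `recipients/2` are hypothetical names, and the sketch assumes unencrypted UDP with no ICE, DTLS or SRTP, where a peer "joins the room" simply by sending its first datagram.

```elixir
defmodule NaiveRelay do
  # Hypothetical module for illustration only: a plain UDP fan-out.
  # Assumes no ICE, DTLS or SRTP; peers are identified by {ip, port}.
  use GenServer

  def start_link(port), do: GenServer.start_link(__MODULE__, port)

  # Pure helper: every known peer except the sender.
  def recipients(peers, sender) do
    peers |> MapSet.delete(sender) |> MapSet.to_list()
  end

  @impl true
  def init(port) do
    # active: true delivers each datagram to this process as a message.
    {:ok, socket} = :gen_udp.open(port, [:binary, active: true])
    {:ok, %{socket: socket, peers: MapSet.new()}}
  end

  @impl true
  def handle_info({:udp, _socket, ip, port, packet}, state) do
    sender = {ip, port}
    # Remember the sender, then relay the packet to everyone else.
    peers = MapSet.put(state.peers, sender)

    for {peer_ip, peer_port} <- recipients(peers, sender) do
      :gen_udp.send(state.socket, peer_ip, peer_port, packet)
    end

    {:noreply, %{state | peers: peers}}
  end
end
```

Starting it with `NaiveRelay.start_link(5000)` (the port is arbitrary) and sending datagrams from two or more clients should relay each packet to every peer except its sender. A real SFU additionally needs the negotiation and encryption steps listed above.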
Also Liked
mickel8
A bunch of useful links:
Multimedia in general
Membrane
WebRTC
mat-hek
Agreed
Playing with media is not as simple as you’d expect, unfortunately
We’re building a WebRTC SFU server that handles both audio and video. We made it work, but it’s still experimental and under heavy development.
That’s mostly true, though Elixir is very convenient for handling protocols and containers too, because of its good support for binaries and bitstrings. Heavy, numerical computations are indeed delegated to low-level, mostly C libraries, but since it’s done via simple NIFs or C nodes, it doesn’t involve
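As an illustration of the binaries and bitstrings point, here is a rough sketch of parsing the fixed RTP header (as laid out in RFC 3550) with a single binary pattern. `RtpHeader` is a hypothetical module, not a Membrane API, and the sketch assumes the packet is already decrypted; it does not strip header extensions or padding from the payload.

```elixir
defmodule RtpHeader do
  # Hypothetical helper for illustration; field layout follows RFC 3550.
  # Segments default to unsigned big-endian integers (network byte order).
  def parse(
        <<2::2, padding::1, extension::1, csrc_count::4, marker::1, payload_type::7,
          sequence_number::16, timestamp::32, ssrc::32, rest::binary>>
      ) do
    # Each CSRC identifier is 4 bytes; skip them to reach the payload.
    csrc_bytes = csrc_count * 4

    case rest do
      <<_csrcs::binary-size(csrc_bytes), payload::binary>> ->
        header = %{
          padding: padding == 1,
          extension: extension == 1,
          marker: marker == 1,
          payload_type: payload_type,
          sequence_number: sequence_number,
          timestamp: timestamp,
          ssrc: ssrc
        }

        # Note: payload still includes any header extension and padding.
        {:ok, header, payload}

      _ ->
        {:error, :truncated}
    end
  end

  # Anything shorter than 12 bytes, or with a version other than 2.
  def parse(_other), do: {:error, :invalid}
end
```

For example, `RtpHeader.parse(<<2::2, 0::6, 1::1, 111::7, 42::16, 960::32, 1::32, "opus">>)` matches the first clause; the payload type 111 and the other field values here are made up for the example.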
If you deal with latencies around 2 ms, then it may be a problem; I haven’t tested that. But for usual media streaming it’s good enough. People even write streaming apps in Go, which has a stop-the-world GC, though that one happens to be problematic, AFAIK.
We haven’t had time for big optimizations of Membrane yet, nor have we used it with the OTP 24 JIT. Anyway, it’ll probably never be as fast as if we had used Rust or C. Membrane focuses instead on reliability, scalability and maintainability.
RudManusachi
My understanding of such projects is that Elixir/Erlang is a great fit for “orchestration”: connecting people, setting up the pipeline of encoding/decoding/streaming via the web, etc. But underneath, the encoders/decoders are still delegated to some low-level libs.
Please note, though, that I’m not an expert in the subject, and all my experience with realtime audio/video chat is limited to one project where it was done via WebRTC and the Elixir/Phoenix server was just authorizing and connecting people.
I think you might be interested in the Membrane Framework. It looks like they provide plugins for audio codecs, and there is a repo with some demos.








