Streaming with Elixir and how would that scale?

Does anyone have any idea on how streaming platforms work at scale and in general?

It took me 2 days to figure out that I need to implement a first-mile and last-mile architecture, e.g. [user streams via RTP/RTMP] -> [server transcodes to HLS, which segments the video] -> [server uploads to CDN] -> [users consume from the CDN on different devices]

I took a look at the Membrane framework and it has most of the transcoders and protocols implemented, yet I still don’t understand when a “stream key” comes into play (I’d assume it’s a token that can then route your stream across the server to the appropriate folder on the CDN, or something like that).

How would that thing scale? For each incoming stream, would I create a process that does the following: [user streams via RTP/RTMP] -> [server transcodes to HLS, which segments the video] -> [server uploads to CDN] -> [users consume from the CDN on different devices]?
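Roughly what I have in mind, as a sketch (MyApp.StreamPipeline and the stream_key option are made-up names; that pipeline would wrap the ingest -> transcode -> upload steps, e.g. built with Membrane):

```elixir
# Sketch: one supervised pipeline process per incoming stream, so a crashed
# stream cannot take the others down with it.
defmodule MyApp.StreamSupervisor do
  use DynamicSupervisor

  def start_link(opts) do
    DynamicSupervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(_opts) do
    DynamicSupervisor.init(strategy: :one_for_one)
  end

  # Called when a new publisher connects with a given stream key.
  def start_stream(stream_key) do
    # MyApp.StreamPipeline is hypothetical: ingest RTMP -> transcode/segment
    # to HLS -> upload segments + playlist to the CDN/origin.
    DynamicSupervisor.start_child(__MODULE__, {MyApp.StreamPipeline, stream_key: stream_key})
  end
end
```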

And there is this whole story with signaling over a socket, which I assume I’ll be able to achieve easily with Phoenix Channels.
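For the signaling part, what I picture is just a channel per stream that tells the page when to load the player (topic and event names below are placeholders):

```elixir
# Viewers join "stream:<id>"; when the ingest side has the first HLS segments
# on the CDN, it broadcasts and the page swaps in the player.
defmodule MyAppWeb.StreamChannel do
  use Phoenix.Channel

  def join("stream:" <> _stream_id, _params, socket) do
    {:ok, socket}
  end
end

# On the ingest/transcoding side, once the playlist is available:
#
#   MyAppWeb.Endpoint.broadcast("stream:#{stream_id}", "stream_started",
#     %{playlist_url: "https://cdn.example.com/#{stream_id}/index.m3u8"})
```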

Lots of load balancers, powerful machines for transcoding and fast CDNs. It isn’t black magic, but video in general is hard enough to make you question your life choices.

For scale you’re going to need something like k8s or another way to scale horizontally when you start hitting 100% usage on all your transcoding servers. Big streaming providers have access to servers optimized for real-time transcoding; using NVENC or even proprietary transcoders (I know that Twitch uses its own to provide low-latency streaming) will surely be more cost-efficient than pure CPU-based ones.
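To make the hardware point concrete, per incoming stream a transcoding box ends up running something equivalent to this (bitrates and paths are only examples), and h264_nvenc vs libx264 is where the GPU saves you:

```bash
# One rendition of an RTMP ingest, segmented to HLS with the NVIDIA encoder.
ffmpeg -i rtmp://localhost/live/STREAM_KEY \
  -c:v h264_nvenc -b:v 3000k \
  -c:a aac -b:a 128k \
  -f hls -hls_time 4 -hls_list_size 6 \
  /var/www/hls/STREAM_KEY/index.m3u8
```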

You can get by with a lot of reasonably cheap servers if the streamers configure their software to stream in the correct format and you don’t need multiple resolutions and frame rates, but the moment you need that, you’re going to have to distribute the load to more powerful servers.

As for the stream key, it is just a token that identifies who is streaming; the logic for what to do with it is up to the implementation.
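In Elixir terms it can be as dumb as a lookup when the publish request comes in (hypothetical module, hard-coded map instead of a database):

```elixir
defmodule MyApp.StreamKeys do
  # The key only identifies the streamer and where their output should go.
  @keys %{
    "s3cr3t-abc" => %{user: "alice", output_prefix: "live/alice"}
  }

  def authorize(stream_key) do
    case Map.fetch(@keys, stream_key) do
      {:ok, info} -> {:ok, info}        # accept the publish, route output under info.output_prefix
      :error -> {:error, :unknown_key}  # reject the publish
    end
  end
end
```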

IMO, the easiest way to start getting the gist of it is following any nginx-rtmp tutorial and expanding your knowledge from there.
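The core of those tutorials is a config along these lines (paths and the callback URL are examples); on_publish is also the usual hook for checking the stream key:

```nginx
rtmp {
  server {
    listen 1935;
    application live {
      live on;
      # Ask your app whether this stream key may publish (HTTP 2xx = allowed).
      on_publish http://127.0.0.1:4000/api/streams/authorize;
      # Segment the incoming stream into HLS.
      hls on;
      hls_path /var/www/hls;
      hls_fragment 4s;
    }
  }
}
```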

As for the socket signaling, the thought that came to my mind is that since you’re talking about signaling over sockets, this is about WebRTC, not HLS or DASH, since in those cases any signaling would be in the manifest. WebRTC is another can of worms and isn’t usually used for large-scale streaming but for video calls and other truly low-latency applications (sub 500 milliseconds). There are ways to use it for large-scale, truly low-latency streaming, but the CDNs that support acting as WebRTC relays are few and the chances of you having to implement one yourself are big, and now we have a lot more problems than the original one of streaming video :man_shrugging:


I thought the socket signaling is necessary for informing the page/video player that the user is streaming. At least it seems that way to me, is it not?
Thanks for your reply though, very informative!

I heard about WebRTC, but from what I understood it really is a p2p system, not something to do with the architecture I described above.

Also, about nginx-rtmp: I’ve already got myself familiar with it :slight_smile:

thanks again.

> I heard about WebRTC, but from what I understood it really is a p2p system, not something to do with the architecture I described above.

WebRTC does not necessarily need to run in a mesh topology (though I guess with WebRTC that is the most straightforward solution). It could be set up with an SFU (Selective Forwarding Unit, where each peer sends video in one stream to the server and receives multiple streams from the same server) or an MCU (Multipoint Conferencing Unit, where each client sends only one stream to the server and receives one “multiplexed” stream of all the others).

As far as I understand, Janus supports both SFU and MCU… and there is a talk: https://www.youtube.com/watch?v=3qTUYMOpZGs

I also see there are membrane_webrtc_plugin and membrane_rtc_engine (SFU client and server libraries).

I am more interested in the convenient solution e.g. RTP/RTMP to HLS, but thanks for the insight :slight_smile:

Not necessarily. I went for the WebRTC answer because signaling has a specific and well-defined meaning there, but for HLS and DASH the player loads a manifest, and if the manifest is for a live video it will play in live mode. The way you inform the page that there’s a live stream happening is completely independent from the player itself; you could do it with websockets or any other way, but as far as the video is concerned, as long as the player has a manifest for a live video loaded, it will work.
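To make that concrete, a live HLS media playlist looks roughly like this (segment names are made up); what matters is that there is no #EXT-X-ENDLIST tag, so the player keeps re-fetching the playlist and stays in live mode:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:4
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:4.0,
segment_120.ts
#EXTINF:4.0,
segment_121.ts
#EXTINF:4.0,
segment_122.ts
```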

Unfortunately I’ve never worked with Membrane; usually I go with Shaka Streamer if I need something working quickly, but Membrane is definitely on my list of things to learn.
