Does anyone have any experience with the WebRTC STUN & TURN server called XTurn?

wanton7 · March 29, 2019, 10:42am

Our company is thinking of adding voice chat to our browser app in the future with WebRTC. I noticed that a company called Xirsys released and open-sourced an Elixir-based STUN & TURN server last year. Does anyone have any experience with it?

More info

https://medium.com/xirsys/xirsys-releases-xturn-the-open-source-turn-server-in-elixir-c84348289acc

idi527 · March 30, 2019, 10:45am

It has unnecessary deps like :maru (more deps -> more auditing required), so that’s a red flag for me, personally. Neither the STUN nor the TURN protocol is particularly complicated, so I’d think most companies could support an in-house implementation, but just in case, there are also https://github.com/processone/stun and https://github.com/esl/mongooseice, which I’ve used and which worked well. But I’ve only ever needed and used them for STUN, since my WebRTC server was on a public network and I didn’t need to relay any packets via TURN servers.

LSylvester · March 30, 2019, 3:38pm

Hi @wanton7. I am the author of that server. It is a fully capable WebRTC server and is a project that Xirsys intends to expand on. Can I be of service in any way?

I am giving a talk using XTurn in July at CommCon in London, btw.

Regards,
Jahred

idi527 · March 30, 2019, 4:51pm

Not everyone needs an HTTP API endpoint baked into a TURN server. Might be a good idea to provide it as a separate, optional package.

wanton7 · March 30, 2019, 4:51pm

Our company is starting to create a new version of our web app from scratch and we might be moving to Elixir from C#. In one part of our app we would like to have voice chat between our users, in sessions of maybe 10 users at most. One requirement is that it needs to be self-hosted.

Yesterday I showed my boss https://jitsi.org, a multi-platform open-source video conferencing project, and now he’d like to see if we could implement our voice chat with it instead of creating our own.

It’s probably nowhere near as scalable or fault tolerant as something running on the BEAM. We also need to see if we can customize it enough for our needs. If that fails we will be creating our own solution. Maybe we could use XTurn or derive our solution from it if that happens.

I’d really like to see that talk but can’t be there in person. Is it going to be recorded?

idi527 · March 30, 2019, 4:52pm

You might consider https://github.com/meetecho/janus-gateway as well. Implementing a WebRTC server in Erlang is quite hard right now (I know of only one successful closed-source attempt) due to the inability to multiplex SRTP/STUN/TURN on DTLS connections.

For example, when I tried making a basic webrtc server about a year ago in elixir, I had to deal with stun binding requests being sent before dtls hello which crashed erlang’s dtls process … So I just went with janus and it worked out of the box.
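To illustrate the multiplexing problem: WebRTC sends STUN, DTLS and (S)RTP over the same UDP socket, so the receiver has to look at the first byte of every datagram before handing it to the DTLS layer. Below is a minimal sketch using the first-byte ranges from RFC 7983. It is in Python rather than Elixir for brevity, and the function name `classify_packet` is made up for this example.

```python
def classify_packet(datagram: bytes) -> str:
    """Guess the protocol of a UDP datagram from its first byte (RFC 7983 ranges)."""
    if not datagram:
        return "empty"
    first = datagram[0]
    if first <= 3:
        return "stun"          # STUN messages start with two zero bits
    if 16 <= first <= 19:
        return "zrtp"
    if 20 <= first <= 63:
        return "dtls"          # e.g. 22 is the DTLS handshake content type
    if 64 <= first <= 79:
        return "turn-channel"  # TURN ChannelData
    if 128 <= first <= 191:
        return "rtp/rtcp"      # RTP version 2 sets the top two bits to 10
    return "unknown"
```

With a dispatcher like this in front, a STUN binding request that arrives before the DTLS hello is routed to the STUN handler and never reaches the DTLS process.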

wanton7 · March 30, 2019, 4:59pm

That API is for authorization, right? If it is, we need something like that, so that our server can create tokens for our clients to access those STUN and TURN servers and protect them from unauthorized access. I also came across a STUN/TURN server called coturn that seemed popular, and it has an API like that too.
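For reference, the token scheme coturn supports (its shared-secret mode, `use-auth-secret`) works roughly like this: the app server and the TURN server share a secret, the username is an expiry timestamp plus a user id, and the password is an HMAC-SHA1 of that username. A minimal sketch follows. The function name, the returned dict shape and the `now` parameter are assumptions for illustration, and I have not checked whether XTurn’s API uses the same scheme.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def make_turn_credentials(shared_secret: str, user_id: str,
                          ttl_seconds: int = 3600,
                          now: Optional[float] = None) -> dict:
    """Create short-lived TURN credentials in the coturn shared-secret style."""
    current = time.time() if now is None else now
    expiry = int(current) + ttl_seconds
    # Username carries the expiry time so the TURN server can reject stale tokens.
    username = f"{expiry}:{user_id}"
    digest = hmac.new(shared_secret.encode(), username.encode(), hashlib.sha1).digest()
    return {
        "username": username,
        "credential": base64.b64encode(digest).decode(),
        "ttl": ttl_seconds,
    }
```

The app server would hand the `username` and `credential` to the browser, which passes them in its ICE server configuration. The TURN server recomputes the HMAC from the same secret, so no per-user state has to be stored on it.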

idi527 · March 30, 2019, 5:00pm

If it’s a standalone server, sure, an API is needed, but if it’s used as a library, it’s not. I’d rather have my own HTTP API server that depends on a TURN server library.

wanton7 · March 30, 2019, 5:04pm

True, also if that API was pluggable everyone could easily create their own API that fits their needs.

wanton7 · March 30, 2019, 5:52pm

Thanks I’ll look into it!

LSylvester · March 30, 2019, 6:00pm

The included API is a starting point. The purpose of the server is to provide full WebRTC capabilities as simply as possible. This way, developers can shape it however they choose without hours spent learning how it works. Maru was chosen as it’s lightweight, but it can easily be ripped out and replaced.

The server supports full DTLS and we’ve experienced no issues with it. Running on a $5 virtual machine, I have achieved faster benchmarks than Google’s own servers.

As stated, this is just a starting point, but it’s a good starting point with fewer headaches.

The talk I’m giving in July is about using XTurn to debug WebRTC apps. Since it’s so simple, it’s perfect for finding flaws in such apps. I believe it is being recorded.

Another point you may find interesting: we are working with the Membrane Framework team to implement Membrane Source pads for video and audio pipelines in the server. I am currently working on a “record to file” example. It may prove useful to you.

idi527 · March 30, 2019, 6:06pm

> The purpose of the server is to provide full WebRTC capabilities as simply as possible.

> The server supports full DTLS and we’ve experienced no issues with it.

But that’s just for signalling and TURN, right? Or do you plan to implement all the WebRTC protocols required for a WebRTC client/server?

LSylvester · March 30, 2019, 6:12pm

It’s just for TURN. The server is STUN / TURN only, but will have RTP / RTCP and various audio / video decoding / encoding capabilities.

We find signalling tends to be a very personal matter. We have our own signalling services, but more people roll their own than opt for something pre-existing. Since signalling is typically just WebSockets, it’s not something that needs to be explained overly much.

lud · July 2, 2019, 2:32pm

Hello, sorry for bouncing this old topic.

I would like to build a simple web page where two users could see and talk to each other through WebRTC. I’m a bit confused by what I read so far and I cannot figure out what I need.

I believe that if the web server (the app serving the web page where the WebRTC JavaScript runs) is on a public domain (say mydomain.com), I just need a public STUN server (say turn.mydomain.com), right?

I was looking at MongooseICE (I like ESL products) but they say that their application is not enough to have video chat, because of signaling. But if I understood this page correctly, the JavaScript on it describes how to do signaling with Phoenix channels. The code is old, and I did not have time to make it work yet. But would that be enough?

Or do I need TURN, janus-gateway, SNAP-thing, DTLS-stuff … ? (I try to read as much as I can, but it feels like discovering that “HTTP” thing one day and suddenly stumbling upon JavaScript, CSS, TLS, FTP, browsers, servers et al.)

LSylvester · July 2, 2019, 2:50pm

Hi @lud,

You need a TURN server (like https://xirsys.github.io/xturn) and a socket (signalling) server. Both apps could be completely unrelated. For instance, you could use PubNub for signalling. It is needed only to get the session running. After that, the TURN server takes over. TURN is only needed for about 20% of connections, with the rest being peer-to-peer.

Incidentally, my company Xirsys provides both geolocated TURN servers and signalling. The TURN server listed above is an Elixir TURN server that I wrote and will be discussing at CommCon next week.

Regards,
Jahred

lud · July 2, 2019, 3:05pm

Hey, Thanks !

So TURN is needed for 20% of the connections because people can be behind a VPN or a NAT. That is the reason, if I understood correctly.
But a TURN server relays the data (in my case the audio/video streams), so it must be a good machine, whereas a STUN server and the Phoenix app will run on a cheap server, right?

My knowledge level on all of this is like ZERO. I’m doing this to learn new stuff (bored with CRUD apps), so I would like to implement a simple scenario. Do you have examples or docs for working with XTurn from JS?

LSylvester · July 2, 2019, 3:33pm

So, a TURN server is also a STUN server. TURN extends the STUN specification by adding relay capabilities, so media can flow through the server when peers cannot reach each other directly. Any TURN server will also perform STUN bindings (but not the reverse).
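To show how small the STUN half is, here is a minimal sketch of a Binding request and the decoding of the XOR-MAPPED-ADDRESS attribute in the reply, following the RFC 5389 message layout. The function names are made up for this example, only IPv4 is handled, and it is in Python rather than Elixir purely for brevity.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: bytes = b"") -> bytes:
    """Build a 20-byte STUN Binding request with no attributes."""
    txid = transaction_id or os.urandom(12)
    # Header: type (2), length (2), magic cookie (4), transaction id (12).
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, txid)


def parse_xor_mapped_address(response: bytes):
    """Return (ip, port) from a Binding success response, or None if absent."""
    if len(response) < 20:
        return None
    msg_type, length, cookie = struct.unpack_from("!HHI", response, 0)
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, min(20 + length, len(response))
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack_from("!HH", response, offset)
        value = response[offset + 4: offset + 4 + attr_len]
        # Family 0x01 means IPv4; IPv6 (0x02) is skipped in this sketch.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack_from("!H", value, 2)[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack_from("!I", value, 4)[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        # Attributes are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    return None
```

A client would send the request over UDP to the server’s STUN port (3478 by default) and feed the reply to `parse_xor_mapped_address` to learn its own public address. TURN adds allocation and relay messages on top of this same message format.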

TURN is used for situations where one peer can’t see the other clearly. All peers will be behind a NAT (the router that sits between your machine and the public internet) and some will be behind several NATs. Some NATs are more complicated than others, which is why TURN is required. Some NATs are too complicated even for TURN, but those are rare.

The TURN server should ideally sit on a sufficient machine. I ran that XTurn server (written in Elixir) on a cheap $5 VM and got better, faster throughput than Google’s own TURN servers. However, that was a handful of connections. If you plan on supporting thousands of simultaneous TURN connections, you’ll want a nice big VM with a chunky bandwidth.

For demos, I would strongly urge you to take a look at Muaz Khan’s experiments, here. There are loads available and they’re all bare-bones, so it’s a good resource to learn from.

Regards,
Jahred

lud · July 2, 2019, 3:44pm

Unfortunately this website seems to be down. I’ll watch if it comes back.

> cheap $5 VM … However, that was a handful of connections. If you plan on supporting thousands of simultaneous TURN connections, you’ll want a nice big VM with a chunky bandwidth.

Not at all.

To me this is a toy project. It could become a production app, but only for 60 people with maybe 3 concurrent video calls at most (we will help therapists help those people with some problems, and most of them are not tech-savvy, so we want everything on the same website).

So I guess any machine would work then. I’ll start to experiment with Xturn.

Thank you !

(Edit: The website was not down. It is accessible through a VPN, but not directly through my ISP.)

LSylvester · July 2, 2019, 3:57pm

Okay, great. If you like, you could always sign up for a FREE account with my company, Xirsys. We’re always happy to help anyone get up and running with WebRTC and that would cover your signalling and TURN requirements, while you develop.

lud · July 2, 2019, 4:29pm

Nice Thanks !

Edit: the docs are great.