Websocket Shootout: Clojure, C++, Elixir, Go, NodeJS, and Ruby

Hi all,
Has anyone read this article?


It’s interesting that Phoenix’s Channels and Rails’ ActionCable were used, but for Node the “lowest level” WebSockets implementation was used rather than Sock or Socket, which are more like Phoenix (a higher-level abstraction over raw WebSockets) and would probably halve the code size.


That would probably add more overhead as well, and JavaScript was already not that fast in comparison to most. ^.^

Lies, Damn Lies and Benchmarks…


Then again, everyone using WebSockets in Node seriously will use clusters across many cores which can perform extremely well.

Not trying to start a language war though, just saying :stuck_out_tongue:

Very impressive, especially considering that Phoenix is on par with Go despite the overhead of channels vs. raw WebSockets.


It would be quite interesting to see how performance scales as concurrency is increased on the client side.


At the very least, this post needs to include the following points:

  • Phoenix Channels is a higher-level abstraction over raw WS. We spawn isolated, concurrent “channels” on the underlying WebSocket connection. We monitor these channels and clients get notified of errors. This contributes to overhead in both memory and throughput, which should be highlighted alongside how Phoenix fared in the runs
  • Phoenix channels runs on a distributed pubsub system. None of the other contestants had a distribution story, so their broadcasts are only node-local implementations, where ours is distributed out of the box

Phoenix fared quite well in these runs, considering we are comparing a robust feature set vs raw ws/pubsub implementations.


Could you please explain to me what a distributed pubsub system is?
Thanks a lot

It probably means that you can have several servers running Phoenix working seamlessly in a cluster without much hassle. If you want to add another server, you just do it, and PubSub with its Presence takes care of consistency between nodes and offers this out of the box. Neither the Go nor the Node implementation offers this.


The memory comparison is very misleading. At the very least, memory should be listed per connection. The only reason Rails’ memory utilization is that low is because it’s handling fewer than 1k connections, while other stacks get above 20-30k.
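
To make the per-connection point concrete, here is a minimal sketch of the normalization. The stack names and figures below are hypothetical placeholders chosen only to show the arithmetic; they are not numbers from the article.

```python
def memory_per_connection_kb(total_memory_mb: float, connections: int) -> float:
    """Return the average memory used per connection, in KB."""
    if connections <= 0:
        raise ValueError("connections must be positive")
    return total_memory_mb * 1024 / connections


# Hypothetical (total memory in MB, connections handled) pairs,
# NOT measurements from the benchmark.
samples = {
    "stack_a": (150.0, 1_000),
    "stack_b": (1_500.0, 25_000),
}

for name, (mem_mb, conns) in samples.items():
    print(f"{name}: {memory_per_connection_kb(mem_mb, conns):.1f} KB/connection")
```

With these made-up inputs, the stack with ten times the total memory still comes out cheaper per connection, which is the kind of reversal a raw total-memory chart can hide.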


Also, isn’t this running in development mode? They say to check the specific README for instructions on how they ran their benchmarks, which by default just has them running:
mix phoenix.server.

You have really good points on the underlying work that is being done by Phoenix concerning websockets. We are aware the comparisons are not exactly apples to apples. One of the larger points of the blog post was to shed some light on the idea that yes, you can get more performance from X, but at what cost? Development effort is a huge consideration, and Phoenix does appear to have the highest performance for the amount of work.

That being said, websocket shootout is still in its infancy and we plan to add more implementations and expand upon the explanations of each. We’re thinking this could become an ongoing benchmark topic that we update as we learn more. Perhaps we’ll place more emphasis on the performance to development effort ratio in the future.


Phoenix PubSub is set up to be distributed across multiple connected nodes out of the box using a pg2 group.

The server process on each node broadcasts to the subscribers on that node and sends a message to the server processes on other nodes telling them to forward the broadcast to their local processes. All of this is set up out of the box just by connecting multiple nodes under a single OTP application.
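
The forwarding pattern described above can be sketched as a toy, single-process simulation. This is not Phoenix's implementation: the `Node` class and its `connect`, `subscribe`, `local_broadcast` and `broadcast` methods are hypothetical names invented for illustration, and real nodes would exchange messages over the network rather than call each other directly.

```python
from collections import defaultdict


class Node:
    """Toy stand-in for one server node (hypothetical, not Phoenix code)."""

    def __init__(self, name):
        self.name = name
        self.peers = []                       # other connected nodes
        self.subscribers = defaultdict(list)  # topic -> local callbacks

    def connect(self, other):
        """Link two nodes in both directions."""
        if other not in self.peers:
            self.peers.append(other)
            other.peers.append(self)

    def subscribe(self, topic, callback):
        """Register a subscriber that lives on this node."""
        self.subscribers[topic].append(callback)

    def local_broadcast(self, topic, message):
        """Deliver only to subscribers on this node."""
        for callback in self.subscribers[topic]:
            callback(message)

    def broadcast(self, topic, message):
        """Deliver locally, then ask every peer to deliver to its own subscribers."""
        self.local_broadcast(topic, message)
        for peer in self.peers:
            peer.local_broadcast(topic, message)


a, b = Node("a"), Node("b")
a.connect(b)

received = []
a.subscribe("room:lobby", lambda m: received.append(("a", m)))
b.subscribe("room:lobby", lambda m: received.append(("b", m)))

a.broadcast("room:lobby", "hello")
print(received)  # [('a', 'hello'), ('b', 'hello')]
```

The point of the sketch is that the sender forwards one message per peer node, and each node fans out to its own subscribers, so the subscriber list never has to be shared across nodes.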

Yeah, I finally got around to posting my comments. Will be interesting to see if I get (uninformed) push back from node fans…

OK, first off, very nice article; it’s even-handed and covers the technologies equally to the depth which you decided to go. Now, a few comments:

  1. As someone else pointed out, the comparison of code size between Elixir & Node is confusing. I suspect that it’s because you compare “The entire relevant channel code…” with “The overall application…”, and if that’s the case, it would be good to clarify that. If, for example, there’s support code in Phoenix that is generated that makes the app larger, but that you didn’t discuss because you felt that minimal boilerplate didn’t affect the complexity experienced by the dev, take a sentence or two and state that.
  2. It would be good to explicitly point out that the low memory usage of Ruby & Node solutions is likely largely due to the low number of connections they can handle. It is clearly implied if one thinks about it, but it is best to be explicit. Also, in a future update it might be good to run each server at 1/2 the connections it can handle, take that memory consumption, and derive rough estimates of overhead to run the server + memory per connection.
  3. I knew that it would be no time before node.js groupies showed up to claim that you could get similar performance “just” by running multiple instances. This is a bogus claim for two reasons. First, running multiple instances is a standard solution for web apps which are either stateless or store their state in a cookie or db; for this kind of app you would have to add interprocess communication between node instances to keep the set of channels in sync and to share messages, which would hugely complicate the code so that it would no longer have any simplicity advantage. Second, the overhead of interprocess communication for every channel operation would greatly impact performance (and would, in fact, be O(n^2) in the number of node instances). It would be good to point this out in the article, heading off the mistaken belief by many readers that there is an easy solution to node’s (or Ruby’s) poor performance.
  4. The Phoenix solution scales across multiple servers, with no additional code. Similar argument as for multiple node.js instances; communications overhead probably means this is only useful for a small number of servers. (Note: O(n^2) in the number of servers is a heck of a lot better than O(n^2) in the number of cores.) But still, scaling across multiple servers with no additional code!
  5. The elephant in the room: the article really needs to look at a Java solution. I know Java’s not “cool” (and I don’t use it either), but it has a huge (albeit quiet) presence in the world of truly web-scale applications.
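
On the O(n^2) remark in the list above, one possible reading is that every instance needs a link to every other instance. Under that assumption (a full mesh, which is my illustration rather than anything measured in the article), the number of directed links grows as n(n-1). The function name `full_mesh_links` is hypothetical.

```python
def full_mesh_links(n: int) -> int:
    """Directed links needed when each of n instances talks to every other one."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1)


for n in (2, 4, 8, 16):
    print(f"{n} instances -> {full_mesh_links(n)} directed links")
```

Doubling the instance count roughly quadruples the links, which is why the same growth rate hurts more when n counts cores than when it counts servers.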