Realtime collaboration



I’ve been thinking about realtime collaboration on Phoenix. The most obvious case is text editing (both rich text and plain text), but there is a lot of potential for other datatypes, like JSON documents.

ShareDB is a JavaScript framework running on Node.js that allows for real-time collaboration in editing JSON documents, with facilities for text editing as a special case. It uses Operational Transformations (OT) with a client-server architecture (not peer-to-peer, which doesn’t really work very well in real life). I think it would be cool to have this in Phoenix.
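For anyone who hasn't seen OT before, here is a minimal sketch of the core idea for plain text, in Python just to keep it short. It only handles inserts, and the `Insert` type, the `site` tie-breaker and the function names are made up for illustration; they are not taken from ShareDB or from any Elixir library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text goes
    text: str   # text to insert
    site: int   # hypothetical client id, used only to break ties


def apply(doc: str, op: Insert) -> str:
    """Apply an insert to a document and return the new document."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` was already applied."""
    before = against.pos < op.pos
    tie_lost = against.pos == op.pos and against.site < op.site
    if before or tie_lost:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


doc = "helo world"
a = Insert(3, "l", site=1)    # client A fixes the typo
b = Insert(10, "!", site=2)   # client B appends "!" concurrently

# One side sees a first, the other sees b first; both transform the late op.
via_a = apply(apply(doc, a), transform(b, a))
via_b = apply(apply(doc, b), transform(a, b))
print(via_a, "|", via_b)
```

Both orders end up with the same text, which is the property the server relies on. A real text type also needs deletes (and retains or attributes for rich text), which is where most of the complexity lives.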

Currently, there are Elixir libraries that handle OT for plaintext and rich text, although none of those libraries ship with a production-quality server or client. The library for plain text OT does ship with a server, but not with a client, and the implementation of the server is not complete.

There isn’t yet an implementation of OT for JSON documents in Elixir, but the implementation used by ShareDB seems easy to port (the transformation function is quite stupid and inefficient, but it seems to get the job done).
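To give a feeling for what porting a JSON OT type involves, here is a rough Python sketch of a single operation kind: inserting into a list addressed by a path. The `{"p": path, "li": value}` shape is loosely modeled on the operations ShareDB's JSON type uses, but the functions below (and the `op_wins_ties` flag) are my own simplification and not a port of their code.

```python
import copy


def apply_list_insert(doc, op):
    """Return a copy of `doc` with op["li"] inserted at the path op["p"]."""
    doc = copy.deepcopy(doc)
    *parents, index = op["p"]
    target = doc
    for key in parents:
        target = target[key]
    target.insert(index, op["li"])
    return doc


def transform_list_insert(op, against, op_wins_ties):
    """Adjust op's path so it is still valid after `against` was applied."""
    path = list(op["p"])
    *ag_parents, ag_index = against["p"]
    depth = len(ag_parents)
    if len(path) <= depth or path[:depth] != ag_parents:
        return op  # `against` changed a list that op's path does not go through
    same_list = len(path) == depth + 1
    keeps_place = same_list and op_wins_ties
    if ag_index < path[depth] or (ag_index == path[depth] and not keeps_place):
        path[depth] += 1  # an element was inserted in front of op's target
    return {**op, "p": path}


doc = {"todos": ["a", "c"]}
x = {"p": ["todos", 1], "li": "b"}
y = {"p": ["todos", 1], "li": "B"}

# x is ordered first, so y loses the tie and shifts one slot to the right.
left = apply_list_insert(apply_list_insert(doc, x),
                         transform_list_insert(y, x, op_wins_ties=False))
right = apply_list_insert(apply_list_insert(doc, y),
                          transform_list_insert(x, y, op_wins_ties=True))
print(left, right)
```

A full type needs the same treatment for list deletes, object inserts and deletes, moves and so on, and every pair of operation kinds needs its own transformation rule. That is why the pairwise approach gets tedious rather than conceptually hard.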

Instead of OT, some people advocate CRDTs, but they have some important disadvantages in practice. They have a much higher memory overhead (and sometimes a higher bandwidth overhead) and they don’t resolve conflicts. OT with a central server has low overhead and gives us a canonical document (the one that lives on the server) at each moment in time.

This leaves us with the tasks of:

  1. Writing a client using something like Phoenix channels

  2. Implementing the network protocols. It’s possible to optimize them a lot when compared to what ShareDB does. ShareDB uses JSON, which for small events like keypresses wastes bandwidth like crazy. Some easy savings could be achieved by using MsgPack instead of JSON, but I suspect we can do better with even more compact transmission formats.

  3. Porting OT for JSON documents from ShareDB. This is important because JSON can describe a lot of types of variable length documents.

  4. Writing a generic OT server onto which one can plug the different OT types. This part is independent from Phoenix, even if it ends up using Channels to communicate with the outside world.
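To make the bandwidth point in item 2 a bit more concrete, here is a small Python comparison between a JSON message and a fixed binary header for a single-keypress insert. Both layouts are hypothetical: the JSON field names and the `opcode, version, position` header are assumptions made for this sketch, not ShareDB's actual wire format or a proposed protocol.

```python
import json
import struct

OP_INSERT = 1
# Hypothetical header: 1-byte opcode, 4-byte version, 4-byte position,
# in network byte order (no padding), followed by the UTF-8 text.
HEADER = struct.Struct("!BII")


def encode_json(version, pos, text):
    """Encode an insert as compact JSON (field names are illustrative)."""
    message = {"op": "ins", "v": version, "pos": pos, "text": text}
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


def encode_binary(version, pos, text):
    """Encode the same insert as header plus raw UTF-8 text."""
    return HEADER.pack(OP_INSERT, version, pos) + text.encode("utf-8")


def decode_binary(payload):
    """Inverse of encode_binary."""
    opcode, version, pos = HEADER.unpack_from(payload)
    return opcode, version, pos, payload[HEADER.size:].decode("utf-8")


as_json = encode_json(1042, 57, "a")
as_binary = encode_binary(1042, 57, "a")
print(len(as_json), "bytes as JSON vs", len(as_binary), "bytes as binary")
```

Variable-length integers would shrink the header further for small versions and positions, at the cost of a slightly more involved decoder.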

What would this be useful for? Well, first for “normal” collaborative editing. Think of a document that can be edited by more than one person. Or even a JSON tree of documents that can be edited by more than one person.

But there are other advantages. Think of something like Drab. Drab.Live makes it possible to sync client and server. But it can get you into trouble if you edit the state on the client and server concurrently. You get the semantics not of collaborative editing but of Last Write Wins, which is not desirable. If you have OT, you can edit the state on the client and server concurrently (the fact that other users can edit the state too is just a nice bonus). This could pave the way for real isomorphic apps in which the state is shared by the client and server and can be operated on by both of them.

Elixir makes it easy to deal with the kinds of servers required here (just spin up a GenServer per document and listen to operations, possibly persisting the operations somewhere, like an ETS table). The main obstacle here is the fact that a lot of code still needs to be written. I wonder if users here would like to collaborate on this.
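Roughly, the per-document server only has to keep the document, a log of committed operations, and a pluggable OT type. Here is a sketch of that loop in Python; in Elixir the same state would presumably live in the GenServer and `submit` would be a call handler. All the names (`DocumentServer`, `InsertOnlyText`, `submit`) are invented for the example, and the toy type only supports inserts.

```python
class InsertOnlyText:
    """Toy OT type: an operation is a (position, text) insert."""

    @staticmethod
    def apply(doc, op):
        pos, text = op
        return doc[:pos] + text + doc[pos:]

    @staticmethod
    def transform(op, committed):
        """Shift `op` right if `committed` inserted at or before its position."""
        pos, text = op
        c_pos, c_text = committed
        return (pos + len(c_text), text) if c_pos <= pos else op


class DocumentServer:
    """Holds the canonical document and orders incoming operations."""

    def __init__(self, ot_type, initial):
        self.ot_type = ot_type
        self.doc = initial
        self.log = []  # committed ops; log[i] takes version i to i + 1

    @property
    def version(self):
        return len(self.log)

    def submit(self, op, base_version):
        """Commit `op`, which the client created against `base_version`."""
        if not 0 <= base_version <= self.version:
            raise ValueError("unknown base version")
        # Catch the op up with everything committed since the client's version.
        for committed in self.log[base_version:]:
            op = self.ot_type.transform(op, committed)
        self.doc = self.ot_type.apply(self.doc, op)
        self.log.append(op)
        return self.version, op  # what would be broadcast to other clients


server = DocumentServer(InsertOnlyText, "helo world")
first = server.submit((3, "l"), base_version=0)
second = server.submit((10, "!"), base_version=0)  # concurrent with the first
print(server.doc, first, second)
```

The log is the part that would be persisted (an ETS table, a database, etc.), and since the server never looks inside the operations, swapping the text type for a JSON type should not require touching it.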



We’re building a collaborative learning platform in Meteor/React (FROG), where we’re using ShareDB heavily - both ot-json as the backing store for collaborative activities (activities are pluggable, and they get access to a shared document, which they can structure as they want - this gives us a lot of flexibility), and ot-text for collaborative text editing. We are having some problems scaling up (both because of Meteor and possibly ShareDB), and I dream of having an “OT as a service” thing, where I could just run an Elixir server or cluster, which would be compatible with the front-end ShareDB client (or something similar). All the JS code could be kept - and it would be super-fast, scalable etc.

I did try a few years ago to rewrite the logic in ot-text in Elixir; it was an interesting exercise… It’s pretty functional code (all functions which take an input and produce an output without side-effects), but they use closures a lot, which I changed to recursive functions. I even set up a way to run the (very extensive) JS test harness for ot-text against my library - and it got pretty far. I never fixed all the failing tests, and I kind of abandoned it, because I didn’t have the push to move it forwards (I was in the middle of running experiments for my PhD thesis, and didn’t actually need it). My very old abandoned attempt.

I still think OT is amazingly powerful, and it seems tailor-made for Elixir… I would love to see someone try again to make a production-ready OT server in Elixir - having a compatibility mode with ShareDB would let a lot of products switch easily, and would work well with existing extensions like Rich text etc, but investigating a more efficient transmission format would be interesting too…

Note that the ShareDB people have been working on a new version of ot-json for years, which is apparently almost ready.

If you pursue this, please let me know, as I’d be very interested to follow along, and maybe use it. We’re also working on collaborative writing analytics (predict collaboration quality etc), if anyone is interested.

Stian Håklev
Ecole Polytechnique Fédérale de Lausanne


More about how we use ShareDB here


I don’t think it’s possible to do better than what they already have. They are talking about adding conflict markers, which destroys the ergonomics completely.


I haven’t read your account of using ShareDB in production yet. I need to dedicate a solid block of time to that because it’s packed with useful information.


It seems like you’re having lots of problems with the transformation functions. I wasn’t expecting them to be a bottleneck. Did you investigate CRDTs (especially delta-CRDTs)? You’d have (probably much) higher memory requirements, but “transformations” would be quicker. You’d have problems rendering the underlying CRDT into JSON, but that work can be offloaded to the clients (the server might not even need to render anything).


tmbb: Sorry, what is the context of your comment? Are you talking about our
issues with scaling ShareDB, or with my initial attempt at translating
ot-text into Elixir? If it’s the first one, I think a bigger concern is the
number of websockets that can be connected to a ShareDB instance at a time,
not so much what they are doing (we haven’t really found this to be a
bottleneck at least). The one problem we had was efficiently creating
hundreds of different documents at a time, but this is a single create
operation, it’s more because of the asynchronous nature of creating a
document from a ShareDB client. If ShareDB had a command (or we built an
extension into the client), which could take as an argument a dict of
objectid: content, and batch-create all the new documents, it would be
super-fast (and already it’s decent for us).

I have been reading a bit about CRDTs etc - they seem more appropriate for
peer-to-peer stuff, and not so necessary where you will have a central
server anyway? But I’m very interested in all developments in this field. For
CRDTs, I generally see research papers and prototypes, but not many
production-ready libraries, etc.


It sounds awesome, though I’m short on time currently (I’m much more free during holidays ^.^).


The first.

[quote=“houshuang, post:7, topic:9736”]
If it’s the first one, I think a bigger concern is the
number of websockets that can be connected to a ShareDB instance at a time,
not so much what they are doing (we haven’t really found this to be a
bottleneck at least).
[/quote]

That’s better. I believe I’ve seen some benchmarks that show that Cowboy and Phoenix are probably better than Node at holding many connections.

To me it’s not a question of being “necessary”, it’s just that they’re much easier to understand usually xD Performance characteristics in a client-server architecture probably lag behind OT in all metrics (the memory and network overhead of all CRDTs I’ve seen for text is quite high).

My main problem with CRDTs is that it’s hard to implement an “intermediate” level of structure. Low structure, like a freeform graph? Nice, CRDTs have got you covered. Text editing (editing a chain of characters)? Again, easy. Editing trees (and enforcing the fact that it remains a tree)? DAGs? You’re out of luck…


@tmbb Did you start anything with this? I’m looking at ShareDB and was trying to work out if this is something I could do in reasonable time with Elixir


I’ve started something but didn’t really get very far… I don’t have my laptop with me for Christmas, otherwise I’d send you what I had…

But it’s not much anyway. There are some Elixir packages that implement parts of the solution.