Realtime collaboration

tmbb · October 31, 2017, 12:09am

I’ve been thinking of realtime collaboration on Phoenix. The most obvious case is text editing (both rich text and plain text) but there is a lot of potential for other datatypes, like JSON documents.

ShareDB is a JavaScript framework running on Node.js that allows for real-time collaboration in editing JSON documents, with facilities for text editing as a special case. It uses Operational Transformations (OT) with a client-server architecture (not peer-to-peer, which doesn’t really work very well in real life). I think it would be cool to have this in Phoenix.

Currently, there are Elixir libraries that handle OT for plaintext and rich text, although none of those libraries ship with a production-quality server or client. The library for plain text OT does ship with a server, but not with a client, and the implementation of the server is not complete.

There isn’t yet an implementation of OT for JSON documents in Elixir, but the implementation used by ShareDB seems easy to port (the transformation function is quite stupid and inefficient, but it seems to get the job done).

Instead of OT, some people advocate CRDTs, but they have some important disadvantages in practice. They have a much higher memory overhead (and sometimes a higher bandwidth overhead) and they don’t resolve conflicts. OT with a central server has low overhead and gives us a canonical document (the one that lives on the server) at each moment in time.
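To make the OT idea concrete, here is a minimal sketch of a transformation function for plain-text inserts, written in Python for brevity rather than Elixir. The names `Insert`, `apply_op` and `transform` are hypothetical and are not taken from ShareDB or from any Elixir library, and a real OT type would also have to handle deletes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical operation: insert `text` at character offset `pos`."""
    pos: int
    text: str


def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert, op_has_priority: bool) -> Insert:
    """Rewrite `op` so it can be applied after `other` has already been applied.

    If `other` inserted text before `op`'s position, `op` shifts right.
    When both target the same position, the priority flag breaks the tie.
    """
    if other.pos < op.pos or (other.pos == op.pos and not op_has_priority):
        return Insert(op.pos + len(other.text), op.text)
    return op


base = "hello world"
a = Insert(0, ">> ")  # edit made by one client
b = Insert(5, ",")    # concurrent edit made by another client

# Order 1: a is applied first, then b transformed against a.
order_1 = apply_op(apply_op(base, a), transform(b, a, op_has_priority=False))
# Order 2: b is applied first, then a transformed against b.
order_2 = apply_op(apply_op(base, b), transform(a, b, op_has_priority=True))

# Both orders converge on the same document.
print(order_1 == order_2, order_1)
```

With a central server, the server's ordering decides which operation gets priority, which is what gives the canonical document mentioned above.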

This leaves us with the tasks of:

- Writing a client using something like Phoenix channels.
- Implementing the network protocols. It’s possible to optimize them a lot when compared to what ShareDB does. ShareDB uses JSON, which for small events like keypresses wastes bandwidth like crazy. Some easy savings could be achieved by using MsgPack instead of JSON, but I suspect we can do better with even more compact transmission formats.
- Porting OT for JSON documents from ShareDB. This is important because JSON can describe a lot of types of variable-length documents.
- Writing a generic OT server onto which one can plug the different OT types. This part is independent of Phoenix, even if it ends up using Channels to communicate with the outside world.
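As a rough illustration of the bandwidth point, the sketch below encodes one keypress-sized event both as JSON and as a fixed binary layout. The message fields and the binary layout are made up for the example (they are not ShareDB's wire format), and Python's standard `struct` module stands in for MsgPack or any other compact format.

```python
import json
import struct

# Hypothetical single-character insert event; the field names are assumptions.
op = {"doc": 42, "version": 1337, "pos": 512, "char": "a"}

as_json = json.dumps(op, separators=(",", ":")).encode("utf-8")

# Hypothetical fixed layout in network byte order:
# u32 document id, u32 version, u32 position, 1 byte for an ASCII character.
LAYOUT = "!IIIc"


def encode(event: dict) -> bytes:
    return struct.pack(
        LAYOUT,
        event["doc"],
        event["version"],
        event["pos"],
        event["char"].encode("ascii"),
    )


def decode(payload: bytes) -> dict:
    doc, version, pos, char = struct.unpack(LAYOUT, payload)
    return {"doc": doc, "version": version, "pos": pos, "char": char.decode("ascii")}


as_binary = encode(op)
print(len(as_json), "bytes as JSON,", len(as_binary), "bytes in the binary layout")
```

The fixed layout only works for ASCII characters and 32-bit counters, so a real protocol would need variable-length fields, but it shows how much of a small JSON message is keys and punctuation.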

What would this be useful for? Well, first for “normal” collaborative editing. Think of a document that can be edited by more than one person. Or even a JSON tree of documents that can be edited by more than one person.

But there are other advantages. Think of something like Drab. Drab.Live makes it possible to sync client and server. But it can get you into trouble if you edit the state on the client and server concurrently. You get the semantics not of collaborative editing but of Last Write Wins, which is not desirable. If you have OT, you can edit the state on the client and server concurrently (the fact that other users can edit the state too is just a nice bonus). This could pave the way for real isomorphic apps in which the state is shared by the client and server and can be operated on by both of them.

Elixir makes it easy to deal with the kinds of servers required here (just spin up a GenServer per document and listen to operations, possibly persisting the operations somewhere, like an ETS table). The main obstacle here is the fact that a lot of code still needs to be written. I wonder if users here would like to collaborate on this.
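The per-document server could look roughly like the sketch below. It is written in Python as a stand-in for a GenServer, handles only inserts, and keeps the operation log in memory instead of ETS. `DocumentServer`, `submit` and the tie-breaking rule are assumptions made for the example, not an existing API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Insert:
    """Hypothetical operation: insert `text` at character offset `pos`."""
    pos: int
    text: str


def transform(op: Insert, applied: Insert) -> Insert:
    """Shift `op` past an insert the server already applied (ties go to the earlier op)."""
    if applied.pos <= op.pos:
        return Insert(op.pos + len(applied.text), op.text)
    return op


@dataclass
class DocumentServer:
    """State for one document; in Elixir this would live in one process per document."""
    text: str = ""
    log: list = field(default_factory=list)  # log[i] took the document from version i to i + 1

    @property
    def version(self) -> int:
        return len(self.log)

    def submit(self, op: Insert, base_version: int):
        """Apply an op that the client created against `base_version`."""
        if not 0 <= base_version <= self.version:
            raise ValueError("unknown base version")
        # Transform against every op the client had not seen yet.
        for applied in self.log[base_version:]:
            op = transform(op, applied)
        self.text = self.text[:op.pos] + op.text + self.text[op.pos:]
        self.log.append(op)
        return op, self.version  # what would be broadcast to the other clients


server = DocumentServer(text="hello world")
server.submit(Insert(0, ">> "), base_version=0)
# This client had not yet seen the first edit, so its op is still based on version 0.
sent, new_version = server.submit(Insert(5, ","), base_version=0)
print(server.text, sent, new_version)
```

The server's copy of `text` is the canonical document, and the log is what a reconnecting client would replay to catch up.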

houshuang · November 4, 2017, 2:46pm

Hi,

we’re building a collaborative learning platform in Meteor/React (FROG), where we’re using ShareDB heavily - both ot-json as the backing store for collaborative activities (activities are pluggable, and they get access to a shared document, which they can structure as they want - this gives us a lot of flexibility), and ot-text for collaborative text editing. We are having some problems scaling up (both because of Meteor and possibly ShareDB), and I dream of having an “OT as a service” thing, where I could just run an Elixir server or cluster, which would be compatible with the front-end ShareDB client (or something similar). All the JS code could be kept - and it would be super-fast, scalable etc.

I did try a few years ago to rewrite the logic of ot-text in Elixir; it was an interesting exercise… It’s pretty functional code (all functions take an input and produce an output without side-effects), but it uses closures a lot, which I changed to recursive functions. I even set up a way to run the (very extensive) JS test harness for ot-text against my library - and it got pretty far. I never fixed all the failing tests, and I kind of abandoned it, because I didn’t have the push to move it forwards (I was in the middle of running experiments for my PhD thesis, and didn’t actually need it). My very old abandoned attempt.

I still think OT is amazingly powerful, and it seems tailor-made for Elixir… I would love to see someone try again to make a production-ready OT server in Elixir - having a compatibility mode with ShareDB would let a lot of products switch easily, and would work well with existing extensions like Rich text etc, but investigating a more efficient transmission format would be interesting too…

Note that the ShareDB people have been working on a new version of ot-json for years, which is apparently almost ready.

If you pursue this, please let me know, as I’d be very interested to follow along, and maybe use it. We’re also working on collaborative writing analytics (predict collaboration quality etc), if anyone is interested.

best
Stian Håklev
Ecole Polytechnique Fédérale de Lausanne

houshuang · November 4, 2017, 2:48pm

More about how we use ShareDB here https://groups.google.com/d/msg/sharejs/N7QBY-qI2O4/EhFPxF2GAgAJ

tmbb · November 4, 2017, 4:33pm

I don’t think it’s possible to do better than what they already have. They are talking about adding conflict markers, which destroys the ergonomics completely.

tmbb · November 5, 2017, 1:04pm

I haven’t read your account of using ShareDB in production yet. I need to dedicate a solid block of time to that because it’s packed with useful information.

tmbb · November 5, 2017, 10:35pm

It seems like you’re having lots of problems with the transformation functions. I wasn’t expecting them to be a bottleneck. Did you investigate CRDTs (especially delta-CRDTs)? You’d have (probably much) higher memory requirements, but “transformations” would be quicker. You’d have problems rendering the underlying CRDT into JSON, but that work can be offloaded to the clients (the server might not even need to render anything).

houshuang · November 6, 2017, 4:18am

tmbb: Sorry, what is the context of your comment? Are you talking about our
issues with scaling ShareDB, or with my initial attempt at translating
ot-text into Elixir? If it’s the first one, I think a bigger concern is the
number of websockets that can be connected to a ShareDB instance at a time,
not so much what they are doing (we haven’t really found this to be a
bottleneck at least). The one problem we had was efficiently creating
hundreds of different documents at a time, but this is a single create
operation, it’s more because of the asynchronous nature of creating a
document from a ShareDB client. If ShareDB had a command (or we built an
extension into the client), which could take as an argument a dict of
objectid: content, and batch-create all the new documents, it would be
super-fast (and already it’s decent for us).

I have been reading a bit about CRDTs etc - they seem more appropriate for
peer2peer stuff, and not so necessary where you will anyway have a central
server? But I'm very interested in all developments in this field. For CRDTs, I
generally see research papers and prototypes, but not many production-ready
libraries, etc.

OvermindDL1 · November 6, 2017, 5:46pm

It sounds awesome, though I’m short on time currently (I’m much more free during holidays ^.^).

tmbb · November 6, 2017, 5:53pm

The first.

[quote=“houshuang, post:7, topic:9736”]
If it’s the first one, I think a bigger concern is the
number of websockets that can be connected to a ShareDB instance at a time,
not so much what they are doing (we haven’t really found this to be a
bottleneck at least).
[/quote]

That’s better. I believe I’ve seen some benchmarks that show that Cowboy and Phoenix are probably better than Node at holding many connections.

To me it’s not a question of being “necessary”, it’s just that they’re much easier to understand usually xD Performance characteristics in a client-server architecture probably lag behind OT in all metrics (the memory and network overhead of all CRDTs I’ve seen for text is quite high).

My main problem with CRDTs is that it’s hard to implement an “intermediate” level of structure. Low structure, like a freeform graph? Nice, CRDTs got you covered. Text editing (editing a chain of characters), again, easy. Editing trees (and enforcing the fact that it remains a tree)? Directed acyclic graphs? You’re out of luck…

rawkode · December 20, 2017, 7:28pm

@tmbb Did you start anything with this? I’m looking at ShareDB and was trying to work out if this is something I could do in reasonable time with Elixir

tmbb · December 23, 2017, 2:29pm

I’ve started something but didn’t really go very far… I don’t have my laptop with me for Christmas, otherwise I’d send you what I had…

But it’s not much anyway. There are some Elixir packages that implement parts of the solution.

tmbb · February 19, 2019, 7:41pm

I’ve just been reading the ShareDB binary wire protocol, and it seems like a poor match for Phoenix. Phoenix channels seem to work pretty well if there is a one-to-one map between channels and documents. That is, a single channel should handle a document. That way, subscribing to a given document is trivial: just “join” the channel in question. Keeping binary compatibility with ShareDB would require defining my own incompatible channel abstraction on top of Phoenix’s PubSub, which doesn’t seem very easy.

I think I’ll play a little with having one channel per document and see where it goes.
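The one-channel-per-document idea boils down to a topic-based publish/subscribe, roughly as in this Python sketch. The `PubSub` class, its `join` and `broadcast` methods and the `document:42` topic naming are hypothetical and only mimic the shape of the idea; this is not Phoenix's API.

```python
from collections import defaultdict


class PubSub:
    """Toy in-memory pub/sub with one topic per document."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def join(self, topic, callback):
        """Subscribing to a document is just joining its topic."""
        self._subscribers[topic].append(callback)

    def broadcast(self, topic, message):
        """Deliver a message to everyone who joined this document's topic."""
        for callback in self._subscribers.get(topic, ()):
            callback(message)


pubsub = PubSub()
alice_inbox, bob_inbox, carol_inbox = [], [], []

pubsub.join("document:42", alice_inbox.append)
pubsub.join("document:42", bob_inbox.append)
pubsub.join("document:7", carol_inbox.append)

# An operation on document 42 reaches only the clients that joined document 42.
pubsub.broadcast("document:42", {"version": 1, "op": "insert 'a' at 0"})
print(alice_inbox, bob_inbox, carol_inbox)
```

Because the topic name already identifies the document, no per-message routing logic is needed on the server.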

rodrigues · February 19, 2019, 9:45pm

In this area of collaboration, I’m looking forward to seeing things like Swarm and RON get traction. They look like they can address some of the overheads of CRDTs, and be more offline-sync friendly than OT solutions.

Something like this for Draft.js would be amazing.

houshuang · February 20, 2019, 4:47am

You could do this translation on the client - the ShareDB client library doesn’t require access to a raw websocket, but to a websocket-like object, which only needs to implement a few minimal methods. This would still require a tiny bit of overhead, but now it would be distributed among all the clients, which is not where the bottleneck is.

I still think you’d get a ton of leverage if you built on ShareDB. There are a lot of “devil is in the details” problems which they have solved over the years, lots of libraries with built-in support etc. Here’s a recent tech talk I gave going over how we use ShareDB and React in detail: https://www.youtube.com/watch?v=gN37rJRmISQ

And here is a demo of the kinds of things we’re currently capable of: https://www.youtube.com/watch?v=zuUG9a5tiVM

This is all open source (and I’m happy to help if you’d like to reuse any of our components). I’m currently working on building a wiki/knowledge base on top of these components, with live editing, rich components etc. I’d love to be able to put this up on a public server and offer everyone free accounts, and that would be so much easier with a reliable easy-to-scale backend like Elixir.

(In fact, I’d be happy to write the client code required to get ShareDB to interop with native Elixir websockets, if we had the server-side code to actually do the operational transforms etc).

tmbb · February 21, 2019, 12:53am

I know, that seems like the easy part. As usual, you trade complexity in implementing the client in exchange for complexity when implementing the server. If you have a phoenix channel per document, you can simply broadcast! your operations to all clients that subscribe to that document. If you share a channel between multiple documents things get much harder.

I don’t know how to route the raw binary ShareDB messages into their own channels, unless I define a websocket-like object that parses the messages and translates them into channel subscriptions and events. That can be done, of course, but it seems very wasteful.
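Such a translation layer might look like the sketch below, in Python for brevity. It assumes each incoming message is a JSON object with an action field `a`, a collection `c` and a document id `d`, plus `v` and `op` for operations; treat these field names, the action codes and the `document:<collection>:<id>` topic format as assumptions for the example, not as a description of the actual ShareDB protocol.

```python
import json


def route(raw: str):
    """Translate one incoming message into a (channel_event, topic, payload) triple."""
    msg = json.loads(raw)
    topic = f"document:{msg['c']}:{msg['d']}"
    action = msg["a"]
    if action == "s":   # assumed code for "subscribe"
        return ("join", topic, None)
    if action == "u":   # assumed code for "unsubscribe"
        return ("leave", topic, None)
    if action == "op":  # assumed code for "submit operation"
        return ("broadcast", topic, {"version": msg["v"], "op": msg["op"]})
    raise ValueError(f"unsupported action: {action!r}")


subscribe = route('{"a": "s", "c": "notes", "d": "42"}')
operation = route('{"a": "op", "c": "notes", "d": "42", "v": 3, "op": [{"p": [0], "si": "a"}]}')
print(subscribe)
print(operation)
```

The cost per message is one parse plus a string format, and it could run either in the client's websocket-like object or in a server-side transport.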

houshuang · February 21, 2019, 8:18am

This is what I was suggesting. I don’t actually think it’s very expensive, and given that it’s running on the client, it doesn’t matter for scaling. My concern for scaling is having a single beefy server which could serve thousands of users, in this situation a tiny bit of overhead distributed in each browser makes no difference. And compared to all the overhead of sending and receiving websockets etc, I am confident this wouldn’t even be noticeable, and not a lot of code.

The other question is whether the workflow you’re suggesting of broadcasting all changes works with ShareDB… I think it might - a client submits some changes, the server applies those changes and broadcasts, the original client updates its own version, rebases any pending changes onto the new changes (which could include also updates from other clients), other clients also update their local version, submit their changes etc… Could work, but I would have to look more at exactly how the ShareDB protocol works first.

tmbb · February 25, 2019, 9:03pm

I’ve been reading a little more, and it looks like Phoenix supports custom transports in channels. That means I can write a new transport that receives websocket messages just like the default ShareDB client sends them and dispatches them into the Phoenix PubSub adapters. That might allow me to efficiently dispatch the messages to the correct clients.

houshuang · February 25, 2019, 9:21pm

Let me know if you ever want to Skype about this or if I can help in any way. My Elixir is quite rusty, but I’m quite familiar with ShareDB and the client-side stuff as well as JS in general.

tangui · July 25, 2019, 10:48am

How about XML patches for trees? I don’t know much about collaborative editing though. There are a few research papers on that when searching for “collaborative editing xml”, but I’m not sure what the pros and cons of this approach are.

Implementing it with Elixir could make use of:

- An XML patch of the document changes on the client (JS lib), which is then sent to an API
- Xqerl on the server to apply the patch, including detecting (and resolving?) conflicts
- LiveView to render back the document (be it a text document, an SVG whiteboard…)

tmbb · July 25, 2019, 12:43pm

You can think of XML as a strict subset of JSON. That means that whatever works for collaborative editing of JSON documents will also work for collaborative editing of XML documents.