Correct tool for implementing shared state in a Kubernetes cluster

I have a LiveView that allows editing a form in my application, and I want the form state shared across all editors of that form. Currently, in local development, I have this working by creating a GenServer per form and maintaining the form's state there. However, as soon as I deploy this to production, the application runs on 3 different boxes within Kubernetes.

Is Swarm or Horde the correct tool for getting my GenServer to work across the distributed cluster, or is there a better way to accomplish this?

You should probably check out Phoenix PubSub: https://github.com/phoenixframework/phoenix_pubsub

I think LiveView can use it natively, and it’s using fancy CRDTs under the hood.

Also, if you’re using k8s already, check out libcluster and its Kubernetes strategies (https://github.com/bitwalker/libcluster) for automatic node clustering.
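As a rough sketch, a libcluster topology using its Kubernetes DNS strategy could look like the following. The service name, app name, and module names are placeholders, not from this thread; pick the strategy that matches your cluster setup.

# config/runtime.exs (placeholder names; adjust to your deployment)
import Config

config :libcluster,
  topologies: [
    k8s: [
      strategy: Cluster.Strategy.Kubernetes.DNS,
      config: [
        service: "myapp-headless",   # headless Service exposing the pods
        application_name: "myapp"    # node names become myapp@<pod-ip>
      ]
    ]
  ]

# lib/my_app/application.ex: start Cluster.Supervisor with the configured topologies
def start(_type, _args) do
  topologies = Application.get_env(:libcluster, :topologies, [])

  children = [
    {Cluster.Supervisor, [topologies, [name: MyApp.ClusterSupervisor]]}
    # ...the rest of your supervision tree, including Phoenix.PubSub and the endpoint
  ]

  Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
end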

LiveView does use PubSub, although you will either need to network your nodes together or use the Redis implementation if you’re not set up with that already. PubSub itself does not use any fancy CRDTs, and you could actually run into problems with users editing the same form at the same time.

If you don’t need to solve the multi-editor issues with a form, then PubSub will work great: just broadcast changes, have the LiveView subscribed to those, and then update the relevant state. If you need to support multi-editing, then you’d use some form of CRDT or coordinated locking across the cluster. PubSub could still be used here, as the need for a CRDT is separate from the communication of the updates.
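As a sketch of that broadcast/subscribe flow (the module, topic scheme, event payload, and the MyApp.PubSub name are assumptions for illustration, not from the original post):

defmodule MyAppWeb.FormLive do
  use Phoenix.LiveView

  # Hypothetical topic per form; any unique string works.
  defp topic(form_id), do: "form:#{form_id}"

  def mount(%{"id" => form_id}, _session, socket) do
    if connected?(socket) do
      # Every connected editor subscribes to this form's topic.
      Phoenix.PubSub.subscribe(MyApp.PubSub, topic(form_id))
    end

    {:ok, assign(socket, form_id: form_id, fields: %{})}
  end

  def render(assigns) do
    ~H"""
    <div>
      <%= for {name, value} <- @fields do %>
        <p><%= name %>: <%= value %></p>
      <% end %>
    </div>
    """
  end

  # Assumed phx-change/phx-blur payload shape; adapt to your form markup.
  def handle_event("field_changed", %{"name" => name, "value" => value}, socket) do
    # Notify editors on every node; broadcast_from skips the sender itself.
    Phoenix.PubSub.broadcast_from(
      MyApp.PubSub,
      self(),
      topic(socket.assigns.form_id),
      {:field_changed, name, value}
    )

    {:noreply, update(socket, :fields, &Map.put(&1, name, value))}
  end

  def handle_info({:field_changed, name, value}, socket) do
    # Last write wins; there is no CRDT-style merging here.
    {:noreply, update(socket, :fields, &Map.put(&1, name, value))}
  end
end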

I hate to disagree with the author of the book on Phoenix real-time, but see the Phoenix.Tracker docs (Phoenix.PubSub v2.1.3):

Tracker shards use a heartbeat protocol and CRDT to replicate presence information across a cluster in an eventually consistent, conflict-free manner.

How are you disagreeing? Phoenix PubSub does not use CRDTs. The Tracker does, but the Tracker is built on top of PubSub; it is not part of PubSub.

Fair enough; it’s in the PubSub library, though I guess you have to implement it separately.

Thanks Ben. Yes, it is true that Tracker uses a CRDT. However, that doesn’t really help here, because the particular problem posed by @jamesblack is about communication (and possibly about a text-type CRDT). Phoenix.PubSub (the module, not the library) is excellent for that problem, though.

This is a good point. It’s confusing to say it has nothing really to do with PubSub (at least as I see it), but that is the case here. I think the confusion came from me referring to the PubSub module rather than the phoenix_pubsub library.

So, digging in a bit: I thought the presence meta would provide the ability to share the form data through a CRDT, but I don’t think it goes that deep into the presence objects.

Depending on your situation, you could potentially “hack” the presence Tracker to have a presence for each field in your form, and get the benefits of the CRDT.

You won’t get a benefit from the CRDT here. The issue is that the CRDT doesn’t merge values together. Instead, it treats the entire meta map as authoritative, and the merge happens on the list of metas in a topic. Each tracked process gets a random reference that identifies it (see phoenix_pubsub/lib/phoenix/tracker/shard.ex at v1.1.2 on GitHub).

If you try to approach the metadata as a way to do a text-based CRDT, it just won’t work. You will probably be better off with just dispatching PubSub messages in that case.

edit: Found an old thread discussing real-time collaborative text options (Realtime collaboration - #4 by tmbb)

I don’t think this will be super useful for OP, who probably just needs PubSub. (keep it simple)

As it stands, we currently just rely on the mailbox of the GenServer process to deal with multiple people editing the same data. The last change in trumps everything, which works for us. Does PubSub cross machine boundaries, or will it with some additional configuration?

Each node needs to be configured to join a cluster. If you’re using Kubernetes, I’d suggest the aforementioned libcluster library, which makes that trivial to do with K8s. After that, PubSub across machines will just work.

Does the same apply to dynamically created GenServer state? My biggest fear is that, before the message gets sent to the GenServer tracking the “source of truth”, each node will create its own copy, ending up with 1 GenServer per node. I’m pretty new to Elixir, so it wouldn’t surprise me if, once the nodes were clustered, the GenServer stuff also just worked.

A GenServer is a single process, and a single process is always on one node or another; it isn’t “spread across” a cluster. However, in your scenario, you could consider spawning a GenServer with a global name, where the name is based on the conversation ID or some other value unique to what you’re trying to synchronize. Multiple clients on multiple nodes may try to start it, but only one will win. Then you have a guaranteed single GenServer that all of the connections on all the nodes can talk to.

# GenServer
# Each HostState process registers itself in a Registry, keyed by host.
def via_tuple(host), do: {:via, Registry, {Registry.HostState, host}}

def start_link(host) do
  GenServer.start_link(__MODULE__, host, name: via_tuple(host))
end

# Dynamic Supervisor

# Starts the process on demand (start_host_state/1, not shown, starts it
# under the DynamicSupervisor), then reads its state.
def get_host_state(host) do
  unless pid_from_host(host), do: start_host_state(host)
  HostState.get_state(pid_from_host(host))
end

# Looks up the pid registered for this host, or nil if there is none.
defp pid_from_host(host) do
  host
  |> HostState.via_tuple()
  |> GenServer.whereis()
end

Would this qualify for the above? Would all 3 nodes resolve to the same pid when requesting the same host?

Right, what I was saying was that if each “presence” in the list of presence objects was a field in your form, rather than a user, you would get the benefit of the merge; the metadata could be the value of the field.

Like I said, it’s a HACK 🙂 but it could potentially work. You’re probably better off using bare PubSub and a different CRDT implementation.

No, Registry is local to a node. You need to use {:global, name} as the name.
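A rough sketch of that {:global, ...} variant (module and function names are hypothetical; in practice you’d start the process under a DynamicSupervisor rather than linking it to whichever caller happens to start it first):

defmodule MyApp.FormState do
  use GenServer

  # One globally registered name per form across the whole cluster.
  defp name(form_id), do: {:global, {:form_state, form_id}}

  def start_link(form_id) do
    GenServer.start_link(__MODULE__, form_id, name: name(form_id))
  end

  # Start-or-lookup: if another node already won the race to register
  # the name, use the existing pid instead of failing.
  def ensure_started(form_id) do
    case start_link(form_id) do
      {:ok, pid} -> {:ok, pid}
      {:error, {:already_started, pid}} -> {:ok, pid}
      other -> other
    end
  end

  def get_state(form_id), do: GenServer.call(name(form_id), :get_state)

  def put_field(form_id, field, value) do
    GenServer.call(name(form_id), {:put_field, field, value})
  end

  @impl true
  def init(form_id), do: {:ok, %{form_id: form_id, fields: %{}}}

  @impl true
  def handle_call(:get_state, _from, state), do: {:reply, state, state}

  def handle_call({:put_field, field, value}, _from, state) do
    {:reply, :ok, put_in(state.fields[field], value)}
  end
end

Note the caveat raised below still applies: :global only guarantees uniqueness while the nodes can see each other; during a netsplit you can end up with two such processes until the split heals.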

This is misleading advice. During a netsplit, :global cannot provide this “guarantee” and will only discard one of the processes after the netsplit is healed. But while the nodes are partitioned, it’s entirely possible to end up in the exact scenario that @jamesblack is concerned with.

Perhaps a better way of putting this would be that the conversation in general hasn’t addressed the CAP trade-off. All of the proposed solutions run into some sort of issue during a netsplit.

@jamesblack can you provide information about the underlying storage mechanism? Is the form backed by the database?