Why is Registry local only?

I am writing an application that uses pub sub between various processes quite heavily. I first wrote it with Elixir’s local Registry, and it works great. But eventually I needed processes on different nodes to subscribe to this data, so I had to look for an alternative.

I looked at gproc and syn and was not too happy with either. gproc’s API is horrible and barely documented, and I can’t even get it to work in global mode. It would be an extremely tough sell to my coworkers. Syn has all sorts of collision detection and requires all nodes to be connected before it can be initialized, or god knows what happens, and all of that is overkill. All of my registries are on one node, and other nodes just pass a message to them once in a while. Edit: And after reading about it in more depth, I’m not sure process groups are what I’m looking for.

So I started writing my own. When I was about half done, I realized I was writing a worse version of Elixir’s local Registry, and the only change that really mattered was that I could pass name: {:global, :name} into my versions of its functions. Registry makes every possible attempt to prevent you from doing that.

At this point I’m seriously considering copying Elixir’s Registry and just changing some functions to accept a global name tuple. The only thing stopping me is wondering what trouble the authors were trying to prevent when they wrote the module this way. So if someone could tell me why what I’m considering is a bad idea, that’d be great.


Just use gproc. If you don’t like the API, wrap the parts you don’t like in a better abstraction.

Admittedly, my only experience with gproc is using this part:


Ultimately, gproc is the answer. It has already solved the problem (and you will end up running into problems it has already solved). Take the functions you already created and use them as wrappers around the gproc calls.


The simple (and boring) answer is that a distributed registry is really, really hard to get right. The OTP team is already working on improving what’s available in Erlang itself, so there’s no need to duplicate efforts.


I realize that gproc has richer features, but I still don’t see the showstopper for using :global.register_name/2 (with :global.whereis_name/1 or {:global, term}) in the meantime. Obviously I’m missing something.
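For readers unfamiliar with the calls mentioned here, a minimal sketch of :global registration follows. The name :call_stats and the use of an Agent are assumptions made for illustration only; they do not come from the thread.

```elixir
# :call_stats is a hypothetical name used only for this illustration.
name = {:global, :call_stats}

# Start a process registered cluster-wide; a global name must be unique.
{:ok, pid} = Agent.start_link(fn -> 0 end, name: name)

# Any connected node can resolve the name to the same pid...
^pid = :global.whereis_name(:call_stats)

# ...or address the process directly through the {:global, term} tuple.
:ok = Agent.update(name, &(&1 + 1))
count = Agent.get(name, & &1)
```

Note that :global maps each name to exactly one pid, so it covers the unique-name case rather than a :duplicate, pub sub style registry.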


Are you just using Registry like a pubsub system?

If you’re using it for pub sub, then you are probably running a :duplicate registry, in which case there are better ways of handling broadcasting across nodes than using Registry.dispatch.

If you want to build it yourself and still use most of your Registry implementation, have a Broadcaster on each node that is part of a :pg2 group or something similar. This Broadcaster is then in charge of dispatching messages.

When it receives a local message, it broadcasts the message to the other Broadcaster processes in the :pg2 group and then runs Registry.dispatch/3 locally. When it receives a message from another node, it just runs Registry.dispatch/3 locally.
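The Broadcaster described above could be sketched roughly as follows. This is not the poster's code: the module name, the registry name MyApp.PubSub, and the group name :broadcasters are hypothetical, and the sketch uses :pg (OTP 23 and later) in place of :pg2, which has since been removed from OTP.

```elixir
defmodule Broadcaster do
  use GenServer

  # Hypothetical names: MyApp.PubSub is a :duplicate Registry, and
  # :broadcasters is the :pg group that every node's Broadcaster joins.
  @registry MyApp.PubSub
  @group :broadcasters

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  # Public entry point: publish a message on a topic from the local node.
  def broadcast(topic, message), do: GenServer.cast(__MODULE__, {:local, topic, message})

  @impl true
  def init(:ok) do
    :ok = :pg.join(@group, self())
    {:ok, nil}
  end

  @impl true
  def handle_cast({:local, topic, message}, state) do
    # Forward to the Broadcasters on other nodes, then dispatch locally.
    for pid <- :pg.get_members(@group), node(pid) != node() do
      GenServer.cast(pid, {:remote, topic, message})
    end

    dispatch(topic, message)
    {:noreply, state}
  end

  def handle_cast({:remote, topic, message}, state) do
    # Arrived from another node: dispatch locally only, never re-broadcast.
    dispatch(topic, message)
    {:noreply, state}
  end

  defp dispatch(topic, message) do
    Registry.dispatch(@registry, topic, fn entries ->
      for {pid, _value} <- entries, do: send(pid, {topic, message})
    end)
  end
end

# Single-node usage; in an application these would sit in a supervision tree.
{:ok, _} = :pg.start_link()
{:ok, _} = Registry.start_link(keys: :duplicate, name: MyApp.PubSub)
{:ok, _} = Broadcaster.start_link([])
{:ok, _} = Registry.register(MyApp.PubSub, "calls", [])
:ok = Broadcaster.broadcast("calls", :new_call)
```

Because remote messages are tagged :remote and never forwarded again, each message crosses the network once per node and cannot loop between Broadcasters.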

That’s basically what phoenix_pubsub does. If you need more than the pub sub that Registry provided, then I would still recommend a per-node Orchestrator process of some kind.

Once again though, this is for :duplicate registries. If you are using a :unique registry, then @michalmuskala’s comment about registration being hard becomes the plain fact of the matter.


I am using :duplicate, and I am using it for pub sub, but not every process gets the same data when an event fires. When they subscribe they send along an MFA, and when an event occurs, the publishing process takes its potentially large state and runs the MFA on it to produce a smaller summary that goes to the subscriber. The summary is different for each subscriber because of the permissions or timezone of that particular user. For example, one user needs a count of the calls presented to him since midnight in his timezone, while another needs every call at his company since midnight in some other timezone.

That means that I can’t just use a dumb Broadcaster because I would have to send the entire state of each process to every node before it could be summarized, with no guarantee that any process even subscribed to that data.
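A rough sketch of this subscribe-with-an-MFA pattern on a single node, using the local Registry, is shown below. The module CallStats, the registry name Subscriptions, the topic :calls_changed, and the shape of the state are all hypothetical, and the sketch deliberately leaves out the cross-node part that the thread is about.

```elixir
defmodule CallStats do
  # Hypothetical summarizer: counts the calls in `state` that belong to `company`.
  def count_for(state, company) do
    Enum.count(state.calls, &(&1.company == company))
  end
end

{:ok, _} = Registry.start_link(keys: :duplicate, name: Subscriptions)

# Subscriber side: register under a topic and store {module, function, extra_args}
# as the registry value.
{:ok, _} = Registry.register(Subscriptions, :calls_changed, {CallStats, :count_for, ["acme"]})

# Publisher side: on an event, run each subscriber's MFA against the full state
# and send only the resulting summary.
state = %{calls: [%{company: "acme"}, %{company: "other"}, %{company: "acme"}]}

Registry.dispatch(Subscriptions, :calls_changed, fn entries ->
  for {pid, {mod, fun, args}} <- entries do
    send(pid, {:summary, :calls_changed, apply(mod, fun, [state | args])})
  end
end)
```

Only the small summary is sent to each subscriber, which is why the large state has to stay on the node that owns it.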

As for gproc, I’ve messed with it several times and I cannot get it to work. It seems to have what I need but when I try to use any global function I get the following error.

** (ErlangError) erlang error: :local_only

And I just don’t know what that means. If I could get past that I think I could maybe make this work.


This maybe?
[erlang-questions] How do I start gproc in a global way?
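One possible cause of the :local_only error is that gproc's distributed mode has not been switched on. The fragment below is a sketch under the assumption that this is done through gproc's `gproc_dist` application environment key, and that the gen_leader dependency is available; check the gproc README for the version in use before relying on it.

```elixir
# config/config.exs
import Config

# Assumption: gproc enables its global (distributed) mode when `gproc_dist`
# is set to :all or to an explicit list of node names.
config :gproc, gproc_dist: :all
```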


I think maybe you’re using an :l atom where a :g atom should be used instead.

But the way you’re constructing your system seems interesting. I probably would have done it the other way around and offloaded the map/reduce onto the subscribing processes.

Either way, if you can give me an example of the call you’re making that throws the error, I may be able to tell you what’s going on.


A very quick rundown of your options:

  • Registry is local only
  • gproc has a global mode, but the consensus is that it is unreliable, so consider it local only
  • you can use pg (part of OTP) if you need a distributed process group (duplicate keys)
  • you can use global (part of OTP) if you need a distributed process registry (unique keys)
  • you can also use syn - you seem to already be aware of its pros/cons
  • you can also use Phoenix.Tracker, which is part of the phoenix_pubsub project, as a distributed process group (duplicate keys)

EDIT: Updated in 2023 to mention pg instead of pg2 in Erlang/OTP.
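As an illustration of the pg option, here is a minimal sketch of a process group used as a set of duplicate keys. The group name {:topic, "calls"} is hypothetical, and the default pg scope is started by hand here rather than under a supervisor.

```elixir
# Start the default :pg scope (normally done under a supervisor).
{:ok, _} = :pg.start_link()

# Groups behave like duplicate keys: many processes may join the same group.
# {:topic, "calls"} is a hypothetical group name.
group = {:topic, "calls"}
subscriber = spawn(fn -> Process.sleep(:infinity) end)

:ok = :pg.join(group, self())
:ok = :pg.join(group, subscriber)

# Members from every connected node are returned as plain pids;
# pg stores no per-member value alongside them.
members = :pg.get_members(group)
```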


This is probably why I was having a problem with gproc. I guess I’ll fool with it and see if I can’t get it working. If gproc defaults to local only mode that would certainly explain why it wouldn’t work.

For those that have used gproc, what happens when you need to add an additional node? Would it just know about that node the first time a process on the node tried to use it?


gproc - I hope it is somewhat reliable. I don’t need it to be perfect.
pg2 - It doesn’t store metadata with the subscription.
global - not appropriate
syn - I just looked at it again and I don’t know why I wrote it off earlier. I need to dive into the code to be sure.
Phoenix.Tracker - I’ll look at it, I hadn’t heard about it until this thread.

One thing I do not know: since syn and gproc both need all of the nodes before they are started, what happens when you want to add a new node? Is there some way to reinit or specifically add a node to these without losing their state?


Probably not. I think it has to do with the different ways gen_leader is implemented in those libraries; there probably isn’t a replication mechanism in those implementations. Moreover, they all need leaders because they have unique registration capabilities. With :global, if two names conflict you can provide a conflict resolution function, but by default it just kills one of the processes at random.
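A conflict resolution function for :global can be passed as the third argument to :global.register_name/3. The sketch below is only an illustration: the Resolver module, its keep-the-lower-pid rule, and the :call_stats name are hypothetical. The default resolver is :global.random_exit_name/3.

```elixir
defmodule Resolver do
  # Hypothetical resolver: keep the lower pid in term order and notify the other
  # process instead of killing it. The resolver must return the pid to keep.
  def resolve(name, pid1, pid2) do
    {keep, drop} = if pid1 <= pid2, do: {pid1, pid2}, else: {pid2, pid1}
    send(drop, {:global_name_conflict, name})
    keep
  end
end

{:ok, pid} = Agent.start_link(fn -> :state end)

# The resolver only runs if the same name turns out to be registered on two
# sides of the cluster, for example when previously separate nodes connect.
:yes = :global.register_name(:call_stats, pid, &Resolver.resolve/3)
```

A process that receives {:global_name_conflict, name} has lost the name and can decide for itself whether to shut down or re-register under a different one.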

In my opinion, Phoenix.Tracker would be the best fit for your use case. I wouldn’t have thought of it as a registry, but it really is one. Instead of having to sync with a leader, each node has its own presence registry that pushes out deltas of its local changes to the other nodes.

It has a way to replicate the state to new nodes, so adding nodes at runtime doesn’t require a new leader election and such.


Thank you so much! I have a much clearer picture of where I stand now.


There is also lasp_pg.


Sorry for the late reply, I’m not active on these forums.

FYI for anyone interested: contrary to what is stated above, Syn does not require all nodes to be connected before it can be initialized. The only requirement is that a node joining a cluster connects to it before calling syn:init/0. That is not the same as requiring every node to be connected before syn can be initialized. Example code is in the README.

So yes, it is absolutely possible to add nodes to a running cluster and everything will work fine.

you can also use syn - you seem to already be aware of its cons

@josevalim would you mind expanding? I didn’t understand what cons you are referring to and I’d be happy to take yours and any feedback that could make it better.

Best,
r.
