How to gather information from multiple processes?

I have a worker process that lives for some time and contains some state. Worker processes can be spawned from multiple processes, so there is no single source that is the “parent” of these workers.

I want to show a dashboard of all the running worker processes, along with some information from their state. I want to show this in a LiveView, preferably all at once on page load or soon after, rather than as a page that slowly populates from a stream of data.

What is the best way for my liveview to gather all this information living in different processes?

My current solution is for the workers to regularly report their state to a GenServer that keeps track of them, and the LiveView calls this tracker process on mount. The tracker also monitors the workers so it can remove their info when they die. Is there a better way to do this? The tracker feels like a single point of failure, and it also duplicates state that is already stored in the workers.
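For reference, a minimal sketch of the kind of tracker described above. The module name `WorkerTracker`, its function names, and the shape of the reported summary are assumptions made for illustration, not the actual implementation.

```elixir
defmodule WorkerTracker do
  # Hypothetical tracker: workers report a summary of their state,
  # and the tracker forgets a worker as soon as it goes down.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, %{}, opts)

  # Called by a worker, from its own process, with its latest summary.
  def report(tracker, summary), do: GenServer.cast(tracker, {:report, self(), summary})

  # Called by the LiveView on mount; returns %{pid => summary}.
  def list(tracker), do: GenServer.call(tracker, :list)

  @impl true
  def init(workers), do: {:ok, workers}

  @impl true
  def handle_cast({:report, pid, summary}, workers) do
    # Monitor each worker only once, on its first report.
    if not Map.has_key?(workers, pid), do: Process.monitor(pid)
    {:noreply, Map.put(workers, pid, summary)}
  end

  @impl true
  def handle_call(:list, _from, workers), do: {:reply, workers, workers}

  @impl true
  def handle_info({:DOWN, _ref, :process, pid, _reason}, workers) do
    # The worker died, so drop its entry.
    {:noreply, Map.delete(workers, pid)}
  end
end
```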

For the “which workers are there” question you can use Registry. I’d however also consider spawning those workers under a shared supervisor.

They are already registered with a registry, and started under a single dynamic supervisor. I can get a list of PIDs, but I want to get the state in the workers, not just the PIDs.

You can put the information you need as the value in the Registry.

Thanks! I’ll give it a try

I have some follow-up questions about the registry:

Since I want to look up all registered processes, I assume I need to use Registry.lookup/2. This requires all the workers to register under the same key in a duplicate registry:

iex> {:ok, _} = Registry.register(RegistryName, "shared_key", state)

Lookup all the entries:

iex> Registry.lookup(RegistryName, "shared_key")
[{pid1, state1}, {pid2, state2}, ...]

So far so good. But how do I update the value when the worker state is changed?

There is Registry.update_value/3, but the docs say it only works on a unique registry and raises an error if used on a duplicate registry.

But if I use a unique registry, how do I lookup all the values without knowing the keys?

Or should I use a duplicate registry and, whenever the state changes, unregister and immediately register again?

If you have a list of PIDs, you can send them a message to get whatever you need from the state, right?

e.g.

GenServer.call(pid, :get_the_name_from_the_state)

or if for some reason you do not want to add the handle_call/3 for that to your GenServers you can use

:sys.get_state(pid)
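A minimal sketch of the `handle_call/3` variant, assuming a hypothetical `SummaryWorker` with a trimmed-down state struct and a made-up `:summary` message. It replies with only the fields the dashboard needs, so the rest of the state is never copied to the caller. Note that the Erlang docs describe `:sys.get_state/1` as a debugging aid, and it always copies the whole state.

```elixir
defmodule SummaryWorker do
  use GenServer

  # Trimmed-down stand-in for the real state struct; `scratch` represents
  # the fields the dashboard does not care about.
  defstruct id: nil, status: :active, start_time: nil, participants: [], scratch: nil

  def start_link(fields), do: GenServer.start_link(__MODULE__, fields)

  # Client API: ask one worker for its dashboard summary.
  def summary(pid), do: GenServer.call(pid, :summary)

  @impl true
  def init(fields), do: {:ok, struct!(__MODULE__, fields)}

  @impl true
  def handle_call(:summary, _from, state) do
    # Map.take/2 on a struct returns a plain map with just these keys.
    {:reply, Map.take(state, [:start_time, :status, :participants]), state}
  end
end
```

Usage would look like `SummaryWorker.summary(pid)` for each PID in the list.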

You can use Registry.select/2 to fetch everything registered in the registry. There’s no dedicated API for that, since it can be rather expensive depending on the number of processes registered.
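A minimal sketch of how that could look with a unique registry, where each worker registers under its own key and keeps a small summary as the value. The registry name `WorkerSummaryRegistry`, the `{:worker, id}` key shape, and the summary map are assumptions for illustration.

```elixir
{:ok, _} = Registry.start_link(keys: :unique, name: WorkerSummaryRegistry)

# Inside each worker (for example in init/1): register under a unique key
# with the summary as the value.
{:ok, _} = Registry.register(WorkerSummaryRegistry, {:worker, 1}, %{status: :active})

# Later, from the same worker process: replace the stored value.
# update_value/3 returns {new_value, old_value}.
{_new, _old} =
  Registry.update_value(WorkerSummaryRegistry, {:worker, 1}, fn summary ->
    %{summary | status: :paused}
  end)

# From any process: this match spec binds key, pid and value of every entry
# and returns them as {key, pid, value} tuples.
all =
  Registry.select(WorkerSummaryRegistry, [
    {{:"$1", :"$2", :"$3"}, [], [{{:"$1", :"$2", :"$3"}}]}
  ])
```

Because `Registry.update_value/3` has to be called from the process that owns the key, each worker updates its own entry, and entries disappear automatically when the worker exits.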

What does that mean? Can’t you just start workers under a single supervisor?

Yes, they are all started under one dynamic supervisor

So all of these workers actually have a single parent :)
What information are you gathering? You can use Registry as @LostKobrakai pointed out, or you can just use DynamicSupervisor.which_children/1 and then call every process to query the information you need. The right decision depends on the information you’re gathering.

The workers are genservers with a struct as the state. I want to fetch a subset of the struct fields from every worker.

  defmodule State do
    defstruct id: nil,
              map_id: nil,
              participants: [],
              status: :active,
              start_time: nil
              # ...more fields
  end

On the liveview I want to show the start time, status and participants of every worker process.

Participants is a list of some structs with >32 fields each

How often do you query this info? How often does this subset of state change? Do you want to see changes to this state in live view automatically or only on user-initiated update?

This will be queried when the LiveView mounts, so whenever a user visits the page (hopefully many users in the future, but currently just me). After that, it should receive updates, preferably as live as possible, but that’s not strictly necessary. The current implementation does 1 update per second and that’s good enough.

On the worker side, the genserver state is being updated 1 to 10 times a second, possibly higher in the future.

You could implement a get_state call and retrieve only the part of the state you are interested in.

And because you start them all under the same supervisor, you can get a list of all child processes and derive the state from them.
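A minimal sketch of that approach, assuming a hypothetical `PollWorker` that answers a made-up `:summary` call. The helper module `WorkerPoll`, the timeout values, and the concurrency limit are illustrative choices as well. The calls run concurrently, and workers that die or fail to answer in time are skipped.

```elixir
defmodule PollWorker do
  # Stand-in worker that keeps a plain map as its state.
  use GenServer

  def start_link(fields), do: GenServer.start_link(__MODULE__, fields)

  @impl true
  def init(fields), do: {:ok, Map.new(fields)}

  @impl true
  def handle_call(:summary, _from, state) do
    {:reply, Map.take(state, [:status, :start_time]), state}
  end
end

defmodule WorkerPoll do
  # Ask every child of the DynamicSupervisor for its summary.
  def summaries(supervisor, timeout \\ 1_000) do
    pids =
      for {_id, pid, _type, _mods} <- DynamicSupervisor.which_children(supervisor),
          is_pid(pid),
          do: pid

    pids
    |> Task.async_stream(&safe_call(&1, timeout),
      max_concurrency: 50,
      timeout: timeout + 500,
      on_timeout: :kill_task
    )
    |> Enum.flat_map(fn
      {:ok, {:ok, summary}} -> [summary]
      # Worker exited or timed out: leave it out of the result.
      _other -> []
    end)
  end

  defp safe_call(pid, timeout) do
    {:ok, GenServer.call(pid, :summary, timeout)}
  catch
    :exit, _reason -> :error
  end
end
```

With a supervisor started via `DynamicSupervisor.start_link(strategy: :one_for_one)`, the LiveView could call `WorkerPoll.summaries(sup)` in `mount`.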

That’s okay, but it puts extra load on the workers.

That’s a great use-case for phoenix pubsub. LiveViews subscribe to update events, and workers just publish the messages

It might make sense to introduce throttling in the workers, i.e. publish at most one update event per second, and only if something changed.
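A minimal sketch of such throttling, assuming a hypothetical `ThrottledWorker` whose state is reduced to a single `status` field. To keep the example free of dependencies, publishing is an injected `publish` function. In a Phoenix app it would presumably wrap `Phoenix.PubSub.broadcast/3`, with the LiveView subscribing via `Phoenix.PubSub.subscribe/2`.

```elixir
defmodule ThrottledWorker do
  use GenServer

  # opts: `publish` is a 1-arity function (assumed to wrap the real
  # pub/sub broadcast), `interval` is the flush period in milliseconds.
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  def set_status(pid, status), do: GenServer.cast(pid, {:set_status, status})

  @impl true
  def init(opts) do
    interval = Keyword.get(opts, :interval, 1_000)
    Process.send_after(self(), :flush, interval)

    {:ok,
     %{
       status: :active,
       dirty?: false,
       publish: Keyword.fetch!(opts, :publish),
       interval: interval
     }}
  end

  @impl true
  def handle_cast({:set_status, status}, state) do
    # Mark the state dirty only when something actually changed.
    dirty? = state.dirty? or status != state.status
    {:noreply, %{state | status: status, dirty?: dirty?}}
  end

  @impl true
  def handle_info(:flush, state) do
    # At most one publish per interval, and only if there is a change.
    if state.dirty?, do: state.publish.(%{pid: self(), status: state.status})
    Process.send_after(self(), :flush, state.interval)
    {:noreply, %{state | dirty?: false}}
  end
end
```

However often the state changes within an interval, subscribers receive a single message carrying the latest value.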