How to make processes that persistently map to stored data?

I want to create an application that contains a ‘society of agents’. Each of these processes has its own data and manages it.

However, when the application is shut down, I want to persist the states of the processes, and ensure that they will continue working (e.g. a similar process-hierarchy is started with the same data) when the application is started again.

Some of the processes contain references to other processes. Right now these are PIDs, which are obviously not useful to persist because they change on every restart. Registering processes under some persisted, entity-specific ID is possible, but since atoms are not garbage-collected, I am not sure that is the proper solution.

Is there a better way to do this?


Assuming you are using GenServers, you can register your processes under any term, not just atoms. Here is the relevant part of the GenServer docs:

Name Registration

Both start_link/3 and start/3 support the GenServer to register a name on start via the :name option. Registered names are also automatically cleaned up on termination. The supported values are:

an atom - the GenServer is registered locally with the given name using Process.register/2.

{:global, term} - the GenServer is registered globally with the given term using the functions in the :global module.

{:via, module, term} - the GenServer is registered with the given mechanism and name. The :via option expects a module that exports register_name/2, unregister_name/1, whereis_name/1 and send/2. One such example is the :global module which uses these functions for keeping the list of names of processes and their associated pid’s that are available globally for a network of Erlang nodes.

So you could do something like:

:global example
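defmodule ElixirForum.ExampleAgent do
  use GenServer

  # Register under a global name built from the persistent agent_id (any term works).
  def start_link(agent_id) do
    GenServer.start_link(__MODULE__, nil, name: {:global, {:example_agent, agent_id}})
  end

  # Returns the pid, or :undefined if no agent is registered under agent_id.
  def whereis(agent_id) do
    :global.whereis_name({:example_agent, agent_id})
  end

  def do_something(agent_pid, data) do
    GenServer.call(agent_pid, {:do_something, data})
  end

  # Placeholder callbacks so the module compiles and answers the call above.
  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:do_something, _data}, _from, state), do: {:reply, :ok, state}
end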

And use it like:

some_agent_id
|> ElixirForum.ExampleAgent.whereis()
|> ElixirForum.ExampleAgent.do_something(some_data)
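One thing to watch for with the pipeline above: :global.whereis_name/1 returns :undefined when nothing is registered under the name, and passing that on to GenServer.call/2 will fail. Here is a minimal sketch of two ways to deal with it. The module name SafeAgentDemo, the {:error, :not_running} return value and the {:done, data} reply are made up for illustration and are not part of the example above.

```elixir
defmodule SafeAgentDemo do
  use GenServer

  # Hypothetical helper: the term an agent is registered under in :global.
  defp key(agent_id), do: {:safe_agent_demo, agent_id}

  def start_link(agent_id) do
    GenServer.start_link(__MODULE__, nil, name: {:global, key(agent_id)})
  end

  # Option 1: look the pid up first and handle the "not running" case explicitly.
  def do_something(agent_id, data) do
    case :global.whereis_name(key(agent_id)) do
      :undefined -> {:error, :not_running}
      pid when is_pid(pid) -> GenServer.call(pid, {:do_something, data})
    end
  end

  # Option 2: hand the name tuple straight to GenServer.call/2.
  # The caller exits with a :noproc reason if the agent is not running.
  def do_something!(agent_id, data) do
    GenServer.call({:global, key(agent_id)}, {:do_something, data})
  end

  @impl true
  def init(state), do: {:ok, state}

  # Placeholder reply so the calls above have something to return.
  @impl true
  def handle_call({:do_something, data}, _from, state) do
    {:reply, {:done, data}, state}
  end
end
```

Either way, callers only ever deal with the persistent agent_id, so that ID (rather than a PID) is what other agents can keep in their stored state.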

If you don't want to register the process globally, you can use :via tuples. They work the same way; you simply provide a module that exports the appropriate functions for handling process registration (listed in the docs above), such as :gproc.

:gproc actually requires keys to be triplets of the form {type, scope, key}. In the example I use :n for type, which indicates a unique registration (only one process with the given key), and :l for scope, which indicates a local registration.

via_tuple example
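defmodule ElixirForum.ExampleAgent do
  use GenServer

  # Register through :gproc as a unique (:n), local (:l) name.
  # The :gproc application must be running before agents are started.
  def start_link(agent_id) do
    GenServer.start_link(__MODULE__, nil, name: {:via, :gproc, {:n, :l, {:example_agent, agent_id}}})
  end

  # Returns the pid, or :undefined if no agent is registered under agent_id.
  def whereis(agent_id) do
    :gproc.whereis_name({:n, :l, {:example_agent, agent_id}})
  end

  def do_something(agent_pid, data) do
    GenServer.call(agent_pid, {:do_something, data})
  end

  # Placeholder callbacks so the module compiles and answers the call above.
  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:do_something, _data}, _from, state), do: {:reply, :ok, state}
end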

Usage remains the same:

some_agent_id
|> ElixirForum.ExampleAgent.whereis()
|> ElixirForum.ExampleAgent.do_something(some_data)
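If you would rather not pull in a dependency, Elixir's own Registry module (in the standard library since 1.4) can also be used as the :via module. The following is only a sketch of the same idea. The names RegistryAgentDemo and RegistryAgentDemo.Registry, and the {:done, data} reply, are invented for this illustration.

```elixir
defmodule RegistryAgentDemo do
  use GenServer

  # Hypothetical registry name; the registry must be started before any agent.
  @registry RegistryAgentDemo.Registry

  def registry, do: @registry

  # The :via tuple that GenServer uses for both registration and lookup.
  defp via(agent_id), do: {:via, Registry, {@registry, {:example_agent, agent_id}}}

  def start_link(agent_id) do
    GenServer.start_link(__MODULE__, nil, name: via(agent_id))
  end

  # Returns the pid, or nil when no agent is registered under agent_id.
  def whereis(agent_id) do
    case Registry.lookup(@registry, {:example_agent, agent_id}) do
      [{pid, _value}] -> pid
      [] -> nil
    end
  end

  # Calls the agent by its persistent id; no pid lookup needed by the caller.
  def do_something(agent_id, data) do
    GenServer.call(via(agent_id), {:do_something, data})
  end

  @impl true
  def init(state), do: {:ok, state}

  # Placeholder reply so the call above has something to return.
  @impl true
  def handle_call({:do_something, data}, _from, state) do
    {:reply, {:done, data}, state}
  end
end
```

Start the registry once, typically under your supervisor, with Registry.start_link(keys: :unique, name: RegistryAgentDemo.registry()). After that, agent IDs can be any term, for example strings loaded from your persisted data.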

Also, you might want to check out pg2 (or its successor pg, available since OTP 23; pg2 itself was removed in OTP 24) for handling process groups, if that makes sense in your application :wink:
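A rough sketch of what process groups look like, assuming OTP 23 or later and therefore using the :pg module. The module name PgDemo and the group name :example_agents are made up, and the two spawned processes are stand-ins for real agents.

```elixir
defmodule PgDemo do
  # Hypothetical group name used only for this illustration.
  @group :example_agents

  def run do
    # Start the default :pg scope; tolerate it already running.
    case :pg.start_link() do
      {:ok, _pid} -> :ok
      {:error, {:already_started, _pid}} -> :ok
    end

    # Stand-in processes that idle until told to stop.
    idle = fn ->
      receive do
        :stop -> :ok
      end
    end

    pids = [spawn(idle), spawn(idle)]

    # Unlike a registered name, a group can contain many processes.
    Enum.each(pids, fn pid -> :ok = :pg.join(@group, pid) end)

    # Look up all current members and send each of them a message.
    members = :pg.get_members(@group)
    Enum.each(members, fn pid -> send(pid, :stop) end)
    members
  end
end
```

Groups complement unique names rather than replace them: a name answers "which process is agent X?", while a group answers "which processes currently play role Y?".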


It is great to find out that there is an easy way to do this. :heart_eyes: Every time I think something might be hard, Elixir/Erlang has a simple solution.

Thank you for your amazingly long and clear answer!


You’re welcome! Glad you found it useful :blush:
