How to bootstrap one process with state from another?

toranb · October 23, 2018, 1:58pm

I’m working through a very simple tutorial of sorts and wanted to ask for a little guidance on my next step (related to process registration).

Right now I have a working app with 2 GenServers, 1 Supervisor and the Registry. Here is my Application module

defmodule EX.Application do
  use Application

  def start(_type, _args) do
    EX.Supervisor.start_link(name: EX.Supervisor)
  end
end

Here is the Supervisor

defmodule EX.Supervisor do
  use Supervisor

  def start_link(opts) do
    Supervisor.start_link(__MODULE__, :ok, opts)
  end

  def init(:ok) do
    children = [
      {Registry, keys: :unique, name: EX.Registry},
      EX.Cache,
      EX.Worker
    ]
    Supervisor.init(children, strategy: :one_for_one)
  end
end

The exercise for learning's sake today is this: as my worker init runs, I’d like to fetch the cache state. The tricky part that landed me here asking a question is “how do I ask for the cache (by name?) and allow the Registry lookup to do its thing?” Note: I’m trying not to do an explicit start_link here in my worker because I want to understand better how I would interop with a process by name.

Here is my init (worker)

defmodule EX.Worker do
  use GenServer

  def start_link(name) do
    GenServer.start_link(__MODULE__, :ok, name: via(name))
  end

  defp via(name), do: EX.Registry.worker(name)

  @impl GenServer
  def init(:ok) do
    {:ok, %{}, {:continue, :init}}
  end

  @impl GenServer
  def handle_continue(:init, _state) do
    # cached_state = ?
    # also, how should I reply from this to continue properly ?
    {:noreply, new_state, 2000}
  end
end

Here is my Cache (note: I realize this example is of no actual value architecturally speaking)

defmodule EX.Cache do
  use GenServer

  def start_link(name) do
    GenServer.start_link(__MODULE__, :ok, name: via(name))
  end

  defp via(name), do: EX.Registry.cache(name)

  @impl GenServer
  def init(:ok) do
    {:ok, %{}}
  end
end

Here is my simple Registry for reference

defmodule EX.Registry do
  @type t :: {:via, Registry, tuple}

  def worker(name) do
    via({name, :worker})
  end

  def cache(name) do
    via({name, :cache})
  end

  defp via(data) do
    {:via, Registry, {__MODULE__, data}}
  end
end

I’ve had (some) luck in the past hard coding a pid and doing the Registry lookup by that value, but I’d like to leverage the power of “lookups by name” if that makes sense, allowing the Registry (and runtime) to handle the implementation details of that specific process PID. My next iteration (after I grok more deeply how this should work) will be to supervise the EX.Cache module and allow that Supervisor to restart Cache workers (so the PID will very much change, etc.).

Note: I’m using Elixir 1.7.3 and OTP 21 on macOS High Sierra (with no plans to deploy this right now).

Thanks to everyone who drops by to read this. I’m very much in the soak-it-up/learning phase and I’ll do my best to blog whatever I learn here to pay it forward. So far I’ve got 26 posts up since Sept 24th, with plans to keep rolling as I refine my understanding of the ecosystem, language, etc.

aseigo · October 23, 2018, 2:30pm

If I understand your question correctly, you can add something like this to EX.Registry:

@spec find(name :: String.t(), type :: atom) :: pid | :undefined
def find(name, type) do
  Registry.whereis_name({__MODULE__, {name, type}})
end

You could generalize this a bit by adding a private function to EX.Registry that generates that {__MODULE__, {name, type}} tuple, to keep the find and via functions DRY.
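A minimal sketch of that DRY refactoring, using the hypothetical names Sketch.Names and Sketch.Registry (neither is from the thread) and an Agent as a stand-in for the cache process. It relies on Registry.whereis_name/1, the same function the :via mechanism calls internally; Registry.lookup/2 is the documented alternative.

```elixir
defmodule Sketch.Names do
  # Hypothetical: the name the Registry process is started under.
  @registry Sketch.Registry

  # Name to pass as `name:` when starting a process.
  def via(name, type), do: {:via, Registry, key(name, type)}

  # Resolves a registered name to a pid, or :undefined if nothing is registered.
  def find(name, type), do: Registry.whereis_name(key(name, type))

  # Single place where the registration tuple is built.
  defp key(name, type), do: {@registry, {name, type}}
end

{:ok, _} = Registry.start_link(keys: :unique, name: Sketch.Registry)
{:ok, pid} = Agent.start_link(fn -> %{} end, name: Sketch.Names.via("main", :cache))

found = Sketch.Names.find("main", :cache)
missing = Sketch.Names.find("nope", :cache)
```

Here found is the Agent's pid and missing is :undefined, so callers never need to know the shape of the registration tuple.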

Also, I would be tempted to combine worker and cache into just one function … they really do the same thing, and one could argue that knowing the function is called cache vs worker is the same as knowing the type. This is not a problem later on down the road: if you wanted to change the atom used in registration, you could do something like:

def generate_via_value(name, :worker), do: {:via, Registry, {__MODULE__, {name, :some_other_thing}}}    
def generate_via_value(name, :cache), do: {:via, Registry, {__MODULE__, {name, :another_thing}}}

So you don’t need to worry about future maintainability, BUT until then you gain simplicity with your EX.Registry just becoming:

def generate_via_value(name, type), do:  {:via, Registry, {__MODULE__, {name, type}}}

… and then in use you don’t have to remember that EX.Registry.cache actually returns not a cache but the via name to register with … it is obvious as the call becomes EX.Registry.generate_via_value(name, :cache)

… and THEN you can make a function in EX.Cache like:

def find(name), do: EX.Registry.find(name, :cache)

So you can just find your cache with EX.Cache.find(name)

toranb · October 24, 2018, 2:05am

Thanks for the great reply! The only question that remains is “how do I get the name to call find?”

From the worker I need to query the Registry to find the PID for :cache but I don’t actually know the name that was used when via was called from within the start_link of Cache (unless I hard code one in the Supervisor).

If I drop the name idea completely I can lookup the cache w/ the type alone but I’m curious if you could break down “how I get the name” from worker to ask for the cache.

Here is the updated init/ continue code for my worker

defmodule EX.Worker do
  use GenServer

  def start_link(name) do
    GenServer.start_link(__MODULE__, :ok, name: via(name))
  end

  defp via(name), do: EX.Registry.via(name, :worker)

  @impl GenServer
  def init(:ok) do
    pid = EX.Cache.find(name) #how do I get the cache name here inside Worker?
    state = EX.Cache.all(pid)
    {:ok, state, {:continue, :init}}
  end

  @impl GenServer
  def handle_continue(:init, state) do
    {:noreply, state}
  end
end

Next the find method of my Cache

def find(name), do: EX.Registry.find(name, :cache)

And finally the Cache init/ via code to show how I register w/ name (usually) and as such, don’t know how the worker would get access to it (without hard coding something)

defmodule EX.Cache do
  use GenServer

  def start_link(name) do
    GenServer.start_link(__MODULE__, :ok, name: via(name))
  end

  defp via(name), do: EX.Registry.via(name, :cache)

  def find(name), do: EX.Registry.find(name, :cache)

  @impl GenServer
  def init(:ok) do
    {:ok, %{}}
  end
end

The registry find/via methods for clarity

defmodule EX.Registry do
  @type t :: {:via, Registry, tuple}

  def find(name, type) do
    Registry.whereis_name({__MODULE__, {name, type}})
  end

  def via(name, type) do
    {:via, Registry, {__MODULE__, {name, type}}}
  end
end

aseigo · October 25, 2018, 9:46am

I think what you are describing is service discovery and affinity. That’s a pretty general topic that spans life, love and the universe in my experience.

As with pointers just being memory locations in other languages, PIDs are just the address of a process in Elixir. (More an address than a name, as the PID encodes which node a process exists on in a cluster … that first 0. in local pids means “local node”) … viewed that way, your question could be reframed as “How do I resolve a name to an address?” And the answer to that is entirely contextual to the problem your code is solving, and how it goes about solving it.

If your application has only one cache, then it is simple: your cache is called “worker_cache” or some such, and you can hide that detail in a function in the Cache module itself so that a caller can just do:

cache = EX.Cache.get()

where that get/0 is just something like:

def get(), do: EX.Registry.find("worker_cache", :cache)
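Putting the single-cache case together, here is a rough sketch of a worker that pulls the cache state in handle_continue. All module names (Sketch.Cache, Sketch.Worker, Sketch.Registry) and the all/0 and state/1 functions are assumptions for illustration, not code from the thread. It goes one step further than get/0 by hiding the registered name entirely behind Sketch.Cache.all/0, and it shows that handle_continue simply returns {:noreply, new_state}.

```elixir
defmodule Sketch.Cache do
  use GenServer

  # The one well-known name, kept private to this module.
  @name {:via, Registry, {Sketch.Registry, {"worker_cache", :cache}}}

  def start_link(initial \\ %{}), do: GenServer.start_link(__MODULE__, initial, name: @name)

  # Callers ask for the state without knowing the name or the pid.
  def all, do: GenServer.call(@name, :all)

  @impl GenServer
  def init(initial), do: {:ok, initial}

  @impl GenServer
  def handle_call(:all, _from, state), do: {:reply, state, state}
end

defmodule Sketch.Worker do
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok)
  def state(pid), do: GenServer.call(pid, :state)

  # init returns quickly; the cache fetch happens in handle_continue.
  @impl GenServer
  def init(:ok), do: {:ok, %{}, {:continue, :load_cache}}

  # Runs before any other message is handled; the fetched state replaces the empty map.
  @impl GenServer
  def handle_continue(:load_cache, _state), do: {:noreply, Sketch.Cache.all()}

  @impl GenServer
  def handle_call(:state, _from, state), do: {:reply, state, state}
end

{:ok, _} = Registry.start_link(keys: :unique, name: Sketch.Registry)
{:ok, _} = Sketch.Cache.start_link(%{greeting: "hello"})
{:ok, worker} = Sketch.Worker.start_link()

worker_state = Sketch.Worker.state(worker)
```

Because the via tuple is resolved on every call, the worker keeps working after the cache is restarted under a new pid, as long as the cache registers under the same name again.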

If you have multiple caches, then you need to ask what is the relationship between caches and workers. Does every worker have its own cache? Do workers share caches? Do all workers share all caches, or do some caches service a pool of workers? Are the caches general (“dump this data here, pls”) or topical (“this cache handles user data”)?

Once you have answered those questions, you can start to devise a plan for service discovery. If caches are topical and those topics are “well known” (e.g. “user data”, “file paths”, …), they can be keyed by such topics. The exact keys can be hidden behind functions in EX.Cache, of course, with the function names standing in as the publicly visible names for them. If the caches are specific to jobs workers are plowing through, then the identifier for the job can be used as the name of the cache. If the caches are general, but should only service N workers at a time, then Registry could also do pool management. etc…

If you want to be able to select from an arbitrary set of caches (someone starts them, and a Worker picks one “at random” to use), and therefore simply want to know which caches are currently running, then you have several options … you could use the process groups feature from a library like syn, or you could store them in an ets table so listing / finding pids (and / or names) becomes a matter of looking at the entries of a given ets table. In both those situations, you end up with just one key (e.g. :cache) which maps to the set of all existing caches. On start, the caches register (as you have in your EX.Registry) and deregister when stopped (libraries like syn make this easy). You can even have groups of caches this way: all the user data caches, or the user cache for user accounts sharded by some partitioning scheme, …
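The ets variant can be sketched roughly as follows, assuming a :bag table (so one key can hold many entries), the hypothetical table name :sketch_caches, and Agents standing in for cache processes. Unlike syn or a Registry started with keys: :duplicate, a plain ets table does not remove entries when a process dies, so deregistration is explicit here.

```elixir
# One key (:cache) maps to the set of all running caches.
table = :ets.new(:sketch_caches, [:bag, :public])

# Stand-ins for cache processes.
{:ok, cache_a} = Agent.start_link(fn -> %{} end)
{:ok, cache_b} = Agent.start_link(fn -> %{} end)

# Each cache registers itself under the shared key on start.
:ets.insert(table, {:cache, cache_a})
:ets.insert(table, {:cache, cache_b})

# A worker lists the running caches and picks one at random.
running = for {:cache, pid} <- :ets.lookup(table, :cache), do: pid
chosen = Enum.random(running)

# A cache deregisters when it stops.
:ets.delete_object(table, {:cache, cache_a})
remaining = for {:cache, pid} <- :ets.lookup(table, :cache), do: pid
```

Groups of caches would follow the same shape with a richer key, for example {:cache, :user_data} instead of :cache.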

You still first need to define the mapping from Workers to Cache, which is not really an Elixir-specific topic, and then encode those findings using Registry, syn, ets, … Full service discovery frameworks also exist, of course, but typically have a granularity of “node” only, not “pid”; if one views something like syn as being a light-weight service discovery system (limited to the current cluster, providing only simple key/value based discovery, …) then those exist as well …

peerreynders · October 25, 2018, 6:26pm

Just as an FYI - there is an established Erlang idiom of a worker caching its state in an ETS table owned by its supervisor - Don’t Lose Your ets Tables (2011).
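The linked article is not reproduced here, but the underlying mechanism can be sketched with the ets :heir option: when the table's owner dies, the table is handed to the heir instead of being destroyed. The table name :sketch_state and the :worker_state tag are made up for the example, and the current process plays the role of the longer-lived supervisor or table manager.

```elixir
heir = self()

# A short-lived "worker" that owns a table and names the parent as heir.
worker =
  spawn(fn ->
    table = :ets.new(:sketch_state, [:set, {:heir, heir, :worker_state}])
    :ets.insert(table, {:count, 42})

    receive do
      :crash -> exit(:boom)
    end
  end)

send(worker, :crash)

# On the owner's death the heir receives the table in an ETS-TRANSFER message.
recovered =
  receive do
    {:"ETS-TRANSFER", table, ^worker, :worker_state} -> :ets.lookup(table, :count)
  after
    1_000 -> :timeout
  end
```

A restarted worker could then be handed the table back with :ets.give_away/3, so its state survives the crash.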

toranb · October 26, 2018, 2:14am

@aseigo thank you for taking the time to reply in such detail! I believe I jumped ahead of my current conceptual understanding, so to scale it back a little I’ve decided to start by registering with just a name/atom. I got a simple example working and wrote a blog post about it to document my journey, however amateur it may be at the moment.

If you have time to look this over at some point I’d very much welcome feedback about how I could improve upon what I’ve got (in the most basic sense of process to process communication). Keep in mind this caching is mostly to push the limits of my understanding (GenServer to GenServer). If anything I’m trying to get a feel for how processes communicate with each other - do they just pass messages like I expect they would and that’s idiomatic, normal Elixir?

https://toranbillups.com/blog/archive/2018/10/25/process-registration/

Thanks again for all the help! Just landing a working app that rehydrates after one process crashes was the goal so in some small way it was a success!

aseigo · October 26, 2018, 7:10am

Exactly. Kernel.send/2 and Process.send/3 deliver messages into a process’ inbox, which are then picked up by receive statements that pattern match on the messages. That pattern matching is super useful as it makes it easy to respond to the messages that are relevant …
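A small illustration of raw message passing, with made-up :ping and :pong message shapes: one process sends, the other pattern matches in receive and replies.

```elixir
parent = self()

# The child waits for a :ping and answers the sender with a :pong.
child =
  spawn(fn ->
    receive do
      {:ping, from} -> send(from, {:pong, self()})
    end
  end)

send(child, {:ping, parent})

# Only a :pong from this particular child matches; anything else stays in the inbox.
reply =
  receive do
    {:pong, ^child} -> :got_pong
  after
    1_000 -> :timeout
  end
```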

GenServer’s call/cast/info functions are driven by these messages: the GenServer behaviour does the receiving and sends them on to the module’s handle_* callbacks. It makes the whole message passing a lot easier, both conceptually and to use.
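As a sketch of how those callbacks line up with messages, the hypothetical Sketch.Counter below handles a synchronous call, an asynchronous cast, and a plain message delivered with send/2.

```elixir
defmodule Sketch.Counter do
  use GenServer

  @impl GenServer
  def init(count), do: {:ok, count}

  # GenServer.call/2: the caller blocks until the reply arrives.
  @impl GenServer
  def handle_call(:get, _from, count), do: {:reply, count, count}

  # GenServer.cast/2: fire and forget.
  @impl GenServer
  def handle_cast(:inc, count), do: {:noreply, count + 1}

  # Any other message, e.g. one delivered with send/2.
  @impl GenServer
  def handle_info(:inc, count), do: {:noreply, count + 1}
end

{:ok, counter} = GenServer.start_link(Sketch.Counter, 0)

GenServer.cast(counter, :inc)
send(counter, :inc)

# Messages from one sender arrive in order, so both increments are handled first.
value = GenServer.call(counter, :get)
```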

This isn’t unique to GenServer, of course. All the modules that are driven by messaging have something similar. Task is an example: it coordinates with the original caller by sending messages.
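For instance, a minimal Task round trip looks like this; the result travels back to the caller as a message, which Task.await/1 receives on the caller's behalf.

```elixir
# Runs the function in a separate process linked to the caller.
task = Task.async(fn -> Enum.sum(1..10) end)

# Blocks until the task's reply message arrives (default timeout: 5 seconds).
result = Task.await(task)
```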

There is so much that can be done with this deceptively simple pattern. It allows the BEAM to run your code across multiple threads safely, is part of the shared-nothing memory management approach, and makes working with processes running in different VMs (even on different machines) that are connected as a cluster trivial as the message passing remains identical.

Enjoy your explorations! Liked your blog entry, hope to read more in future.