Why does the registry 'behaviour' contain both 'whereis_name' and 'send'?

This is a question regarding the implementation of custom process registries.

While my actual registry is going to be in an Elixir module, to provide part of a ‘Persistent GenServer’ implementation (starting up processes that are currently ‘off’ but persisted on disk when asked for them), the question itself is more related to via-tuples and the process registry behaviour as defined by Erlang.

(See, for instance, the Erlang documentation of gen_server:start_link.)

So in the paragraph on via-tuples there, the registry ‘behaviour’ (in a loose, implicit sense) is defined as “you need to implement these four functions (register_name/2, unregister_name/1, whereis_name/1 and send/2) to work the same as in global”.

My question, then, is about when these various functions are being called: When you pass a via-tuple to a function that sends a message, does the runtime system (or OTP):

  1. Call your_registry_module:send/2?
  2. Call your_registry_module:whereis_name/1 and then use the returned PID to send directly?

And is there a difference between these depending on when we use gen_server:call/cast vs the raw sending of messages? Why do both of these functions exist, since one of them can be implemented with the other?

cast({via, Mod, Name}, Request) ->
    catch Mod:send(Name, cast_msg(Request)),
    ok;

Source

I’d imagine it’s done that way because retrieving the pid and doing a send afterwards is prone to race conditions. I’m not actually sure if this is any safer against them, but it probably cannot hurt.

Only the gen_* modules actually use via-tuples. send only knows about pids, ports and the local name registry.
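To illustrate that point: if you want to send a raw message to a via-registered process, you have to go through the registry module yourself. A minimal sketch, assuming a hypothetical helper module via_send (the module and function names are made up for this example, they are not part of OTP):

```erlang
-module(via_send).
-export([send_via/2]).

%% Hypothetical helper, not part of OTP.
%% A via-tuple is resolved by delegating to the registry module's send/2,
%% which is the same callback the quoted gen_server cast clause uses.
send_via({via, Mod, Name}, Msg) ->
    Mod:send(Name, Msg),
    ok;
%% Anything else (pid, port, locally registered name) goes to erlang:send/2.
send_via(Dest, Msg) ->
    erlang:send(Dest, Msg),
    ok.
```

Since global exports send/2, something like via_send:send_via({via, global, some_name}, hello) should work for a globally registered process. Passing the via-tuple straight to erlang:send/2 would not.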


I wonder… Since a process can die at any time, wouldn’t the client have to deal with this possibility (that a message sent to a process might never arrive) anyway?

I suspect that’s just because it would be inconvenient to do whereis_name + send every time when using global.

source

send(Name, Msg) ->
    case whereis_name(Name) of
	Pid when is_pid(Pid) ->
	    Pid ! Msg,
	    Pid;
	undefined ->
	    exit({badarg, {Name, Msg}})
    end.

send/2 uses whereis_name/1, i.e. option 2 is how option 1 is implemented.

source

where(Name) ->
    case ets:lookup(global_names, Name) of
	[{_Name, Pid, _Method, _Ref}] ->
	    if node(Pid) == node() ->
		    case is_process_alive(Pid) of
			true  -> Pid;
			false -> undefined
		    end;
	       true ->
		    Pid
	    end;
	[] -> undefined
    end.

Thanks for diving into the source!

This confuses me even further: It now is definitely the case that the ‘race condition’ argument holds absolutely no value.

The only reason I can then think of is convenience?

The window for the race condition is incredibly small, as the registry removes the name as soon as the :DOWN message hits. There is also the general understanding that erlang:send will not fail for an unreachable destination: there is no guarantee of delivery, much less of processing.

So the message could arrive in the recipient’s mailbox and the process could still die before a response is returned (in the case of a call).

That is why call slaps a monitor on the recipient first thing:

source

do_call(Process, Label, Request, Timeout) when is_atom(Process) =:= false ->
    Mref = erlang:monitor(process, Process),

    %% OTP-21:
    %% Auto-connect is asynchronous. But we still use 'noconnect' to make sure
    %% we send on the monitored connection, and not trigger a new auto-connect.
    %%
    erlang:send(Process, {Label, {self(), Mref}, Request}, [noconnect]),

    receive
        {Mref, Reply} ->
            erlang:demonitor(Mref, [flush]),
            {ok, Reply};
        {'DOWN', Mref, _, _, noconnection} ->
            Node = get_node(Process),
            exit({nodedown, Node});
        {'DOWN', Mref, _, _, Reason} ->
            exit(Reason)
    after Timeout ->
            erlang:demonitor(Mref, [flush]),
            exit(timeout)
    end.

The via is sorted out in this wrapper function - which uses whereis_name/1 to get the actual pid.

source

do_for_proc(Process, Fun)
  when ((tuple_size(Process) == 2 andalso element(1, Process) == global)
	orelse
	  (tuple_size(Process) == 3 andalso element(1, Process) == via)) ->
    case where(Process) of
	Pid when is_pid(Pid) ->
	    Node = node(Pid),
	    try Fun(Pid)
	    catch
		exit:{nodedown, Node} ->
		    %% A nodedown not yet detected by global,
		    %% pretend that it was.
		    exit(noproc)
	    end;
	undefined ->
	    exit(noproc)
    end;

So gen_server’s

  • cast uses the registry’s send/2, which in turn uses whereis_name/1
  • call only uses whereis_name/1, so that it can monitor the recipient first and then send to it directly.

So

  • cast → send/2
  • call → whereis_name/1
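
For a custom registry this means all four functions are worth implementing properly. A minimal sketch, assuming a hypothetical module demo_registry that simply delegates to the local process registry (so names must be atoms); a real registry would keep its own table, and the return values here only mimic what global returns:

```erlang
-module(demo_registry).
-export([register_name/2, unregister_name/1, whereis_name/1, send/2]).

%% Hypothetical example registry, backed by the local name registry.

%% Like global:register_name/2: yes on success, no if registration fails.
register_name(Name, Pid) when is_atom(Name), is_pid(Pid) ->
    try register(Name, Pid) of
        true -> yes
    catch
        error:badarg -> no
    end.

%% Unregistering an unknown name is not treated as an error here.
unregister_name(Name) ->
    catch unregister(Name),
    ok.

%% Used by gen_server:call via the do_for_proc wrapper quoted above.
whereis_name(Name) ->
    whereis(Name).

%% Used by gen_server:cast; same shape as the global:send/2 quoted above.
send(Name, Msg) ->
    case whereis_name(Name) of
        Pid when is_pid(Pid) ->
            Pid ! Msg,
            Pid;
        undefined ->
            exit({badarg, {Name, Msg}})
    end.
```

With this in place, a server started under {via, demo_registry, some_name} should be reachable through both paths: cast ends up in demo_registry:send/2, while call only asks demo_registry:whereis_name/1 and does the monitoring and sending itself.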