I have a controller and a worker module. The controller routes requests to the appropriate worker pid and handles worker crashes so the client handler doesn’t have to. The controller presents a single consistent interface for the client, and allows the client to continue processing in the event of a worker crashing, as a new worker is fired up and used.
When the worker crashes (for whatever reason), the controller module catches the exit, e.g. thanks to the temporary monitor `GenServer` sets up under the covers while waiting for a response (thanks, Sasa), and the `Controller` continues on.
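To make that concrete, here is a minimal sketch of the mechanism as I understand it: `GenServer.call/3` monitors the callee while it waits, so if the callee stops before replying, the call exits in the caller and can be caught with `catch :exit`. `CrashyWorker` and `CallerDemo` are hypothetical names used only for illustration, not my real modules.

```elixir
defmodule CrashyWorker do
  use GenServer

  # Started without a link so the crash does not take the caller down with it.
  def start, do: GenServer.start(__MODULE__, :ok)

  @impl true
  def init(:ok), do: {:ok, nil}

  # Stops without replying, which simulates a worker crashing mid-call.
  @impl true
  def handle_call(:boom, _from, state), do: {:stop, :boom, state}
end

defmodule CallerDemo do
  # Turns the exit raised by GenServer.call/3 into a tagged tuple.
  def safe_call(pid) do
    try do
      GenServer.call(pid, :boom)
    catch
      :exit, reason -> {:retry, reason}
    end
  end
end
```

Calling `CallerDemo.safe_call(pid)` on a started `CrashyWorker` returns a `{:retry, reason}` tuple instead of crashing the caller.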
However, there is an obvious race condition: if the `Worker.Supervisor` hasn’t created a new child by the next time `get_worker_pid` is called, the `Controller` process crashes, e.g. with a "no pid found" error. Under the covers, `gproc` handles process registration for the dynamically started worker processes.
Is there an idiomatic way to handle this cleanly?
[I’ve seen code that performs the pid request in two places, once in the client API call and again in a serialized server call. I’ve also seen code that uses timeouts and returns an initial `:noreply`, then answers later with `GenServer.reply`. I’m also wondering if any of this pid retrieval should be moved to the `Worker.Supervisor` somehow… Anyway, I would appreciate the standard go-to pattern here…]
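For reference, this is roughly how I picture the deferred-reply variant. It is only a sketch under assumptions: it uses the standard-library `Registry` in place of `gproc` to stay self-contained, and `DeferredController`, the `:lookup` message, the retry interval, and the attempt count are all made up for illustration.

```elixir
defmodule DeferredController do
  use GenServer

  # Hypothetical tuning values; 5 attempts of 10 ms stay well under the
  # default 5000 ms GenServer.call/3 timeout.
  @retry_ms 10
  @max_attempts 5

  def start_link(registry), do: GenServer.start_link(__MODULE__, registry)

  def worker_pid(server, id), do: GenServer.call(server, {:worker_pid, id})

  @impl true
  def init(registry), do: {:ok, registry}

  @impl true
  def handle_call({:worker_pid, id}, from, registry) do
    # Defer the answer: the lookup runs as a message to self, so the
    # controller can serve other callers between attempts.
    send(self(), {:lookup, id, from, @max_attempts})
    {:noreply, registry}
  end

  @impl true
  def handle_info({:lookup, _id, from, 0}, registry) do
    # Out of attempts: answer the waiting caller with an error tuple.
    GenServer.reply(from, {:error, :not_found})
    {:noreply, registry}
  end

  def handle_info({:lookup, id, from, attempts}, registry) do
    case Registry.lookup(registry, id) do
      [{pid, _value}] ->
        GenServer.reply(from, {:ok, pid})

      [] ->
        # Worker not registered yet: try again shortly.
        Process.send_after(self(), {:lookup, id, from, attempts - 1}, @retry_ms)
    end

    {:noreply, registry}
  end
end
```

The caller still sees an ordinary synchronous call; it gets `{:ok, pid}` once the worker is registered, or `{:error, :not_found}` when the attempts run out.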
Here’s the relevant snippet from the controller module:

    def handle_call({:start_worker, id}, _from, state) do
      {:ok, _wpid} = Worker.Supervisor.start_child(id)
      {:reply, :ok, state}
    end

    def handle_call({:action, id, data}, _from, state) do
      response =
        try do
          get_worker_pid(id) |> Worker.action(data)
        catch
          :exit, reason ->
            Logger.info("Caught exit in controller, reason is #{inspect(reason)}")
            {:retry, reason}
        end

      {:reply, response, state}
    end

    defp get_worker_pid(id) do
      case Worker.whereis(id) do
        # A raise is an :error, not an :exit, so the catch above does not handle it.
        :undefined -> raise "Couldn't find worker pid"
        pid -> pid
      end
    end
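One variation I have been considering, sketched here under assumptions rather than as a known-good pattern: have the lookup return a tagged tuple instead of raising, so a missing pid goes down the same `{:retry, reason}` path as a caught exit. To keep the sketch self-contained, the lookup and the worker call are passed in as functions; `LookupSketch`, `fetch_worker_pid/2`, and `:no_worker` are hypothetical names.

```elixir
defmodule LookupSketch do
  # `whereis` stands in for Worker.whereis/1 and is assumed to return
  # either a pid or :undefined.
  def fetch_worker_pid(whereis, id) do
    case whereis.(id) do
      :undefined -> {:error, :no_worker}
      pid when is_pid(pid) -> {:ok, pid}
    end
  end

  # `fun` stands in for the call into the worker, e.g. Worker.action/2.
  def action(whereis, id, fun) do
    case fetch_worker_pid(whereis, id) do
      {:ok, pid} ->
        try do
          fun.(pid)
        catch
          # The worker died while the call was in flight.
          :exit, reason -> {:retry, reason}
        end

      {:error, reason} ->
        # The worker has not been (re)registered yet.
        {:retry, reason}
    end
  end
end
```

With this shape the controller never raises on a missing worker; the client decides whether and when to retry.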
Thanks B