Hello,
I have an Elixir app where one process depends on another process being alive and sending messages back. This second process is mainly a database connection and may occasionally fail. I created the example app below, which mimics the structure. ModuleA is my worker process, and ModuleB is the database connection:
defmodule Test.ModuleA do
  use GenServer

  def start_link(opts \\ []) do
    name = Keyword.get(opts, :name, __MODULE__)
    GenServer.start_link(__MODULE__, nil, name: name)
  end

  def init(_) do
    IO.puts("#{__MODULE__} starting")
    {:ok, nil}
  end
end

defmodule Test.ModuleB do
  use GenServer

  def start_link(opts \\ []) do
    name = Keyword.get(opts, :name, __MODULE__)
    GenServer.start_link(__MODULE__, nil, name: name)
  end

  def init(_) do
    IO.puts("#{__MODULE__} starting")
    # Simulate a connection failure 5 seconds after startup
    Process.send_after(self(), :crash, 5000)
    {:ok, nil}
  end

  def handle_info(:crash, _state) do
    raise "boom"
  end
end
Using the application’s main supervisor like this, everything works as expected. Both processes are restarted when the DB connection (ModuleB) crashes:
def start(_type, _args) do
  children = [
    # Starts a worker by calling: Test.Worker.start_link(arg)
    # {Test.Worker, arg}
    Test.ModuleB,
    Test.ModuleA
  ]

  # See https://hexdocs.pm/elixir/Supervisor.html
  # for other strategies and supported options
  opts = [strategy: :rest_for_one, name: Test.Supervisor]
  Supervisor.start_link(children, opts)
end
But because ModuleB is not really part of my application but comes from an external package we use, I can’t really supervise it in my own supervisor. Therefore my idea was to link ModuleA to ModuleB like so (I also changed my supervisor’s strategy back to :one_for_one):
defmodule Test.ModuleA do
  use GenServer

  def start_link(opts \\ []) do
    name = Keyword.get(opts, :name, __MODULE__)
    GenServer.start_link(__MODULE__, nil, name: name)
  end

  def init(_) do
    IO.puts("#{__MODULE__} starting")

    # Look up ModuleB by its registered name and link this process to it
    Test.ModuleB
    |> Process.whereis()
    |> Process.link()

    {:ok, nil}
  end
end
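For completeness, the supervisor change mentioned above looks roughly like the following. This is only a sketch: the module name Test.Application is an assumption (it is what a generated project would typically use), and it assumes both modules are still listed as children in the example app, so that only the strategy differs from the first version.

```elixir
defmodule Test.Application do
  # Assumed module name; not shown in the snippets above.
  use Application

  @impl true
  def start(_type, _args) do
    # Assumption: the child list is unchanged from the first version.
    children = [
      Test.ModuleB,
      Test.ModuleA
    ]

    # Only the strategy changes: a crashed child is restarted on its own,
    # and the supervisor no longer restarts ModuleA when ModuleB dies.
    opts = [strategy: :one_for_one, name: Test.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
```

With this strategy, any restart of ModuleA after a ModuleB crash comes from the link set up in ModuleA's init/1, not from the supervisor's strategy.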
However, this results in a very weird problem where my app completely crashes after two simulated connection problems.
What’s going on? Did I misunderstand process links and/or the supervisor config?