Can you rescue/catch process exits in a Task?

darkmarmot · September 9, 2018, 6:30pm

I am kicking off a large number of GenServer calls to actors, each within an async Task.

If one of the GenServer processes dies, I get an :exit with a :noproc reason.

I don’t want my entire task orchestration to go down from one missing process, though.

I tried wrapping the Task function in try/catch/rescue as well as the GenServer.call part in a try/catch/rescue but the :exit still happens and takes everything down.

I am trying to match the rescue and catch on everything such as

try do
  GenServer.call(via_tuple(x), :some_func)
rescue
  e -> {:error, e}
catch
  e -> {:error, e}
end

Am I not understanding something basic or am I going about things in the wrong way?

Any help would be sincerely appreciated!

Thanks,
Scott S.

darkmarmot · September 9, 2018, 6:52pm

And of course I figured it out right after posting —

you need to catch
:exit, reason -> …

which must be a macro thing but doesn’t that seem against the way pattern matching works
everywhere else in Elixir???
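
For reference, a minimal sketch of that fix as a reusable wrapper. SafeCall and its call/3 function are hypothetical names, not something from the posts above; the only library function used is GenServer.call/3. It assumes a tagged tuple is an acceptable result when the target process is missing.

```elixir
defmodule SafeCall do
  # Hypothetical helper: wraps GenServer.call/3 and converts an exit
  # (for example :noproc or :timeout) into an {:error, reason} tuple.
  def call(server, request, timeout \\ 5_000) do
    try do
      {:ok, GenServer.call(server, request, timeout)}
    catch
      # Two-argument form: kind first, then the value.
      :exit, reason -> {:error, reason}
    end
  end
end

# No process is registered under this name, so the call exits with :noproc.
result = SafeCall.call(:no_such_server, :ping)
IO.inspect(result)
```

The exit is raised inside the calling process by GenServer.call/3 itself, which is why a plain catch clause is enough here.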

peerreynders · September 9, 2018, 9:23pm

It’s a bit more complicated than that:

defmodule Demo do

  def run(arg) do
    try do
      result = do_it(arg)
      IO.puts("Value from local return #{inspect result}")

    rescue
      e in ArgumentError ->
        IO.puts("An Argument error occured #{inspect e}")
        # Look at Kernel.defexception/1 and the Exception behaviour

    catch
      :exit, {:error, msg} ->
        IO.puts("Exit from a called function: #{msg}")
        # BUT a callee inside the same process can decide to
        # catch the EXIT and clean up or recover
        # Process.exit/2 CANNOT be caught - only trapped

      x ->
        IO.puts("Value from non local return #{inspect x}")
        # BUT if we don't catch this value, the process WILL terminate
        # because a non local return is intended to be caught SOMEWHERE
        # so if the thrown value isn't caught then that is a problem

    end
  end

  defp do_it(arg) do

    case arg do
      :non_local_return ->
        throw :catch_me_if_you_care # non local return,
                                    # i.e. caught by the first catch
                                    # NOT considered an error
      :exit_now ->
        exit({:error, "I'm in so much trouble"}) # Kernel.exit/1 is different
                                                 # from Process.exit/2
                                                 # exit/1 is used to indicate that
                                                 # the process logic has encountered
                                                 # an unexpected problem
      :raise_now ->
        raise(ArgumentError, "Highway to Hell")
        # Acts more like a conventional exception where information
        # is carried by a specific error data structure

      _ ->
        arg # local return - just return argument
    end

  end

end

Demo.run(:result)
Demo.run(:non_local_return)
Demo.run(:exit_now)
Demo.run(:raise_now)

$ elixir demo.exs
Value from local return :result
Value from non local return :catch_me_if_you_care
Exit from a called function: I'm in so much trouble
An Argument error occured %ArgumentError{message: "Highway to Hell"}
$
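
The comment about Process.exit/2 being trapped rather than caught can be sketched separately. Exit signals that arrive from another process (a linked process dying, or Process.exit/2) are not raised inside try, so catch never sees them. A process that sets the :trap_exit flag receives them as {:EXIT, pid, reason} messages instead. TrapDemo is a hypothetical module name, and the one-second timeout is an arbitrary choice for the example.

```elixir
defmodule TrapDemo do
  # Hypothetical demo: trap the exit signal of a linked process
  # and turn it into an ordinary return value.
  def run do
    previous = Process.flag(:trap_exit, true)

    # The linked process exits abnormally; without :trap_exit the
    # signal would terminate the caller as well.
    pid = spawn_link(fn -> exit(:boom) end)

    result =
      receive do
        {:EXIT, ^pid, reason} -> {:trapped, reason}
      after
        1_000 -> :no_exit_seen
      end

    # Restore the previous flag so the caller behaves as before.
    Process.flag(:trap_exit, previous)
    result
  end
end

IO.inspect(TrapDemo.run())
```

Note that a :kill signal sent with Process.exit/2 cannot be trapped either; it always terminates the target.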

I don’t want my entire task orchestration to go down from one missing process, though.

Task.async links the task to the calling process (under the hood it uses :proc_lib.spawn_link), so when the task crashes the EXIT signal comes back to the spawning process. You could conceivably use Task.start instead, which does not link, and roll your own async/await on top of a monitor.

defmodule Demo do

  def run(delay) do
    t = async(my_fun(delay))

    try do
      result = await(t, 5000)
      IO.puts("Got: #{inspect result}")

    catch
      :exit, reason ->
        IO.puts("EXIT: #{inspect reason}")
    end

  end

  def my_fun(delay) when is_integer(delay) do

    fn ->
      {timeout, crash} =
        cond do
          delay >= 0 ->
            {delay, false}
          true ->
            {-delay, true}
        end

      Process.sleep(timeout)

      cond do
        crash ->
          exit(:crash) # crash the task
        true ->
          timeout      # return result
      end
    end

  end

  #
  # ---
  #

  def async(fun) do
    owner = self()
    {:ok, pid} = Task.start(reply(fun, 5000))
    ref = Process.monitor(pid)
    send(pid, {owner, ref})
    %Task{pid: pid, ref: ref, owner: owner}
  end

  def await(%Task{ref: ref, owner: owner} = task, timeout) when owner == self() do
    receive do
      {^ref, reply} ->
        Process.demonitor(ref, [:flush])
        reply

      {:DOWN, ^ref, _, _proc, reason} ->
        exit({reason, {__MODULE__, :await, [task, timeout]}})

    after
      timeout ->
        Process.demonitor(ref, [:flush])
        exit({:timeout, {__MODULE__, :await, [task, timeout]}})
    end
  end

  defp reply(fun, timeout) do
    fn ->
      receive do
        {caller, ref} ->
          send(caller, {ref, fun.()})
      after
        timeout ->
          exit(:timeout)
      end
    end
  end

end

Demo.run(500)
Demo.run(-500)
IO.puts("Demo complete")

$ elixir demo.exs
Got: 500

20:24:37.529 [error] Task #PID<0.95.0> started from #PID<0.89.0> terminating
** (stop) :crash
    demo.exs:32: anonymous fn/1 in Demo.my_fun/1
    demo.exs:72: anonymous fn/2 in Demo.reply/2
    (elixir) lib/task/supervised.ex:89: Task.Supervised.do_apply/2
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Function: #Function<1.36972197/0 in Demo.reply/2>
    Args: []
EXIT: {:crash, {Demo, :await, [%Task{owner: #PID<0.89.0>, pid: #PID<0.95.0>, ref: #Reference<0.3218396471.1404567555.99609>}, 5000]}}
Demo complete
$
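
As an alternative to the hand-rolled async/await above, the standard library has a similar building block: Task.Supervisor.async_nolink/2 starts a task that is monitored but not linked, and Task.yield/2 reports a crash as {:exit, reason} instead of exiting the caller. A rough sketch, assuming an ad hoc supervisor started in the script (in an application it would normally sit in the supervision tree); the delays and the :crash reason just mirror the demo above.

```elixir
# Start a task supervisor for this script only (an assumption for the
# example; normally it would be a child in your supervision tree).
{:ok, sup} = Task.Supervisor.start_link()

results =
  [100, -100]
  |> Enum.map(fn delay ->
    # async_nolink monitors the task but does not link it to the caller.
    Task.Supervisor.async_nolink(sup, fn ->
      Process.sleep(abs(delay))
      if delay < 0, do: exit(:crash), else: delay
    end)
  end)
  |> Enum.map(fn task ->
    # yield returns {:ok, value}, {:exit, reason}, or nil on timeout;
    # on timeout the task is shut down so it does not linger.
    Task.yield(task, 5_000) || Task.shutdown(task, :brutal_kill)
  end)

IO.inspect(results)
```

The crashing task still logs an error report, as in the output above, but the caller only sees a tuple it can pattern match on.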

darkmarmot · September 10, 2018, 12:30am

That’s an awesome breakdown, thanks!

I still find it exceptionally strange that the catch :exit has a comma there (as opposed to using a tuple).

peerreynders · September 10, 2018, 1:22am

as opposed to using a tuple

That’s because historically Erlang (and therefore the BEAM) doesn’t use tuples in catch:

error:Error (i.e. :error, reason) is considered an exception
exit:Exit (i.e. :exit, reason) is an internal exit
throw:Value, or just Value (i.e. :throw, value, or just value), is a thrown value from a non-local return.
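
The three classes can be seen side by side with a two-argument catch that binds the kind as well as the value. This is only an illustrative sketch; Kinds and classify/1 are hypothetical names.

```elixir
defmodule Kinds do
  # Hypothetical helper: run a function and report which class of
  # non-local exit it produced, if any.
  def classify(fun) do
    try do
      {:ok, fun.()}
    catch
      # kind is one of :throw, :exit or :error
      kind, value -> {kind, value}
    end
  end
end

IO.inspect(Kinds.classify(fn -> 42 end))                    # {:ok, 42}
IO.inspect(Kinds.classify(fn -> throw(:ball) end))          # {:throw, :ball}
IO.inspect(Kinds.classify(fn -> exit(:bye) end))            # {:exit, :bye}
IO.inspect(Kinds.classify(fn -> :erlang.error(:boom) end))  # {:error, :boom}
```

A one-argument clause such as `x -> ...` only matches the :throw class, which is why the catch in the original question never saw the exit.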