Logging when Task fails due to raised exception

stefpankov · July 5, 2019, 2:23pm

Hey there,

I’m new to Elixir and OTP, and coming from an OO background I’m still wrapping my head around processes, supervisors, tasks, etc.

I’m currently working on a little program that has to go through a bunch of files, read from them, parse the data and do something with it.

I’m trying to speed up this process by using Tasks, and I’ve set up a TaskSupervisor which I then use to dynamically start Tasks that parse the data from the file.

It looks something like this:

def do_the_thing(dir) do
  for file <- File.ls!(dir) do
    file_name = "#{dir}/#{file}"

    File.read!(file_name)
    |> Parser.parse!()
    |> Enum.each(fn parsed_data ->
      Task.Supervisor.start_child(MyApp.TaskSupervisor, MyTask, :run, [
        parsed_data
      ])
    end)
  end
end

Now this works fine, but when an error is raised and MyTask fails, I want to somehow trap the exception and log it to a file with additional metadata so I can see where it failed.

I had a try/rescue block inside the MyTask.run function, but I have a feeling that’s unnecessary: I’m fine with the Task failing, since the ever-vigilant Supervisor handles that for me just fine.

I guess my question is: what’s considered best practice when I want to catch exit reasons and log the ones that interest me? I tried to set up something with Process.monitor but couldn’t really figure out how to match exceptions and not :normal exits.

david_ex · July 5, 2019, 3:25pm

I’m not sure your approach is the best for what you want to achieve (although I’m no expert). You’ll get the pid of the child process from start_child (and could therefore monitor it), but if the task fails and restarts you won’t be able to “re-monitor” it.

If handling the first failure is good enough for you, the basic steps would be

{:ok, pid} = Task.Supervisor.start_child(MyApp.TaskSupervisor, MyTask, :run, [parsed_data])
ref = Process.monitor(pid)
receive do
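  # success => do nothing
  {:DOWN, ^ref, :process, _object, :normal} -> nil
  # failure => log that the task for `file_name` failed due to `reason` (needs `require Logger`)
  {:DOWN, ^ref, :process, _object, reason} ->
    Logger.error("Task for #{file_name} failed: #{inspect(reason)}")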
end
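To match exceptions specifically rather than any non-`:normal` exit: when a task raises, the `:DOWN` reason is an `{exception, stacktrace}` tuple, so it can be pattern matched. Here is a minimal sketch of that, not a definitive implementation. The inline supervisor, the `run_and_watch` helper and the task bodies are made up for illustration, and the `:go` handshake is only there so the monitor is attached before the task does any work.

```elixir
# Sketch only: supervisor started inline; `run_and_watch` and the task bodies are hypothetical.
{:ok, sup} = Task.Supervisor.start_link()

run_and_watch = fn fun ->
  # The task waits for :go so the monitor is in place before any work runs.
  # Monitoring an already dead pid would yield a :DOWN with reason :noproc instead.
  {:ok, pid} =
    Task.Supervisor.start_child(sup, fn ->
      receive do
        :go -> fun.()
      end
    end)

  ref = Process.monitor(pid)
  send(pid, :go)

  receive do
    # Normal completion: nothing to report.
    {:DOWN, ^ref, :process, _pid, :normal} ->
      :ok

    # A raised exception arrives as {exception_struct, stacktrace}.
    {:DOWN, ^ref, :process, _pid, {%{__exception__: true} = exception, _stacktrace}} ->
      {:error, Exception.message(exception)}

    # Any other abnormal exit (exit/1, kill, :noproc, ...).
    {:DOWN, ^ref, :process, _pid, reason} ->
      {:error, inspect(reason)}
  end
end

ok_result = run_and_watch.(fn -> :done end)
error_result = run_and_watch.(fn -> raise ArgumentError, "missing field" end)
```

The stacktrace in the second clause is ignored here, but it could be passed to `Exception.format/3` together with the exception if you want it in the log.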

That said, and depending on what you want to do, you can also use a GenServer and track the state of ongoing tasks. Then, you can use Task.Supervisor.async_nolink to start each task and store its ref:

# in e.g. a handle_call
File.read!(file_name)
|> Parser.parse!()
|> Enum.map(fn parsed_data ->
  %Task{ref: ref} =
    Task.Supervisor.async_nolink(MyApp.TaskSupervisor, MyTask, :run, [parsed_data])

  {ref, parsed_data}
end)
|> Enum.into(state)

Where state is the GenServer state. To handle task results, use handle_info:

def handle_info({task_ref, task_result}, state) when is_reference(task_ref) do
  # task was successfully completed, with `task_result`

  # we don't care about the coming `:DOWN` message for this task (which will have reason `:normal`),
  # so drop the monitor and flush any `:DOWN` already in the mailbox
  # (you could also just have a `handle_info` for `:DOWN` with `:normal` reason and do nothing there)
  Process.demonitor(task_ref, [:flush])
  {:noreply, Map.delete(state, task_ref)}
end

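def handle_info({:DOWN, ref, :process, _object, reason}, state) do
  # the task processing `Map.get(state, ref)` failed: log it and stop tracking it
  # (assumes `require Logger` at the top of the module)
  {parsed_data, state} = Map.pop(state, ref)
  Logger.error("Task for #{inspect(parsed_data)} failed: #{inspect(reason)}")
  {:noreply, state}
end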
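Put together, the GenServer approach could look roughly like the sketch below. This is an assumption-laden illustration rather than a reference implementation: the `TaskTracker` module, its `run/3` and `failures/1` functions, the file-name labels and the inline supervisor are all hypothetical names chosen for the example.

```elixir
defmodule TaskTracker do
  # Hypothetical module: starts tasks with async_nolink and tracks them by monitor ref.
  use GenServer
  require Logger

  def start_link(sup), do: GenServer.start_link(__MODULE__, sup)

  # Start `fun` as a supervised task, remembered under `label`.
  def run(tracker, label, fun), do: GenServer.call(tracker, {:run, label, fun})

  # Labels of the tasks that exited abnormally so far.
  def failures(tracker), do: GenServer.call(tracker, :failures)

  @impl true
  def init(sup), do: {:ok, %{sup: sup, running: %{}, failures: []}}

  @impl true
  def handle_call({:run, label, fun}, _from, state) do
    %Task{ref: ref} = Task.Supervisor.async_nolink(state.sup, fun)
    {:reply, :ok, %{state | running: Map.put(state.running, ref, label)}}
  end

  def handle_call(:failures, _from, state), do: {:reply, state.failures, state}

  @impl true
  def handle_info({ref, _result}, state) when is_reference(ref) do
    # Task replied: drop the monitor and any pending :DOWN, then stop tracking it.
    Process.demonitor(ref, [:flush])
    {:noreply, %{state | running: Map.delete(state.running, ref)}}
  end

  def handle_info({:DOWN, ref, :process, _pid, reason}, state) do
    # Only tasks that crashed before replying reach this clause.
    {label, running} = Map.pop(state.running, ref)
    Logger.error("task #{inspect(label)} failed: #{Exception.format_exit(reason)}")
    {:noreply, %{state | running: running, failures: [label | state.failures]}}
  end
end

{:ok, sup} = Task.Supervisor.start_link()
{:ok, tracker} = TaskTracker.start_link(sup)

:ok = TaskTracker.run(tracker, "good.csv", fn -> :parsed end)
:ok = TaskTracker.run(tracker, "bad.csv", fn -> raise "missing crucial data" end)

# Crude wait so both tasks have time to finish before we ask for the failures.
Process.sleep(200)
failed = TaskTracker.failures(tracker)
```

Because the tasks are started with `async_nolink`, a crashing task does not take the GenServer down with it; the GenServer only sees the `:DOWN` message.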

stefpankov · July 6, 2019, 1:42pm

Thanks for your answer!

Regarding re-monitoring tasks: for now they’re not contacting any external service, so the only reason they fail is missing data that is crucial. That’s why I want to get to the exception that caused the failure and format it nicely with more information, so I know exactly what went wrong. It also means they won’t be restarted after failure, and that works just fine; the only thing I need to set up is the monitoring logic.

I like the GenServer approach and read a suggestion about it somewhere else but with no example of how to do it. This is extremely helpful, I’ll try it and post progress here.

lud · July 6, 2019, 2:46pm

Hello,

Not sure if it would fit with your supervision tree, but sometimes I just use this pattern to handle failing tasks:

t1 =
  Task.async(fn ->
    Process.flag(:trap_exit, true)
    parent = self()

    t2 =
      Task.async(fn ->
        raise "failed"
        # do some heavy stuff
        result = :data
        send(parent, result)
      end)

    receive do
      {:EXIT, _, e} ->
        {:error, e}

      result ->
        Task.await(t2)
        {:ok, result}
    end
  end)

t1
|> Task.await()
|> IO.inspect(pretty: true)

stefpankov · July 8, 2019, 12:38pm

I went with a solution similar to this: a GenServer that acts as a monitor, and it works as expected.

I just wanted to ask you about the reason you used async_nolink instead of start_child.

From what I can gather, the async functions seem to be oriented towards the async/await approach, and start_child seems to be more geared toward the fire-and-forget approach.

david_ex · July 8, 2019, 1:10pm

Only because it’s slightly more useful as it allows the GenServer to do something with the task result (and I had recently done something similar). But if you don’t need that, start_child should work fine also.
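As a rough illustration of the difference (a sketch with an inline supervisor and trivial task bodies, both made up for the example): `start_child` only hands back the pid, while `async_nolink` returns a `%Task{}` whose result is sent back to the caller.

```elixir
# Sketch only: the supervisor is started inline and the task bodies are placeholders.
{:ok, sup} = Task.Supervisor.start_link()

# Fire and forget: we get a pid back, and the task's return value is discarded.
{:ok, pid} = Task.Supervisor.start_child(sup, fn -> 1 + 1 end)

# async_nolink: the caller monitors the task and receives its result as a message,
# which Task.await/1 (or a GenServer handle_info/2) can pick up.
task = Task.Supervisor.async_nolink(sup, fn -> 1 + 1 end)
result = Task.await(task)
```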

stefpankov · July 8, 2019, 2:53pm

Okay, thanks for the answer, this should do the job for now.