Logging when a Task fails due to a raised exception

Hey there,

I’m new to Elixir and OTP and, coming from an OO background, I’m still wrapping my head around Processes, Supervisors, Tasks, etc.

I’m currently working on a little program that has to go through a bunch of files, read from them, parse the data and do something with it.

I’m trying to speed this up by using Tasks: I’ve set up a Task.Supervisor, which I then use to dynamically start Tasks that process the parsed data from each file.

It looks something like this:

def do_the_thing(dir) do
  for file <- File.ls!(dir) do
    file_name = "#{dir}/#{file}"

    File.read!(file_name)
    |> Parser.parse!()
    |> Enum.each(fn parsed_data ->
      Task.Supervisor.start_child(MyApp.TaskSupervisor, MyTask, :run, [
        parsed_data
      ])
    end)
  end
end

Now this works fine, but when an error is raised and MyTask fails, I want to somehow trap the exception and log it to a file with additional metadata so I can see where it failed.

I had a try/rescue block inside the MyTask.run function, but I have a feeling that’s unnecessary, since I’m fine with the Task failing; the ever-vigilant Supervisor handles that for me just fine.

I guess my question is: what’s considered best practice when I want to catch exit reasons and log the ones that interest me? I tried to set up something with Process.monitor but couldn’t really figure out how to match exceptions and not :normal exits.

I’m not sure your approach is the best fit for what you want to achieve (although I’m no expert). You’ll get the pid of the child process from start_child (and could therefore monitor it), but if the task fails and is restarted, you won’t be able to “re-monitor” it.

If handling the first failure is good enough for you, the basic steps would be

{:ok, pid} = Task.Supervisor.start_child(MyApp.TaskSupervisor, MyTask, :run, [parsed_data])
ref = Process.monitor(pid)

receive do
  # success => do nothing
  {:DOWN, ^ref, :process, _pid, :normal} -> nil

  # failure => log it (needs `require Logger` in the enclosing module)
  {:DOWN, ^ref, :process, _pid, reason} ->
    Logger.error("Task for #{file_name} failed: #{inspect(reason)}")
end
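
To make the "match exceptions and not :normal exits" part concrete, here is a minimal sketch you can paste into iex. It assumes an anonymous Task.Supervisor started on the spot and a made-up task that raises ArgumentError, neither of which is from the code above. As far as I know, a task that raises exits with the reason `{exception, stacktrace}`, so you can match on that shape and format it with Exception.format/3. The task waits for a `:go` message only so the monitor is in place before it crashes; if the process were already dead when Process.monitor/1 is called, the reason would be `:noproc` instead.

```elixir
require Logger

{:ok, sup} = Task.Supervisor.start_link()

# Hypothetical task: waits for :go, then raises.
{:ok, pid} =
  Task.Supervisor.start_child(sup, fn ->
    receive do
      :go -> raise ArgumentError, "missing field"
    end
  end)

ref = Process.monitor(pid)
send(pid, :go)

outcome =
  receive do
    # Clean exit: nothing to log.
    {:DOWN, ^ref, :process, _pid, :normal} ->
      :ok

    # A raised exception arrives as {exception, stacktrace}.
    {:DOWN, ^ref, :process, _pid, {exception, stacktrace}} when is_exception(exception) ->
      Logger.error(Exception.format(:error, exception, stacktrace))
      {:error, exception}

    # Any other abnormal exit (exit/1, kill, :noproc, ...).
    {:DOWN, ^ref, :process, _pid, reason} ->
      Logger.error("task exited: " <> Exception.format_exit(reason))
      {:error, reason}
  after
    5_000 -> :timeout
  end
```

The middle clause is where you would add your own metadata (file name, record id and so on) before logging.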

That said, and depending on what you want to do, you can also use a GenServer to track the state of ongoing tasks. You can then use Task.Supervisor.async_nolink to start each task and store its ref:

# in e.g. a handle_call
      File.read!(file_name)
      |> Parser.parse!()
      |> Enum.map(fn parsed_data ->
        %Task{ref: ref} = Task.Supervisor.async_nolink(MyApp.TaskSupervisor, MyTask, :run, [
          parsed_data
        ])
        {ref, parsed_data}
      end)
      |> Enum.into(state)

Where state is the GenServer state. To handle task results, use handle_info:

def handle_info({task_ref, _task_result}, state) when is_reference(task_ref) do
  # task was successfully completed, with result `_task_result`

  # we don't care about the coming `:DOWN` message for this task (which will have reason `:normal`);
  # you could also just have a `handle_info` for `:DOWN` with `:normal` reason and do nothing there
  Process.demonitor(task_ref, [:flush])
  {:noreply, Map.delete(state, task_ref)}
end

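def handle_info({:DOWN, ref, :process, _pid, reason}, state) do
  # log the fact that the task processing `Map.get(state, ref)` failed
  # (needs `require Logger` in the GenServer module)
  Logger.error("Task for #{inspect(Map.get(state, ref))} failed: #{inspect(reason)}")

  # the task is finished, so stop tracking it
  {:noreply, Map.delete(state, ref)}
end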
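
Putting those pieces together, a self-contained version might look roughly like the sketch below. The module name `TaskMonitor`, the `run/3` and `failures/1` functions, and the `meta` argument are all made up for illustration (`failures/1` exists only so you can inspect what was recorded); adapt them to your own app.

```elixir
defmodule TaskMonitor do
  # Hypothetical module: starts tasks under a Task.Supervisor and logs failures.
  use GenServer
  require Logger

  def start_link(supervisor), do: GenServer.start_link(__MODULE__, supervisor)

  # Runs `fun` as a task; `meta` is whatever you want to see in the log.
  def run(server, meta, fun), do: GenServer.call(server, {:run, meta, fun})

  # Returns the recorded {meta, reason} pairs (for inspection only).
  def failures(server), do: GenServer.call(server, :failures)

  @impl true
  def init(supervisor), do: {:ok, %{sup: supervisor, tasks: %{}, failures: []}}

  @impl true
  def handle_call({:run, meta, fun}, _from, state) do
    # The GenServer owns the task, so replies and :DOWN messages arrive here.
    %Task{ref: ref} = Task.Supervisor.async_nolink(state.sup, fun)
    {:reply, :ok, put_in(state.tasks[ref], meta)}
  end

  def handle_call(:failures, _from, state), do: {:reply, state.failures, state}

  @impl true
  def handle_info({ref, _result}, state) when is_reference(ref) do
    # Success: drop the monitor and the pending :DOWN message.
    Process.demonitor(ref, [:flush])
    {:noreply, %{state | tasks: Map.delete(state.tasks, ref)}}
  end

  def handle_info({:DOWN, ref, :process, _pid, reason}, state) do
    # Failure: log it together with the metadata stored for this task.
    {meta, tasks} = Map.pop(state.tasks, ref)
    Logger.error("task #{inspect(meta)} failed: #{Exception.format_exit(reason)}")
    {:noreply, %{state | tasks: tasks, failures: [{meta, reason} | state.failures]}}
  end
end
```

Usage would be something like `{:ok, sup} = Task.Supervisor.start_link()`, then `{:ok, mon} = TaskMonitor.start_link(sup)`, then `TaskMonitor.run(mon, file_name, fn -> MyTask.run(parsed_data) end)`. Keeping the metadata keyed by the task ref is what lets the `:DOWN` handler say which input failed.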

Thanks for your answer!

Regarding re-monitoring tasks: for now, they’re not contacting any external service, so the only reason for failure is missing data that is crucial. That’s why I want to get at the exception that caused the failure and format it nicely with more information, so I know exactly what went wrong. It also means the tasks won’t be restarted after a failure, which works just fine; the only thing I need to set up is the monitoring logic.

I like the GenServer approach and read a suggestion about it somewhere else but with no example of how to do it. This is extremely helpful, I’ll try it and post progress here.

Hello,

Not sure if it would fit with your supervision tree, but sometimes I just use this pattern to handle failing tasks:

t1 =
  Task.async(fn ->
    Process.flag(:trap_exit, true)
    parent = self()

    t2 =
      Task.async(fn ->
        raise "failed"
        # do some heavy stuff
        result = :data
        send(parent, result)
      end)

    receive do
      {:EXIT, _, e} ->
        {:error, e}

      result ->
        Task.await(t2)
        {:ok, result}
    end
  end)

t1
|> Task.await()
|> IO.inspect(pretty: true)

I went with a solution similar to this, a GenServer that acts as a monitor and it works as expected.

I just wanted to ask you about the reason you used async_nolink instead of start_child.

From what I can gather, the async functions seem to be oriented towards the async/await approach, while start_child seems to be geared more toward fire and forget.

Only because it’s slightly more useful: it allows the GenServer to do something with the task result (and I had recently done something similar). But if you don’t need that, start_child should work fine too.
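
For comparison, here is a small sketch of what each call gives back, assuming an anonymous Task.Supervisor and a trivial `1 + 1` task (both just for illustration). With start_child you only get the pid and the return value is discarded; with async_nolink the caller is sent a `{ref, result}` message on success or a `:DOWN` message on failure.

```elixir
{:ok, sup} = Task.Supervisor.start_link()

# start_child/2: fire and forget. You get the pid; the result of the
# function is not sent anywhere.
{:ok, pid} = Task.Supervisor.start_child(sup, fn -> 1 + 1 end)

# async_nolink/2: returns a %Task{} whose ref identifies the reply message.
%Task{ref: ref} = Task.Supervisor.async_nolink(sup, fn -> 1 + 1 end)

result =
  receive do
    # Success: the task's return value arrives tagged with its ref.
    {^ref, value} ->
      Process.demonitor(ref, [:flush])
      {:ok, value}

    # Failure: the task exited before replying.
    {:DOWN, ^ref, :process, _pid, reason} ->
      {:error, reason}
  after
    1_000 -> :timeout
  end
```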

Okay, thanks for the answer, this should do the job for now :smile: