Retrying operation, but how

It is unclear how the action might fail, but if it’s a failure you expect to happen now and then (for example, a request to an external service timing out), then I’d say rescuing the expected exception and retrying, perhaps with some delay, would be the way to go.

Supervisors are more appropriate for recovering from unexpected bugs, and I’d say they make more sense for server processes (GenServer and friends). Such processes are more like internal services which respond to various requests. Due to some bug, they might occasionally fail, but after restarting they will probably work again.

In contrast, what you describe is more of a one-off job. It takes some input, does some processing, produces the output and stops. Hence, if there’s a bug, restarting won’t really help you because you’ll start with the same input which will lead you to the same failure.

However, as I said, there might be some expected failures, such as a database or some other external service not responding because of a brief network outage or because the other service is overloaded. By rescuing the expected error, you can explicitly retry and even implement growing retry delays.
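To make the retry idea a bit more concrete, here is a minimal sketch of what I mean. The `Retry` module, the `with_retry/2` function, and its options (`:attempts`, `:delay`, `:rescue_only`) are all made up for illustration and are not part of any library. It rescues only the exception types you list, doubles the delay after each failed attempt, and re-raises anything unexpected.

```elixir
defmodule Retry do
  # Hypothetical helper, not a library API.
  # Runs `fun`, retrying only when it raises one of the expected exceptions.
  # Options (all assumptions for this sketch):
  #   :attempts    - total number of tries (default 3)
  #   :delay       - initial delay in ms, doubled after each failure (default 100)
  #   :rescue_only - list of exception modules considered "expected"
  def with_retry(fun, opts \\ []) do
    attempts = Keyword.get(opts, :attempts, 3)
    delay = Keyword.get(opts, :delay, 100)
    expected = Keyword.get(opts, :rescue_only, [RuntimeError])
    do_retry(fun, attempts, delay, expected)
  end

  defp do_retry(fun, attempts_left, delay, expected) do
    {:ok, fun.()}
  rescue
    error ->
      cond do
        # Unexpected exception: likely a bug, so let it crash as usual.
        error.__struct__ not in expected ->
          reraise(error, __STACKTRACE__)

        # Out of attempts: give up and report the last error.
        attempts_left <= 1 ->
          {:error, error}

        # Expected failure: wait, then try again with a longer delay.
        true ->
          Process.sleep(delay)
          do_retry(fun, attempts_left - 1, delay * 2, expected)
      end
  end
end
```

You would call it with something like `Retry.with_retry(fn -> fetch_from_service() end, attempts: 5, delay: 200)`, where `fetch_from_service/0` stands in for whatever your job does. If the external call returns `{:error, reason}` tuples instead of raising, the same shape works with a `case` in place of `rescue`.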

It’s also unclear whether the Phoenix controller needs to wait for the result of the job. If yes, then I’d just run the job in the same process. Otherwise, I’d start a Task under some supervisor and immediately return the response (e.g. `status: :queued`) from the controller action.
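For the fire-and-forget case, a rough sketch could look like the following. The `JobRunner` module and the `JobRunner.TaskSupervisor` name are assumptions for the example. In a real app the `Task.Supervisor` child would go into your application's supervision tree rather than being started inline as it is here.

```elixir
defmodule JobRunner do
  # Hypothetical module; the supervisor name is an assumption for this sketch.
  @supervisor JobRunner.TaskSupervisor

  # Starts the job in its own supervised process and returns right away,
  # so the caller (e.g. a controller action) does not wait for the result.
  def enqueue(job_fun) when is_function(job_fun, 0) do
    {:ok, _pid} = Task.Supervisor.start_child(@supervisor, job_fun)
    :queued
  end
end

# Normally this child spec lives in the application's supervision tree.
{:ok, _sup} =
  Supervisor.start_link(
    [{Task.Supervisor, name: JobRunner.TaskSupervisor}],
    strategy: :one_for_one
  )

# What the controller action would do before rendering its response:
status = JobRunner.enqueue(fn -> IO.puts("job running") end)
IO.inspect(status)
```

Tasks started with `Task.Supervisor.start_child/2` are `:temporary` by default, so a crashed job is not restarted automatically. That fits the reasoning above: restarting a one-off job with the same input would only hit the same bug again, while any retrying of expected failures happens explicitly inside the job.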
