I am extending an existing application to let users write and send newsletters, and I am struggling to determine a proper supervision tree (I’m relatively new to OTP, so please bear with me).
Overview
Users can create a single Newsletter, which has “subscribers” via a Subscription schema. A user may write multiple Letters within a given newsletter, each of which may be sent to a subset of their subscribers, which I’ll call “recipients”.
After a user writes a Letter and goes to send it, the plan is to kick off a background job to handle the sending of letters to all recipients.
First Attempt
- Create a Task.Supervisor called App.MailmanSupervisor and add it to the top-level Application supervisor as a child.
- Create an App.LetterDeliveryDispatcher, which is a DynamicSupervisor that will supervise individual App.LetterDeliveryServices, one for each Letter to send. This is also started by the top-level Application.
- App.LetterDeliveryService is a GenServer that manages sending a single Letter to all specified recipients. These are dynamically created (by calling App.LetterDeliveryDispatcher.add_child) when it is time to send a given letter.
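A minimal sketch of this tree, using the module names above. The standalone Supervisor.start_link call is only a stand-in for the existing application supervisor, the GenServer body is a placeholder, and the exact shape of add_child/1 is an assumption on my part.

```elixir
defmodule App.LetterDeliveryService do
  # Placeholder GenServer: one process per Letter being sent.
  # restart: :transient means a :normal exit is not restarted.
  use GenServer, restart: :transient

  def start_link(letter_id), do: GenServer.start_link(__MODULE__, letter_id)

  @impl true
  def init(letter_id), do: {:ok, %{letter_id: letter_id, deliveries: %{}}}
end

defmodule App.LetterDeliveryDispatcher do
  use DynamicSupervisor

  def start_link(opts) do
    DynamicSupervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Starts one LetterDeliveryService for the given letter id.
  def add_child(letter_id) do
    DynamicSupervisor.start_child(__MODULE__, {App.LetterDeliveryService, letter_id})
  end

  @impl true
  def init(_opts), do: DynamicSupervisor.init(strategy: :one_for_one)
end

# In the real application these two entries would be added to the
# children list in the existing Application.start/2 callback.
children = [
  {Task.Supervisor, name: App.MailmanSupervisor},
  App.LetterDeliveryDispatcher
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

# When a user sends letter 42 (an arbitrary example id):
{:ok, pid} = App.LetterDeliveryDispatcher.add_child(42)
```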
The App.LetterDeliveryService
This GenServer would take the Letter.id as a parameter to start_link/1 and add a Map called deliveries to its state in init/1 to keep track of the status of individual deliveries.
Then, to handle the sending of letters to individual recipients, use Task.Supervisor.async_stream_nolink/4 from a handle_continue/2 callback. Process and filter the results, handling {:exit, reason} tuples (I’m using zip_input_on_exit with async_stream_nolink) so that failed tasks include the input they were given. Ultimately, update the deliveries map in the state with: {:stop, :normal, %{state | deliveries: deliveries}} based on these return values, which would yield a map looking like:
%{
  1 => :pending,
  2 => {:delivered, 34},
  3 => {:failed, :timeout}
}
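Roughly, the GenServer described above could look like the sketch below. Several things here are assumptions for illustration: start_link takes a tuple rather than just the Letter.id so that the example runs without a database, deliver_fun is a hypothetical stand-in for the per-recipient work, run_deliveries/3 is a helper name I made up, and the 5 second timeout is arbitrary. The zip_input_on_exit option requires Elixir 1.14 or later.

```elixir
defmodule App.LetterDeliveryService do
  use GenServer, restart: :transient

  # deliver_fun is a hypothetical stand-in for the per-recipient work: it takes
  # a recipient id and returns the id of the NewsletterEmail it created.
  def start_link({letter_id, recipient_ids, deliver_fun}) do
    GenServer.start_link(__MODULE__, {letter_id, recipient_ids, deliver_fun})
  end

  @impl true
  def init({letter_id, recipient_ids, deliver_fun}) do
    # Every recipient starts out as :pending.
    deliveries = Map.new(recipient_ids, &{&1, :pending})
    state = %{letter_id: letter_id, deliveries: deliveries, deliver_fun: deliver_fun}
    {:ok, state, {:continue, :deliver}}
  end

  @impl true
  def handle_continue(:deliver, state) do
    results = run_deliveries(Map.keys(state.deliveries), state.deliver_fun)
    {:stop, :normal, %{state | deliveries: Map.merge(state.deliveries, results)}}
  end

  # Runs one task per recipient under App.MailmanSupervisor and returns a map
  # of recipient id => {:delivered, email_id} | {:failed, reason}.
  def run_deliveries(recipient_ids, deliver_fun, timeout \\ 5_000) do
    App.MailmanSupervisor
    |> Task.Supervisor.async_stream_nolink(
      recipient_ids,
      # Return the id with the result so successes can be matched to recipients.
      fn id -> {id, deliver_fun.(id)} end,
      timeout: timeout,
      on_timeout: :kill_task,
      # On failure the stream yields {:exit, {input, reason}}.
      zip_input_on_exit: true
    )
    |> Map.new(fn
      {:ok, {id, email_id}} -> {id, {:delivered, email_id}}
      {:exit, {id, reason}} -> {id, {:failed, reason}}
    end)
  end
end

# Usage sketch: App.MailmanSupervisor would normally be started by the application.
{:ok, _} = Task.Supervisor.start_link(name: App.MailmanSupervisor)
deliveries = App.LetterDeliveryService.run_deliveries([1, 2], fn id -> id + 33 end)
```

Since the tasks are not linked, a crash or timeout in one of them shows up as an {:exit, _} entry in the stream instead of taking the GenServer down.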
The individual Task workers would receive a recipient id and they would:
- Query to get the full Subscription record using the recipient id
- Determine if and how to send to this subscriber based on some business logic
- Send the letter to the recipient if applicable
- Create a NewsletterEmail database record with foreign keys to both the Subscription and the Letter, along with some other metadata related to the sent email.
Retries
One disadvantage of using a Task.Supervisor is that there isn’t any retry logic for individual worker tasks (the ones sending emails and doing database queries and updates). So, I was thinking of using the restart logic of the LetterDeliveryService GenServer by setting its restart option to :transient and exiting with a reason other than :normal if any of its child Tasks return {:exit, _}, so that it is restarted by the LetterDeliveryDispatcher.
The key here is that I’d save the deliveries state with this exit signal in terminate/2. I was thinking of doing this in a jsonb column on Letter called mail_order, which would hold information about the current delivery.
When the LetterDeliveryService is restarted (recall that it takes a Letter.id) I would then check the letter’s mail_order field and see if any of the letters still need to be sent. If not, I’d terminate with :normal and if so, I’d try to send those letters again, repeating the process.
I could also easily amend deliveries (which are persisted in the mail_order column on a Letter) to hold information on the number of retries, so I could control how many times a delivery is retried.
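To make the bookkeeping concrete, here is a rough sketch of how the retry decision could be computed from the persisted deliveries. The App.MailOrder module, its function names, the entry shape, the limit of 3 attempts, and the :retry_needed exit reason are all hypothetical. It works on in-memory data only; encoding to and from the jsonb column is left out.

```elixir
defmodule App.MailOrder do
  # Hypothetical cap on attempts per recipient.
  @max_attempts 3

  # Builds the initial structure: recipient id => %{status: ..., attempts: n}.
  def new(recipient_ids) do
    Map.new(recipient_ids, &{&1, %{status: :pending, attempts: 0}})
  end

  # Records the outcome of one attempt for a recipient.
  def record(mail_order, recipient_id, result) do
    Map.update!(mail_order, recipient_id, fn entry ->
      %{entry | status: result, attempts: entry.attempts + 1}
    end)
  end

  # Recipient ids that still need a send attempt after a restart.
  def to_send(mail_order) do
    for {id, %{status: status, attempts: attempts}} <- mail_order,
        retryable?(status, attempts),
        do: id
  end

  # Exit reason for the GenServer: :normal stops a :transient child for good,
  # anything else asks the supervisor to restart it.
  def exit_reason(mail_order) do
    if to_send(mail_order) == [], do: :normal, else: :retry_needed
  end

  defp retryable?(:pending, _attempts), do: true
  defp retryable?({:failed, _reason}, attempts), do: attempts < @max_attempts
  defp retryable?({:delivered, _email_id}, _attempts), do: false
end

# Usage sketch with three recipients.
order =
  App.MailOrder.new([1, 2, 3])
  |> App.MailOrder.record(1, {:delivered, 34})
  |> App.MailOrder.record(2, {:failed, :timeout})
```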
Questions
- First, this is my first serious attempt at using various OTP behaviours, so I’m not sure whether this is a good approach or whether I’m overthinking things. Would you tackle this in another way?

  I thought I should have a long-running GenServer (LetterDeliveryService) per letter to track the sending to all of its recipients, but I could also imagine a scenario in which the mailing to individual recipients is handled in one sort of “queue” irrespective of the Letter they are responsible for delivering.

- If the approach is mostly correct, should I use the async_stream_nolink method I described above, handling all results once they come in, or would it be better to use async_nolink and use the handle_info callbacks of the LetterDeliveryService GenServer?

- Is there a better way to handle retry logic?

- Anything else I’m missing?
Finally, I am aware that there are third-party libraries like Oban and Parent, but I’d like to try to do this without relying on third-party libraries unless it really makes sense to include them. I want to make sure I understand how to architect such a system with OTP before integrating such libraries. But if there is a very strong case that one of those (or another library) is really what I should be using, then of course I will use it.