There is another possibility no one mentioned. You can turn the database table into a work queue itself.
I am currently in a process of doing so on a project, where we initially had the solution @josevalim presented here, i.e. Transaction + Repo.stream + async workers, but we’re ditching it for the reasons of keeping the connection while it’s not necessary to, but also for the reason that this causes race condition if you have 2+ nodes on the cluster. So you have to make sure this process that iterates over database records is initialized on one of the nodes, which requires something like global registration and just complicates things a lot.
So, in your case you could, I think, simply use Honeydew + Ecto queue. https://github.com/koudelka/honeydew I think @koudelka could confirm since he’s on the forum, but that is going to work in your case, and I think pretty well. What it does is adds couple of fields to the table and then does a smart update to lock the next available row, it can also handle retries and failed jobs etc. fairly easily.
We actually used Honeydew on the same project before, replaced it with Exq for the most part, but we are re-introducing Honeydew-like behavior by doing what it does, just slightly differently. We need a bit more flexibility in how and which records we pick, and Honeydew really assumes you want to process all records one by one after they been inserted, and we need something different: processing subset of records at undefined moment in time in future when some conditions change (account subscription status to be precise). So we will not use Honeydew but basically do the same it does in order to lock rows and go one by one through the queue.
For the reference, these are the building blocks you need to build the queue:
https://dba.stackexchange.com/a/69497 - check out the section that talks about “FOR UPDATE SKIP LOCKED” as this is crucial to building queue system on top of PostgreSQL without race conditions and deadlocks, and also how they do it in Honeydew
And if you have a process that goes like that over list of rows from database, fetching and locking one after another - there is no reason why you can’t make this parallel also throughout the cluster. You just start a supervisor that starts say 5 workers on each node on the cluster, and you can have 5* N nodes workers without actually blocking 5 * N database connections.
I will probably put together a blog post once this is running on our end, but that won’t be today