Async work during Ecto Repo.transaction

Billzabob · April 28, 2022, 10:48pm

During a database transaction, we are spawning a Task. The Task ends up querying the database itself, which leads to a race condition: the transaction may or may not have been committed by the time the Task runs its query.

One solution here is to perform the query before the Task and pass its results into the Task. That way the query is run in the transaction and there is no race condition. The problem here is that we moved the query into the Task for performance reasons; Doing the query synchronously was causing timeouts.

We have tried to figure out a way to have the Task poll the transaction status in the main process and only run once it’s complete, but that seems to be impossible since that info is stored on the process dictionary of the main process.

The way we are currently leaning is to just put a Process.sleep call at the beginning of the Task so that the transaction has time to finish and commit before the Task runs, but that obviously feels super dirty.

Any other suggestions?

mbuhot · April 28, 2022, 11:06pm

The task could use a blocking receive to wait for a message from the original process before running the query.

Alternatively it might be appropriate to switch to a supervised GenServer with a handle_cast.

Billzabob · April 28, 2022, 11:13pm

The problem with that is that when we’re spawning the Task we are very deep into the callstack and the transaction is started near the top. We’d have to return the pid of the Task all the way back up so that we can send to it once the transaction is finished, which feels a little messy, especially since we don’t own a lot of the calling code.

mbuhot · April 28, 2022, 11:29pm

If you’re using Postgres, you could use Listen/Notify to wait for the transaction to complete.

NOTIFY interacts with SQL transactions in some important ways. Firstly, if a NOTIFY is executed inside a transaction, the notify events are not delivered until and unless the transaction is committed

https://hexdocs.pm/postgrex/Postgrex.Notifications.html

benwilson512 · April 29, 2022, 12:40am

This sounds like you want something like Oban, where instead of spawning a task, you would enqueue a job, and then when the transaction commits, that job can be run.

outlog · April 29, 2022, 1:13am

+1 for oban