Using Ecto to run a long-running multi-process transaction

TL;DR the short answer is no, the long answer is yes

There is not any concurrency when doing database calls, they block and are applied in strict order. Actually the calling process accesses the database connection socket directly (and not via a connection process). This is very efficient as it minimises copying and message passing, and garbage collection from one transaction does not effect other callers.

Transactions combined with pooling is a difficult problem. We need to ensure that a lock is held on a database connection while the transaction takes place and only the desired process(es) access that database connection, begin is always run first, commit/rollback is always called last, and the database connection is released eventually. We can provide these guarantees if the transaction occurs inside a single function call. Even in this situation the error handling is non-trivial. I think it is unrealistic to expect most users to handle this correctly and I would not be confident to handle this myself.

This means that only one process can access a process from the pool at a time and a transaction must be run side a single function call.

This approach is very difficult because :gen_server.enter_loop never returns and if the GenServer hibernates the call stack is thrown away - so the transaction would never be committed. Of course it is possible to wrap the enter_loop in a try and catch a :normal exit, just make sure never to hibernate.

As explained above I don’t think it is advisable to provide this. However you may have noticed that Ecto.Adapters.SQL.Sandbox provides a similar mechanism. A lock is acquired on a connection and this process (and possibly others) can access it many times. Of course we wouldn’t want the transaction/savepoint semantics of the sandbox. Fortunately this pool is powered at the low level by the DBConnection.Ownership pool, which does not apply the transaction/savepoint semantics. It would be possible to either use the ownership based pool, or build on it in a similar way to the sandbox does to provide guarantees on the transaction. I think the later would be preferable because it makes it easier to guarantee the transaction semantics. This could be achieved by copying the sandbox implementation and altering the ownership check in slightly such that there are 2 different ownership checkins, commit and rollback.

We would need to resolve this in DBConnection before Ecto. Pooling is separate from transaction handling there. Hopefully something related will appear before my Elixir Conf EU talk ;).

3 Likes