Ecto Dynamic Repositories & Concurrency

sbuttgereit · October 23, 2021, 2:31am

Hi–

I’m in the early stages of building an application that will include multitenancy. Given the overall design considerations, the Ecto features allowing for multiple dynamic repos look very workable for our case. For data, we’ll only be targeting PostgreSQL and using individual databases in a PostgreSQL cluster for each tenant.

The big point that’s unclear to me right now is whether the Ecto implementation is subject to race conditions when concurrent processes accessing the database try to set different repos via MyApp.Repo.put_dynamic_repo/1. The documentation says:

“From this moment on, all future queries done by the current process will run on [repo].”

But to me it’s unclear what “current process” means here. The current process that made the put_dynamic_repo call or the process related to MyApp.Repo?

So for example, let’s say I have a process that is booking a set of transactions for Tenant A… it starts off by calling put_dynamic_repo(:tenant_a) and then proceeds to process its queries… but midway through those steps, a second process comes along with work for Tenant B. That process calls put_dynamic_repo(:tenant_b) and starts its work. Assuming that the process for Tenant B makes its put_dynamic_repo call before the process for Tenant A is finished… does the process for Tenant A start making calls to the repo :tenant_b? Or do these processes necessarily need to be serialized in order to ensure correct execution of the database queries?

I couldn’t see it clearly enough stated for a newbie such as myself to know with certainty how that’s managed or if it’s managed. I saw a topic talking about what I’m doing here (Multitenancy with Ecto), but it didn’t address this issue. There’s enough stuff going on here in the code, and I’m sufficiently inexperienced with Elixir, that I thought asking here was probably best.

Thanks,
SCB

josevalim · October 23, 2021, 6:44am

Each request is handled by a separate process. This is not the operating system process but a lightweight thread of execution in the VM. You can literally have hundreds of thousands of those.

Processes are isolated and run concurrently. When you set the dynamic repo for one process, it does not affect any other process/request whatsoever. There is no serialization or race condition.

The information is stored on what is called “the process dictionary”. Depending on your background, it would be equivalent to “thread local variables”.
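
A minimal sketch of that isolation, using only the standard library and no Ecto. FakeRepo is a hypothetical stand-in written for this illustration; it only mimics the idea of keeping the repo name in the process dictionary and should not be read as Ecto's actual implementation.

```elixir
# Hypothetical stand-in for a repo module (an assumption for illustration only).
# It keeps the chosen repo name in the calling process's dictionary.
defmodule FakeRepo do
  @key {__MODULE__, :dynamic_repo}

  # Store the repo name for the calling process only.
  def put_dynamic_repo(name), do: Process.put(@key, name)

  # Read the calling process's value, falling back to the module name
  # when this process never set one.
  def get_dynamic_repo, do: Process.get(@key, __MODULE__)
end

# Start a task that sets its own tenant, waits so the other task has time
# to set a different one, then reads the value back.
run = fn tenant ->
  Task.async(fn ->
    FakeRepo.put_dynamic_repo(tenant)
    Process.sleep(50)
    {tenant, FakeRepo.get_dynamic_repo()}
  end)
end

results = Task.await_many([run.(:tenant_a), run.(:tenant_b)])

# Each task sees only the value it stored itself.
IO.inspect(results)

# The calling process never set a value, so it still gets the default.
IO.inspect(FakeRepo.get_dynamic_repo())
```

Both tasks overlap in time, yet each one reads back its own tenant, and the process that started them is unaffected.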

sbuttgereit · October 23, 2021, 12:32pm

Many thanks for helping me with this question.

I’m thinking about this purely in terms of Elixir/BEAM processes. The concern came from seeing a pattern, generically speaking, where a primary action (put_dynamic_repo) sets state somewhere that subsequent actions (e.g. `MyApp.Repo.all(query)`) depend on to determine their behavior.

The information is stored on what is called “the process dictionary”. Depending on your background, it would be equivalent to “thread local variables”.

I think this is the crux of what I was hoping for. So long as the effects of put_dynamic_repo are sufficiently isolated/local so that concurrent processes all making various MyApp.Repo calls only see the repo they set with put_dynamic_repo, then life is happy. Sounds like this is the case.

And I would assume that all that had been worked out… but while I’m an Elixir newbie, I’ve got enough overall experience to know that these things shouldn’t be assumed!

Now the only sticking point is that this functionality is still marked “experimental” …

Thanks again,
SCB

josevalim · October 23, 2021, 3:01pm

I believe we removed the note in master (with no changes to the API).