Does Phoenix.Ecto.CheckRepoStatus (callling Ecto.Migrator.migrations/1) need to run under an ecto migration lock?

cgrothaus · December 1, 2022, 9:29pm

Using :pg_advisory_lock ecto migration lock strategy can cause long phoenix request delays (5s ) in phoenix dev environment. This is due to Phoenix.Ecto.CheckRepoStatus calling Ecto.Migrator.migrations/1, which runs under a migration lock.

Question: is it really necessary that this call chain runs under a migration lock? In most cases, it will just detect that there is nothing to do. And hey, it’s the dev environment we’re talking about, normally there should be no concurrent migrations.

To give you more context, let me explain in more detail: A standard phoenix web app contains a Phoenix.Ecto.CheckRepoStatus plug entry in its endpoint.ex that’s active in dev environment. The plug is a convenience that checks if there are pending migrations. It calls Ecto.Migrator.migrations/1, which obtains the ecto migration lock. Under the :pg_advisory_lock strategy, when “Ecto cannot obtain the lock due to another instance occupying the lock, Ecto will wait for 5 seconds”. Oh oh. In dev environment, with CheckRepoStatus plug, you just need two web requests to hit your phoenix server at the very same time to get into that situation. We stumbled into it with a JS app that, after load, fired two simultaneous GraphQL requests. One of them would take 5s. We worked around it by tuning down the :migration_advisory_lock_retry_interval_ms setting from its default value of 5000 ms to 10 ms (just in dev). But maybe there are better options? It would be great if Ecto.Migrator.migrations/1 had an escape hatch so it didn’t need a lock.

joey_the_snake · December 1, 2022, 11:27pm

(edited original reply, I didn’t realize it’s just checking pending migrations and not running them)

Barring changes to Ecto, you could use the regular table lock for dev. It will simply wait until the lock is released instead retrying with backoff.

al2o3cr · December 1, 2022, 11:53pm

On the same hand, it’s the dev environment and normally there shouldn’t be any concurrent indexes being built either (one of the things pg_advisory_lock enables). What’s the motivation behind using :pg_advisory_lock in development?

cgrothaus · December 2, 2022, 7:07am

This was indeed my first attempt, to use :table_lock strategy in dev environment. And it worked fine!

It’s just because a colleague said “What about migrations that add an index concurrently” (like in GitHub - fly-apps/safe-ecto-migrations: Guide to Safe Ecto Migrations) that I switched to my current workaround. I just feared that such migrations could conflict with the table lock strategy, but I have to admit I didn’t try it out.

cgrothaus · December 2, 2022, 7:13am

We enabled the :pg_advisory_lock strategy for our elixir applications (like explained here and here) as a best practice, regardless whether they actually make use of concurrently creating indexes or not. We had not expected any downsides, but this assumption has now been proven wrong.

cgrothaus · December 2, 2022, 8:18am

OK, I did a quick check: when a migration concurrently adds an index, using the :table_lock strategy does not work. The migration adds the index, but later operations (presumably inserting the migration version into the schema_migrations table) are deadlocked and the whole process hangs forever.

joey_the_snake · December 2, 2022, 8:42am

I’m a bit confused. I thought in your original post you wanted no lock because you’re not worrying about concurrent migrations in dev.

cgrothaus · December 2, 2022, 9:21am

Sorry for the confusion. We’re mixing up 2 aspects of the problem.

Does Phoenix.Ecto.CheckRepoStatus need to run under a migration lock? In most cases it’s a pure read operation (when there are no pending migrations). It would be great if the developers of phoenix_ecto and ecto_sql could change the code to not need a lock.
When there are migrations to run, Ecto needs a migration lock, that’s out of question. Now which one is better? If you have migrations that concurrently add indexes, and if you do not want to completely disable the migration lock in the migration (@disable_migration_lock true) but instead choose to configure :pg_advisory_lock in config.exs (as advised in the “Safe Ecto Migrations guide”), then you cannot opt out of that for the dev environment (by configuring :table_lock in dev.exs) anymore, that doesn’t work.

Does this clear up your confusion?

LostKobrakai · December 2, 2022, 9:33am

I think what you’re running against is that checking if there are pending migrations already uses a lock. I’d probably inquire in the ecto mailing list why this is the case and if it can be changed. The mailing list is the best way of getting a hold of the ecto maintainers. Looking at the latest changes to the API suggest there might be a good reason for how this currently works – even if not for the usecase the API is used within CheckRepoStatus.

joey_the_snake · December 2, 2022, 9:54am

Thanks that clears it up. So I believe you can already disable migrations locks by using migration_lock: false. If I’m correct then maybe ecto_phoenix can be extended to allow that option to be passed into the plug. Then you can still use the option you want for your migrations in your config.

cgrothaus · December 2, 2022, 10:56am

Oh, that’s interesting. It didn’t occur to me that you simply can turn off migration locking. It works! Just a minor drawback: a migration that concurrently adds an index prints out a warning in this scenario.

It would be great if the CheckRepoStatus plug used this option.

joey_the_snake · December 2, 2022, 6:37pm

your wish was granted Allow migration_lock to be specified in check_repo_status by greg-rychlewski · Pull Request #160 · phoenixframework/phoenix_ecto · GitHub

cgrothaus · December 4, 2022, 7:59am

Wow, that went quick. This is good news. What a great community this is .