Writing a new Ecto Adapter (not SQL), unable to use Ecto Migrations effectively

jstimps · February 14, 2024, 11:51pm

Hi,

I’m working on an Ecto Adapter for FoundationDB and one of my goals is to support migrations in the same manner that Ecto users are accustomed to (e.g. mix ecto.migrate). As I familiarized myself with the adapter behaviours, I learned that the Ecto.Adapter.Migration behaviour is a module in ecto_sql instead of the base ecto.

My first question: Is it reasonable or folly to continue developing migration support using ecto_sql as a dependency even though my database backend does not support SQL?

Of course I’m no stranger to folly, and I went ahead and tried it anyway, and got it mostly working. The rest of this post is related to the one blocker I had in getting it working seamlessly.

My desired implementation for Ecto.Adapter.Migration.lock_for_migrations/3 acquires a resource (a transaction reference from :erlfdb) that must then be passed down to each execute_ddl call. Ideally ecto_sql would support this directly by allowing the behaviour to influence any later Repo operations during the execution of the migration.

Without this direct support from the library, a quick and dirty approach would be to store the reference in the Process dictionary, and then pick it back up later on in execute_ddl. However, this doesn’t work either because of the following Task creation:

Ecto.Migrator

    fn -> run_maybe_in_transaction(repo, dynamic_repo, module, fun_with_status, opts) end
    |> Task.async()
    |> Task.await(:infinity)

(The use of a Task here ensures that the rest of the migration steps have a fresh Process dictionary, preventing me from using it to keep track of data)
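
To illustrate the point about the Task, here is a minimal sketch using only the standard library. It is not taken from ecto_sql, and the key name :fdb_tx is a hypothetical placeholder for wherever the transaction reference would be stored:

```elixir
# Hypothetical key and value, used only for this illustration.
Process.put(:fdb_tx, :some_transaction_ref)

# A Task runs in a new process, so a value stored in the caller's
# process dictionary is not visible from inside it.
seen_in_task =
  fn -> Process.get(:fdb_tx) end
  |> Task.async()
  |> Task.await(:infinity)

# The value is still visible in the calling process.
seen_in_caller = Process.get(:fdb_tx)

IO.inspect({seen_in_task, seen_in_caller})
# prints {nil, :some_transaction_ref}
```

The same thing would happen to a reference stored during lock_for_migrations/3 and read back in execute_ddl, assuming the two run on opposite sides of that Task boundary.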

My second question: Why does ecto_sql use a Task in this manner?

Finally: Are there any other solutions that I’m missing without changing the underlying ecto_sql implementation? I’m reluctant to use global state such as an ets table unless it’s a last resort.

Thanks for your time!

Jesse

Schultzer · February 15, 2024, 2:10am

I managed to get it working with the process dictionary in ecto_qlc, but the approach feels hacky at best. For the longest time I have wanted to raise the question of moving Ecto migrations into ecto as a behaviour and having adapters implement it as they see fit, but other projects have been taking all my time, and I’m uncertain whether the core team would even entertain the idea, since they already split ecto once. Still, having migrations in ecto_sql doesn’t sit right with me, and I think we could greatly improve them and their API to accommodate database backends that are not SQL.

jstimps · February 15, 2024, 2:46am

For migrations, I see that you are using Erlang distributed node locking in ecto_qlc with :global, which is clever, and a solution I hadn’t considered. I think it’s fair to require nodes to be clustered for safe migrations, as that will be the common use case anyway.
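
For readers unfamiliar with :global, a lock of this kind can be sketched with :global.trans/2 from the Erlang standard library. This is only an illustration under assumptions, not the code used in ecto_qlc; the lock name :my_adapter_migration_lock and the :migrated return value are hypothetical:

```elixir
# Hypothetical lock name; an adapter would likely derive it from the repo.
# The second element identifies the lock requester.
lock_id = {:my_adapter_migration_lock, self()}

# :global.trans/2 acquires the lock on the local node and all connected
# nodes, runs the function, and releases the lock afterwards.
result =
  :global.trans(lock_id, fn ->
    # Migration steps would run here while the lock is held.
    :migrated
  end)

IO.inspect(result)
```

:global.trans/2 returns the function's result, or :aborted if the lock could not be acquired within the allowed retries (the default is to retry indefinitely).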

Schultzer · February 15, 2024, 2:17pm

Since the adapter is unaware of the user setup, :global is a good choice, as it works regardless of whether the node is in a cluster. The adapter is still very much in the exploration stage but is usable, with occasional bugs here and there that I’m finding while building a different product.

I think it might be worth proposing in https://groups.google.com/g/elixir-ecto to make Ecto.Migrator a behaviour and move it and Ecto.Migration into the main ecto package, with adapters implementing them, since we both could use the flexibility and control that would bring. It might also bring more clarity about the choices already made, which are unclear to both of us, and I’m sure other people will run into this in the future.

/edit
Keep in mind that some people don’t use :global since, AFAIK, regular Erlang distribution is a full mesh, and that can bring some pain when running a large cluster. However, I don’t believe that running one large cluster is a silver bullet, and there might be more to gain from divide and conquer: running multiple full-mesh clusters that can communicate.