Ecto.Migrator.run on Application start, bad idea?

I am thinking about doing something like this:

defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    children = [MyApp.Repo]
    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    with {:ok, pid} <- Supervisor.start_link(children, opts) do
      MyApp.Repo.run_migrations()
      {:ok, pid}
    end
  end
end

defmodule MyApp.Repo do
  use Ecto.Repo,
    otp_app: :my_app,
    adapter: Ecto.Adapters.Postgres

  def run_migrations, do:
    Ecto.Migrator.run(__MODULE__, migrations_path(), :up, all: true)

  # Migrations live in priv/repo/migrations of the :my_app application.
  defp migrations_path, do:
    Path.join([:code.priv_dir(:my_app), "repo", "migrations"])

end

Would that cause any problems?


If anything, I would run migrations before starting the application, because running them afterwards can cause a data race.

BTW, if you want to run something after the application starts, you can use :start_phases in application/0 (I have a blog post about this that I want to publish soon).
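
For reference, a minimal sketch of the :start_phases idea mentioned above. It assumes the MyApp.Repo.run_migrations/0 function from the first post; the phase name :migrate is made up for this example. The application controller calls start_phase/3 after start/2 has returned, and the application only counts as started once all phases have finished.

```elixir
# mix.exs
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [app: :my_app, version: "0.1.0", deps: []]
  end

  def application do
    [
      mod: {MyApp.Application, []},
      extra_applications: [:logger],
      # :migrate is a hypothetical phase name; the list holds its arguments.
      start_phases: [migrate: []]
    ]
  end
end

# lib/my_app/application.ex
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [MyApp.Repo]
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  # Invoked once per phase listed under :start_phases, after start/2.
  @impl true
  def start_phase(:migrate, _start_type, _phase_args) do
    MyApp.Repo.run_migrations()
    :ok
  end
end
```

This keeps start/2 free of side effects, but it does not remove the race by itself: anything in the supervision tree that queries the database during its own startup still runs before the phase.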

Data races would only occur when something uses the application, right?

The main reason I’m thinking of doing this, is to have dependencies run their own migrations automatically when the dependent application starts.

So the dependency would be listed in extra_applications and would start first (and run migrations) before the dependent application (the user) starts.

I figured that capturing the return value {:ok, pid} and running migrations before returning it would be ok.

I’m very interested in your upcoming post.

If the application is a dependency, I would not put any Repo or migrations in it directly. Instead I would provide a migration generator, like Oban does for example.

I didn’t find any multi database / multi git repo Elixir application, so this is kind of an experiment.

Probably microservices with Kubernetes or something like that would be “better”, though that’s not the kind of complexity I’m looking for.
I looked into umbrella applications but I don’t like that approach.
Also running multiple applications behind a reverse proxy would just add complexity and more boundaries.

So far the dependency strategy has worked very well with a GraphQL frontend as the dependent and resolvers calling the dependencies (backends).

The important part here is that a database is a singleton and you want to keep control over it in one place. This usually is the application with the MyApp.Repo. Any libraries for MyApp should not migrate anything on their own, but only supply an easy-to-use API that MyApp can integrate into its own migrations. This makes debugging and reasoning about the database state far saner than having to look into each and every dependency to find out what broke your application when something goes wrong.
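
A rough sketch of what such a library-provided migration API could look like. All names here (MyLib.Migrations, the my_lib_events table, AddMyLibTables) are hypothetical and only illustrate the pattern; this is not Oban’s actual code.

```elixir
# In the library: exposes up/0 and down/0, but never runs them itself.
defmodule MyLib.Migrations do
  use Ecto.Migration

  def up do
    create_if_not_exists table(:my_lib_events) do
      add :payload, :map, null: false
      timestamps()
    end
  end

  def down do
    drop_if_exists table(:my_lib_events)
  end
end

# In the host application, e.g. created with
# `mix ecto.gen.migration add_my_lib_tables`.
defmodule MyApp.Repo.Migrations.AddMyLibTables do
  use Ecto.Migration

  # The host decides when the library's tables are created or dropped.
  def up, do: MyLib.Migrations.up()
  def down, do: MyLib.Migrations.down()
end
```

With this layout the library’s schema changes show up in MyApp’s own migration history, so the database state can be read from one place.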

Also, I’m not really sure how this relates to umbrella projects or Kubernetes.


With many databases, the database is not a singleton; that’s the point.
So there will be many repos.
Each database knows which migrations have already been applied and whether any are still pending.
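
To illustrate the many-repos case: Ecto records applied versions in a schema_migrations table per database, so each repo can be migrated and inspected on its own. The sketch below assumes the repos are listed under :ecto_repos in the config of :my_app; the module name MyApp.Release and the status/0 helper are made up for this example.

```elixir
defmodule MyApp.Release do
  @app :my_app

  # Runs all pending migrations for every configured repo.
  def migrate do
    Application.load(@app)

    for repo <- repos() do
      {:ok, _versions, _started_apps} =
        Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
    end
  end

  # Returns {repo, [{:up | :down, version, name}]} for every configured repo,
  # where :down marks a migration that has not been applied yet.
  def status do
    Application.load(@app)

    for repo <- repos() do
      {:ok, migrations, _started_apps} =
        Ecto.Migrator.with_repo(repo, &Ecto.Migrator.migrations/1)

      {repo, migrations}
    end
  end

  defp repos, do: Application.fetch_env!(@app, :ecto_repos)
end
```

Ecto.Migrator.with_repo/2 starts the repo temporarily, so functions like these can run before the application itself is started, for example as a release command.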

I find this a really interesting question.

I think that in production, it might make sense to (automatically) run migrations once a deployment has finished.

In development, however, not so much. I think there are many situations in which I do not want to run a migration yet. For instance:

  • I am still working on the code but want to check something in IEx in the meantime. Or what about running tests?
  • I am switching between multiple branches in version control. They often will have different database configurations. In this situation I also want manual control over what is going on.
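
One way to keep that manual control in development while still migrating automatically in production is to make the behaviour configurable. The sketch below assumes a reasonably recent ecto_sql, where Ecto.Migrator can be placed in the supervision tree with a :skip option; the :auto_migrate config key is hypothetical.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      MyApp.Repo,
      # Runs pending migrations for the listed repos unless :skip is true.
      {Ecto.Migrator,
       repos: Application.fetch_env!(:my_app, :ecto_repos),
       skip: skip_migrations?()}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  # Hypothetical flag: set `config :my_app, auto_migrate: true` only in the
  # production config, so dev and test keep running `mix ecto.migrate` by hand.
  defp skip_migrations? do
    not Application.get_env(:my_app, :auto_migrate, false)
  end
end
```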

The example is very basic; in practice the application should also be able to roll back, and so on.
Only when an application is ready to use and well tested (separately) is it plugged into the frontend as a dependency.