Multi Tenancy with public & prefixes - how to handle migrations?

I’m planning to implement multi tenancy for my app using query prefixes. I won’t have many tenants (single digits, maybe low double digits), and I’ll be onboarding them gradually.

I want to keep some tables in the public schema, and the rest in per-tenant schemas. The docs do cover this specific use case, but don’t cover how to handle migrations.

When covering migrations elsewhere, it suggests running mix ecto.migrate --prefix "prefix_1", but that would create all tables inside the prefixed schema.

I have a couple ideas on how I could handle this, but I was hoping I could get a yay or nay from anyone who has done something similar?

Idea 1)
Add an if prefix() == "public" check to some of my migrations, and run mix ecto.migrate --prefix "prefix_1"? My concern is this approach is prone to human error.

Idea 2)
Create a seperate tenant_migrations folder. However, this involves adjusting my tasks and deployment.

For example, I’m using fly.io. It runs a release_command, that calls MyApp.Release.migrate() during deployment.

  def migrate do
    load_app()

    for repo <- repos() do
      {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
      {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, "PATH_TO_tenant_migrations", :up, all: true)) # something I'd add
    end
  end

This change assumes that tenant schemas can reference the public schema, but not the other way around, because migrations would run in order: public migrations first, then tenant migrations - rather than being interspersed.

And for local development, mix ecto.migrate won’t be sufficient as it would only run it for the ‘public’ schemas. I’ll need to create a new mix task to handle migrating (an augmentation of Mix.Tasks.Ecto.Migrate)


Idea 1 seems simpler and I suspect im over complicating it with Idea 2, but any opionions on these, or any better ideas, would be appreciated!

1 Like

You could draw some inspiration from Ash/AshPostgres, maybe.

Indeed it uses priv/repo/migrations and priv/repo/tenant_migrations and tasks ash_postgres.migrate and ash_postgres.migrate --tenants.

2 Likes

I have been developing a multi tenant application using Phoenix/Ecto for a last few years but I don’t have a good answer for you because I didn’t worry too much about keeping my schemas that cleanly separated. I just use the @schema_prefix notation in my Ecto schemas that are “public” and always pass a prefix when querying other schemas. Curious if there’s a particular reason you want to avoid just using the same structure in every schema (besides the pure principle of avoiding unused objects I guess)?

I did end up creating a custom migrate task that handled migrating each tenant by looping through the tenant records, which is a pattern I think I just lifted straight out of triplex in lieu of actually adding the whole library as a dep.

2 Likes

You might find some inspiration in this blog series! Underjord | Ecto & Multi-tenancy - Prefixes - Part 3

2 Likes

Oh great! Yes I will, thank you!

I have 2 large tables, probably about 90% of my database size, which consist of open goverment data that I don’t allow to be modified by the tenants. So id up having an identical set of these 2 tables per tenant incurring extra cost from the hosting provider. Also, every year I’m gonna have to add more data to these tables and it would just make the migration*tenant longer

Intresting, thank you! I’ll take a look!

Thank you for sharing!! I’ll also give this a read

It sounds like you’re assuming maintaining an identical structure across tenants means you’ll have to maintain identical data? But that’s not the case if you are willing to handle data migrations carefully. I would recommend handling data outside of migrations, but otherwise you can use the strategy in “idea 1” (wrapping data change in conditional check on value of prefix())

1 Like

A bit of a follow up to this for any future reader, I ended up doing this:

  1. In priv/repo, I created a new folder called tenant_migrations (as @serpent suggests)
  2. To add migrations to this folder, I use: mix ecto.migrate --prefix "apple" --migrations-path="priv/repo/tenant_migrations"
  3. To rollback migrations to this folder, I use: mix ecto.rollback --prefix "apple" --migrations-path="priv/repo/tenant_migrations"
  4. Then in lib/your_project/release.ex, I’ve made a small amendment to def migrate. This makes it migrate for each tenant. This is only run when deploying in production.
defmodule YourProject.Release do
  @moduledoc """
  Used for executing DB release tasks when run in production without Mix
  installed.
  """
  @app :your_project
  @tenant_prefixes ["apple", "facebook", "google"]

  def migrate do
    load_app()

    for repo <- repos() do
      {:ok, _, _} =
        Ecto.Migrator.with_repo(repo, fn repo ->
          Ecto.Migrator.run(repo, :up, all: true)

          for prefix <- @tenant_prefixes do  # this has been added
            Ecto.Migrator.run(
              repo,
              [Ecto.Migrator.migrations_path(repo, "tenant_migrations")],
              :up,
              all: true,
              prefix: prefix
            )
          end
        end)
    end
  end

  ...
end
  1. Throughout my ecto queries, I then use the prefix option. I store the prefix for the logged in user in the current_scope. For example: Repo.get(User, id, prefix: current_scope.tenant.db_prefix) (as @tfwright suggests).

The migrate/rollback commands could probably be improved so you don’t have to specify a tenant (like what ash does), by making your own mix task (based off ecto_sql/lib/mix/tasks/ecto.migrate.ex at v3.13.3 · elixir-ecto/ecto_sql · GitHub ) and having a dynamic way to get your list of tenants (as @tfwright suggests).

I ended up doing this as it was the path with the least possible changes and extra code I’d have to do. I think if you wanted an easier out-the-box setup, you should probably use ash from the get-go.


And apologies for not getting back to you, I poorly explained my scenario, in my case the data is identical.

Simplified: There is only 1 csv from the goverment, and I import all its rows into my DB (so WHERE’s can be performed on it). All of my tenants can only view this data. Since every tenant sees the same information, it makes more sense to store this table once in the public schema rather than creating an identical copy in each tenant’s schema.