How can I use dynamic repo without default repo?

nallwhy · June 30, 2022, 2:56am

I’m trying to apply dynamic repo to my SaaS project.
This project reads the customer’s database to retrieve data.

I followed Replicas and dynamic repositories — Ecto v3.8.4.

defmodule MyApp.Dynamic.Repo do
  use Ecto.Repo, ...

  def with_dynamic_repo(credentials, callback) do
    default_dynamic_repo = get_dynamic_repo()
    start_opts = [name: nil, pool_size: 1] ++ credentials
    {:ok, repo} = MyApp.Repo.start_link(start_opts)

    try do
      MyApp.Repo.put_dynamic_repo(repo)
      callback.()
    after
      MyApp.Repo.put_dynamic_repo(default_dynamic_repo)
      Supervisor.stop(repo)
    end
  end
end

But it throws errors without starting the repo on the application.

defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # MyApp.Dynamic.Repo
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end

But I don’t have a default database for that repo, so I can’t start the repo on application.
Should I run a database to use dynamic repo?

mindok · June 30, 2022, 7:03am

What errors does it throw?

nallwhy · June 30, 2022, 7:47am

It throws repo not started error.

** (RuntimeError) could not lookup Ecto repo MyApp.Dynamic.Repo because it was not started or it does not exist

nallwhy · July 3, 2022, 2:55am

The reason why the application doesn’t have a default repo is that it will supports many kinds of RDBMS like MySQL, MSSQL.
I think it’s not a good solution that makes dummy databases for each RDBMS.

sbuttgereit · July 3, 2022, 3:53am

Hmmm… at first I though I understood what you were asking, but looking deeper I’m less convinced I understand the issue.

I’m chiming in because I’m using dynamic repos for what may be similar purposes. I also have no default and the setup is working well for me. I’ll start from the beginning with my use case and maybe you can see how it aligns with yours. I’m also working on a multi-tenant application, but only targeting PostgreSQL. The nature of my application and its use cases/usage patterns allow me to use a database per tenant instance of the application. In addition I have a management database which controls things, but I don’t want that (or any one tenant) to be considered the “default” database or repo within the application. If a dynamic repo has not been set for the process via put_dynamic_repo, I want any attempted uses to fail. There simply is no default.

Now having said that, there are bits of the Repo I need running. So for example I do have an Ecto configuration setup in config.exs, but it’s very simple:

import Config

config :msbms_syst_datastore,
  ecto_repos: [MsbmsSystDatastore.Runtime.Datastore]

(assume that the MsbmsSystDatastore.Runtime.Datastore module above is just the repo module, with use Ecto.Repo because that’s what it is with just a different name).

The top of MsbmsSystDatastore.Runtime.Datastore looks like:

defmodule MsbmsSystDatastore.Runtime.Datastore do
  @moduledoc false

  use Ecto.Repo,
    otp_app: :msbms_syst_datastore,
    adapter: Ecto.Adapters.Postgres

[CLIP]

end

And that’s all that’s there related to Ecto.Repo. I do identify the adapter for the repo.

I believe, but am not sure, that having that configuration is sufficient for Ecto to start up things like it’s process registry and it’s own necessary components. Note that there is no database configuration beyond identifying the adaptor to use. Just the enumeration of the repo module existing. In fact, that’s the whole of my config.exs file.

This particular application has no “MyApp.Application” module nor does it start anything up special.

Are you configuring anything at compile time?

cjbottaro · July 3, 2022, 9:49pm

Full disclosure, I’m drunk and didn’t read this entire thread, but…

Should I run a database to use dynamic repo?

Yes. We have a multi-tenant app. We have one Postgres database that has a clients table which basically has a mapping of {client_name: database_instance}. This database is a “normal” Ecto.Repo.

Each “shard” (aka client specific database) is also a normal Ecto.Repo, and all of them have to be started up on application boot.

defmodule Datastores.Shard.Supervisor do
  @moduledoc false

  def child_spec(config) do
    %{
      id: __MODULE__,
      start: {__MODULE__, :start_link, [config]},
      type: :supervisor
    }
  end

  def start_link(config \\ []) do
    Datastores.Shard.dynamic_repos()
    |> Enum.map(fn name ->
      config = Keyword.merge(Datastores.Shard.config(name), config)
      %{
        id: {Datastores.Shard, name},
        start: {Datastores.Shard, :start_link, [config]}
      }
    end)
    |> Supervisor.start_link(strategy: :one_for_one, name: __MODULE__)
  end

  def stop do
    Supervisor.stop(__MODULE__)
  end

end

So once all the “shards” are started, then you can use get_dynamic_repo to reference them.

Happy to help more once I’m more sober. I really really like put_dynamic_repo, get_dynamic_repo, and c:default_options. Between those functions/callbacks, it’s pretty easy to make a multi-tenanted Ecto application… until you get to migrations…

chasers · October 23, 2023, 8:53pm

Would love to hear how you’re handling migrations.

This is how we’re doing it:

github.com

supabase/realtime/blob/main/lib/realtime/tenants/migrations.ex

defmodule Realtime.Tenants.Migrations do
  @moduledoc """
  Run Realtime database migrations for tenant's database.
  """

  use GenServer, restart: :transient

  require Logger
  alias Realtime.Repo

  alias Realtime.Tenants.Migrations.{
    CreateRealtimeSubscriptionTable,
    CreateRealtimeCheckFiltersTrigger,
    CreateRealtimeQuoteWal2jsonFunction,
    CreateRealtimeCheckEqualityOpFunction,
    CreateRealtimeBuildPreparedStatementSqlFunction,
    CreateRealtimeCastFunction,
    CreateRealtimeIsVisibleThroughFiltersFunction,
    CreateRealtimeApplyRlsFunction,
    GrantRealtimeUsageToAuthenticatedRole,

This file has been truncated. show original

cjbottaro · October 26, 2023, 9:43pm

I just wrote a custom Mix task. It’s a pretty bespoke solution; lots of code specific to our app.

defmodule Mix.Tasks.Shard.Migrate do
  use Mix.Task
  require Ecto.Query
  import Mix.Shard

  @shortdoc "Runs the client schemas migrations"
  @moduledoc """
  Runs the pending migrations one or more client schemas.

  ## Examples

      mix shard.migrate
      mix shard.migrate -c 50

      mix shard.migrate -n 3
      mix shard.migrate --step 3

      mix shard.migrate --to 20080906120000

      mix shard.migrate -o dev -o alpha

  ## Command line options

    * `--all` - run all pending migrations

    * `--step`, `-n` - Run n number of pending migrations

    * `--to` - Run all migrations up to and including version

    * `--quiet` - Do not log migration commands

    * `--log-sql` - Log the raw sql migrations are running

    * `--concurrency`, `-c` - Set concurrency. Defaults to 20.

    * `--only`, `-o` - Run for specified clients only.

  """

  @switches [
    all: :boolean,
    step: :integer,
    to: :integer,
    quiet: :boolean,
    concurrency: :integer,
    log_sql: :boolean,
    only: [:string, :keep],
  ]

  @aliases [
    n: :step,
    c: :concurrency,
    o: :only,
  ]

  @impl true
  def run(args \\ []) do
    Mix.Task.run("app.config")
    Logger.configure(level: :info)
    {opts, _} = OptionParser.parse! args, strict: @switches, aliases: @aliases

    opts = if opts[:to] || opts[:step] || opts[:all],
      do: opts,
      else: Keyword.put(opts, :all, true)

    run(:up, opts)
  end

  def run(dir, opts) do
    concurrency = Keyword.get(opts, :concurrency, 20)

    # Each migration needs two connections?
    start_repos(pool_size: concurrency * 2)

    # Load synchronously so we don't get code compilation issues.
    migrations = load_migrations()

    clients_query(opts, :only)
    |> Datastores.Repo.all()
    |> Task.async_stream(&run(&1, migrations, dir, opts), max_concurrency: concurrency)
    |> Enum.each(fn {:ok, _} -> nil end)

    :ok
  end

  def run(client, migrations, dir, opts) do
    Datastores.set(client.name)

    dynamic_repo = String.to_atom(client.csn)
    prefix = Datastores.Shard.client_to_schema(client)
    opts = Keyword.merge(opts,
      dynamic_repo: dynamic_repo,
      prefix: prefix,
      migration_lock: false # Migration lock doesn't honor prefix... :(
    )

    Ecto.Migrator.run(Datastores.Shard, migrations, dir, opts)
  end

  def load_migrations() do
    migrations_path = Datastores.Shard.config(:db01)
    |> Keyword.fetch!(:priv)

    Path.wildcard("#{migrations_path}/**/*.exs")
    |> Enum.map(fn path ->
      [{module, _code}] = Code.compile_file(path)
      [version, _trash] = path |> Path.basename() |> String.split("_", parts: 2)
      version = String.to_integer(version)
      {version, module}
    end)
    |> Enum.sort()
  end

end

Basically…

Start up all the dynamic repos.
Query our controller database (regular Ecto repo) for list of tenants.
Load all migration scripts from priv.
Run all of them in Task.async_stream.

Datastores.set is our version of Tenant.set and it does the whole put_dynamic_repo thing. Each tenant is a Postgres schema, hence we also need to set prefix

Note the # Migration lock doesn't honor prefix... :( comment. I think that’s a bug in Ecto and why I say things almost work perfectly. This code is pretty old though, I wonder if it’s been fixed.

Edit: I didn’t realize this was a year old. Discourse’s notifications are confusing to me…

chasers · November 9, 2023, 10:02pm

Ooooooh interesting, I think we might be encountering this too occasionally.