Can ecto repos be started dynamically?

idi527 · September 8, 2018, 9:26pm

Like under a dynamic supervisor?

I need this to manage several separate but ultimately very similar (same schema, etc) databases.

josevalim · September 8, 2018, 9:29pm

This is possible in Ecto master. You will still have to assemble a dynamic repo yourself using private APIs (so it is very unstable and may break at any moment) but the foundation is there for those willing to give it a try. The whole discussion and a quick sample are here: https://github.com/elixir-ecto/ecto/issues/1964. With enough feedback, we will gladly wrap it up and expose it publicly.

idi527 · September 8, 2018, 9:30pm

Thank you, I will give it a try.

idi527 · September 9, 2018, 7:20am

I’m a bit stuck with migrations. The migrator module seems to only work with a module repo, how do I pass a pid (from Repo.start_link) to run/3?

idi527 · September 9, 2018, 7:25am

I probably need to define DynamicRepo.config/0 …

Something like the following?

defmodule DynamicRepo do
  # ...

  def config do
    # definitely a hack
    database = :erlang.get(:current_migration_database) # set in the module running Ecto.Migrator.run/3?
    [{:database, database} | Repo.config()]
  end
end

Getting

** (RuntimeError) could not lookup DynamicRepo because it was not started or it does not exist
    (ecto) lib/ecto/repo/registry.ex:18: Ecto.Repo.Registry.lookup/1
    (ecto) lib/ecto/adapter.ex:91: Ecto.Adapter.lookup_meta/1
    (ecto) lib/ecto/migration/schema_migration.ex:19: Ecto.Migration.SchemaMigration.ensure_schema_migrations_table!/2
    (ecto) lib/ecto/migrator.ex:453: Ecto.Migrator.verbose_schema_migration/3
    (ecto) lib/ecto/migrator.ex:237: Ecto.Migrator.run/4

Should I somehow pass the name I’ve started my “dynamic” repo with – Repo.start_link(name: name) – to Ecto.Repo.Registry.lookup/1?

~~Hm, I guess it wouldn’t matter … Ecto.Repo.Registry is empty anyway~~

Actually, it’s not

iex(10)> {:ok, pid} = Repo.start_link(name: :db2, database: "asdf2.db")
{:ok, #PID<0.254.0>}
iex(11)> :ets.i Elixir.Ecto.Repo.Registry
<1   > {<0.254.0>,#Ref<0.881390119.980680706.3317>,
 {'Elixir.Sqlite.Ecto2  ...

Now stuck with

** (UndefinedFunctionError) function DynamicRepo.all/1 is undefined or private
    (dynamic_repo) DynamicRepo.all(#Ecto.Query<from s in "schema_migrations", lock: "FOR UPDATE", select: type(s.version, :integer)>)
    (ecto) lib/ecto/migrator.ex:285: anonymous fn/3 in Ecto.Migrator.lock_for_migrations/3
    (ecto) lib/ecto/adapters/sql.ex:645: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
    (db_connection) lib/db_connection.ex:1388: DBConnection.run_transaction/4
    (ecto) lib/ecto/adapters/sql.ex:687: Ecto.Adapters.SQL.lock_for_migrations/4
    (ecto) lib/ecto/migrator.ex:285: Ecto.Migrator.lock_for_migrations/3

I can try passing the pid of the process from the process dictionary again so that DynamicRepo.all/1 calls DynamicRepo.all(pid, query), but in the end I might get to a point where the process executing the migration is different from the one where my :current_migration_database is stored …

I guess I’ll use the database adapter directly for running migrations.

idi527 · September 9, 2018, 12:46pm

Is it possible to start a repo with its name being a via tuple?

Repo.start_link(name: {:via, Registry, {:db_registry, "asdf.db"}}, database: "asdf.db")

The above seems to fail.

** (EXIT from #PID<0.238.0>) shell process exited with reason: an exception was raised:
    ** (ArgumentError) argument error
        (stdlib) :ets.new({:via, Registry, {:db_registry, "asdf.db"}}, [:set, :public, {:read_concurrency, true}])

idi527 · September 9, 2018, 1:20pm

Also, the changelog says that ecto v3 would start locking tables:

Running migrations will now lock the migrations table, allowing you to concurrently run migrations in a cluster without worrying that two servers will race each other or without running migrations twice.

which I think is not possible with sqlite. Is there a way to disable that feature?

In the migrations I ran ecto did try locking some tables which resulted in the following error:

Generated dynamic_repo app
** (ArgumentError) locks are not supported by SQLite
    lib/sqlite_ecto/connection.ex:80: Sqlite.Ecto2.Connection.all/1
    lib/sqlite_ecto.ex:37: Sqlite.Ecto2.prepare/2
    (ecto) lib/ecto/query/planner.ex:196: Ecto.Query.Planner.query_without_cache/4
    (ecto) lib/ecto/query/planner.ex:165: Ecto.Query.Planner.query_prepare/6
    (ecto) lib/ecto/query/planner.ex:138: Ecto.Query.Planner.query_with_cache/7
    (ecto) lib/ecto/repo/queryable.ex:132: Ecto.Repo.Queryable.execute/4
    (ecto) lib/ecto/repo/queryable.ex:18: Ecto.Repo.Queryable.all/3
    (ecto) lib/ecto/migrator.ex:285: anonymous fn/3 in Ecto.Migrator.lock_for_migrations/3

LostKobrakai · September 9, 2018, 2:57pm

Ecto starts an ets table per repo (holding the config afaik) and therefore the name needs to be an atom not any term.

idi527 · September 9, 2018, 3:25pm

It would be great if it was possible to pass something like :unnamed option then and just pass the ownership of the ets table to the provided pid if that’s necessary. The ideal api as I see it would accept any term for all the repo functions like insert(repo \\ __MODULE__, struct, opts) and the like and use via tuples underneath.

Like

repo_id = 123
database = Application.app_dir(:dynamic_repo, "priv/dbs/#{repo_id}.sqlite3")
{:ok, that_db_pid} = Repo.start_link(name: {:via, Registry, {DynamicRepos, repo_id}}, database: database)

defmodule DynamicRepo dp
  def insert(repo_id, struct, opts) do
    case Registry.lookup(DynamicRepos, repo_id) do
      [{that_db_pid, database: _database}] -> # database is the path to the database, might be handy sometimes
         Ecto.Repo.Schema.insert(that_db_pid, struct, opts)
      [] ->
        {:error, :repo_not_found} # in my case I just start it
    end
  end
end

But I’m currently doing it manually.

It could also provide some basic sharding capabilities

user_id = 18273645
user_id
|> :erlang.phash2(repo_count - 1) # or lookup in a hash ring
|> DynamicRepo.insert(user_data)

josevalim · September 9, 2018, 5:12pm

Is it possible to start a repo with its name being a via tuple?

The name has to be an atom so far. But that’s OK if you are bound to a certain of names.

Also, the changelog says that ecto v3 would start locking tables:

The SQL lite adapter needs to be updated to implement a subset of the migration code. You may also be also to set :migration_lock or similar in your config repo.

But I don’t think DynamicRepo quite supports migrations. At least it was not one of the scenarios we considered so far, so there may be rough edges.

idi527 · September 9, 2018, 6:59pm

I’ve made a small demo for one of my use cases of DynamicRepo: https://github.com/servers-and-stuff/dynamic_repo_sqlite.

I’ll try to work a bit on the sqlite adapter and maybe the migrator to see how far I can get with this approach.

idi527 · October 21, 2018, 6:04am

As per @josevalim’s suggestion.

So I wanted to be able to dynamically open databases, for example, on first insert or read, and then close them after either some period of inactivity or manually without affecting other databases. For that I needed a repo per such a database. I’ve documented my first attempt above.

Since then I’ve moved on to working with esqlite directly.

what you are doing right now, the pros and cons

I suppose this is about my current approach (esqlite).

Pros: all the logic about opening / migrating / interacting / and then closing the databses is clear.

Cons: I have to write SQL manually.

As for why I’m doing it, I’m deploying to a variable environment where machines come and go so I needed a databases system that would “travel” together with the app, so I picked sqlite thus making the elixir app into a databases. But in order to avoid write conflicts, each resource gets it’s own database.

In the end I want to achieve something not quite unlike http://actordb.com but without raft or any other form of coordination.