I’m writing an app that manages sensitive data. It might be nice to segregate each customer’s data into a separate SQLite database.
I’ve done different flavors of multi-tenancy in Postgres, and have configured apps that use a pre-defined group of databases. But I’ve never seen anyone attach, create & destroy SQLite instances dynamically.
I’m pretty sure everything you need can be achieved in Elixir/Phoenix, though I must say I have no experience with it.
Having said that, if instead of calling Repo directly in your project you do something akin to get_repo(user, team), which lazily creates that database if it doesn’t exist (and checks permissions to access it), that should be sandboxed enough.
If you think about it, it’s much more secure than other approaches, where a web framework defaults to a single db connection. Here, every single time, you need to explicitly tell Elixir which database to use by calling a configured Repo (and you can configure more than one).
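To make that concrete, here is a rough sketch of what such a get_repo/2 could look like using Ecto’s dynamic repos with the ecto_sqlite3 adapter. MyApp.Repo, TenantDb, authorize!/2 and the path scheme are hypothetical, and reuse/supervision of already-started repos is omitted:

defmodule TenantDb do
  # Lazily start (and implicitly create) the tenant's SQLite database,
  # then route this process's Repo calls to it.
  def get_repo(user, team) do
    :ok = authorize!(user, team)
    path = Path.join("data", "team_#{team.id}.db")

    {:ok, pid} = MyApp.Repo.start_link(name: nil, database: path, pool_size: 1)
    MyApp.Repo.put_dynamic_repo(pid)
    MyApp.Repo
  end

  # Permission check stub; a real app would verify the user belongs to the team.
  defp authorize!(_user, _team), do: :ok
end

In practice you would register the started repo (in a Registry or under a DynamicSupervisor) and reuse it instead of starting a fresh instance on every call.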
At scale this becomes less appealing. There is a cost to opening/closing each db, and at scale you won’t be able to keep all of them open all the time. With any kind of concurrent usage of a single db, your app will need to coordinate who the “writer” is. If your app moves to multiple nodes, the work will need to be routed to the node that hosts the SQLite db. And if your app is transaction heavy, you will end up doing a lot more I/O; syncing a single Postgres WAL is significantly more efficient than syncing an array of SQLite dbs.
@dimitarvp thanks again for the reference to dynamic repos. Using this info, I wrote a prototype library called DynDb to manage dynamic SQLite databases. Interested to get feedback re: API design and potential gotchas. Have a look!
#!/usr/bin/env elixir

Mix.install([
  {:dyn_db, github: "andyl/dyn_db"}
])

# Write a Migration
defmodule Migration0 do
  use Ecto.Migration

  def change do
    create table("chats") do
      add(:message, :string)
      timestamps(type: :utc_datetime_usec)
    end
  end
end

# Write a Schema
defmodule Chat do
  use Ecto.Schema

  schema "chats" do
    field(:message, :string)
    timestamps(type: :utc_datetime_usec)
  end
end

# Interact with the DynDb
defmodule Main do
  def main do
    # The database file will be auto-created
    {:ok, db1} = DynDb.start_link(database: "./data1.db")

    # Migrate
    DynDb.migrate(db1, Migration0)

    # Query
    DynDb.insert!(db1, %Chat{message: "HELLO at #{Time.utc_now()} UTC"})
    DynDb.all(db1, Chat) |> IO.inspect()
  end
end

Main.main()
Could you say a little more about the overall design and thought process in terms of scalability and failover?
And my understanding is that SQLite can only have one connection, which might be fine if you guard access using a single GenServer.
When using multiple SQLite databases, which are usually on the same file system as the application accessing them, how do you then move to supporting a multi-node configuration?
Does every node mount a shared drive or file system to access the SQLite files? Is there a primary node that opens connections to the databases, with every other node connecting to it? Or is there a global registry that is synced across all connected nodes, where each node has access to the same mounted file system and can check out and connect to databases, but the registry only allows one node to connect to a given database at a time?
This is an old constraint. As long as you open your database with journal_mode: :wal you should be fine with multiple processes writing to the same database. Where you WILL run into potential issues is if you are writing to a database file that is on an NFS mount; then your file system locks won’t be perfect. I haven’t had any issues yet, but I suspect that if I have network issues, that NFS mount is going to suffer.
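For reference, with the ecto_sqlite3 adapter the journal mode can be set in the repo config; a minimal sketch, where the app name, repo module and database path are placeholders:

# config/config.exs
config :my_app, MyApp.Repo,
  database: "data/tenant_123.db",
  journal_mode: :wal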
With how ecto_sqlite3 and exqlite are currently built, they use DBConnection to pool the SQLite file handles. We have had discussions about moving to a single GenServer for writes, with reads going to other GenServers. But right now each node has its own pool of handles open for the database.
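As an illustration of that “single writer” idea (not how ecto_sqlite3 is actually structured today), here is a minimal sketch using the low-level Exqlite.Sqlite3 API; SingleWriter is a hypothetical module and parameter binding is omitted:

defmodule SingleWriter do
  use GenServer

  # One writer process per database file; every write is serialized through it.
  def start_link(path), do: GenServer.start_link(__MODULE__, path)

  def write(pid, sql), do: GenServer.call(pid, {:write, sql})

  @impl true
  def init(path), do: Exqlite.Sqlite3.open(path)

  @impl true
  def handle_call({:write, sql}, _from, conn) do
    # execute/2 runs a plain SQL string against the open handle.
    result = Exqlite.Sqlite3.execute(conn, sql)
    {:reply, result, conn}
  end
end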
I have a personal project I’ve been toying with where I have thousands of SQLite databases. The data is partitioned that way so that I can do a really dumb map reduce. I have it set up to use dynamic repos, which lets the connection pool go down to 1 and execute on that. All of the databases are local to the instance and replicated out to a file server for backups. I avoided opening the databases over NFS for fear of network gremlins.
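The shape of that map reduce, as a sketch under the same dynamic-repo assumptions as above (MyApp.Repo, the glob pattern and the Chat schema are placeholders):

"data/shards/*.db"
|> Path.wildcard()
|> Enum.map(fn path ->
  # One handle per shard, used briefly and then shut down.
  {:ok, pid} = MyApp.Repo.start_link(name: nil, database: path, pool_size: 1)
  MyApp.Repo.put_dynamic_repo(pid)
  count = MyApp.Repo.aggregate(Chat, :count)
  Supervisor.stop(pid)
  count
end)
|> Enum.sum()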
There are a lot of ways to slice this, and just make sure you have a plan in place to deal with the faults that will happen.
Multiple handles all pointing to the same SQLite database file.
Like @warmwaffles says, you can also open the DB in wal or wal2 mode in combination with any of options 2 and 3, and then your SQLite is effectively like a local Postgres. It would still struggle with a large number of writes per second (I reckon 2000 or more), but most apps needing SQLite don’t do that so…