Multitenancy shared vs seperate schema

travisf · June 19, 2024, 12:46pm

I’m likely going to be converting a rather large sales application to a multi-tenant architecture in the next few weeks and I’m curious about everyone’s thoughts between shared vs separate schemas?
A few qualifiers:

The app is fairly low traffic (it’s a B2B sales portal), I don’t think it would ever have more than a few hundred users at one time.
Right away there will be two tenants however the owner is hoping to scale up quickly, it’s unlikely we’d ever have more than 100 tenants though but I’d expect double digits to be coming within a year or two.

A few years ago I set up a multi-location function which used shared schemas and every record was associated with a location ID, I like that approach but it was a bit cumbersome to set it up originally. On the flip side Triplex seems like an absolute joy to use and I’ve already run some migrations on a test branch and for the most part it’s been a breeze.
Any thoughts on the pros/cons of either approach?

timothy · June 19, 2024, 1:31pm

I never used Triplex, so I can’t say anything on that, but Ecto has a few guides on handling Multi tenancy:

The company I work at uses FK on all schemas similar to the second guide of Ecto (shared schema). However we never implemented the prepare_query/3 to add the tenant id to all queries of a repo, since we didn’t like the “magic” of it. Instead we just pass the tenant struct to each function that controls anything related to a tenant (e.g. list_users(%Tenant{id: tenant_id}), do: %User{} |> where([u], u.tenant_id == ^tenant_id) |> Repo.all()). It is a bit cumbersome but does provide flexibility and is a bit more explicit on what’s happening. It just comes down to personal taste I guess.

fmn · June 19, 2024, 1:31pm

hi there, not sure if:

you went through it already.
you found/will find it relevant.

but here’s, kind of related - and interesting IMO - exchange: Dynamic database creation for SaaS product platform : Need help

mlg-hub · June 19, 2024, 2:05pm

Hello … thank you … we did a small prototype based on the informations shared from that conversation

travisf · June 19, 2024, 4:39pm

I didn’t realize Ecto had guides for this! That’s awesome, I read through the Triplex code this morning and frankly I don’t think there is a lot going on there that you can’t just do with Ecto.
Initially, I leaned heavily on the query prefixes approach because this is a large codebase and scoping and adding an organization ID to every record is going to be a headache. However, that “magic” you mention with prepare_query may be a good way to find/prevent unscoped queries.

travisf · June 20, 2024, 3:43pm

I’ve been looking into this a bit more I think the multi-schema approach may work, however the major blocker I’m facing is how to move existing data into the new schemas? At the moment the app already has two customers that I need to somehow move into their respective schemas.
I’ve been looking at just doing this with SQL (INSERT into prefix1.customers select * from public.customer where: customer.location = "1234") or using Ecto.put_meta/2 to update the prefix and then insert it deleting the original record afterwards.
However; with both approaches you quickly run into constraint errors which wouldn’t be a problem for a few associations but this is a big legacy app it’s kind of turtles all the way down. Do I just need to bite the bullet and write all these as some kind of complex migration script or is there a better way?