Triplex - a complete solution to multi-tenancy with PG schemas

Hey everyone!

Well, we made this lib a while ago and now we've decided to finally go public with it! It's a tool for creating and managing multi-tenant applications using Postgres schemas.

We’ve recently released version 1.1.2 with support for ecto 2.2.0-rc. :slight_smile:

I know there are some other libraries for the same purpose, like https://github.com/Dania02525/apartmentex and https://github.com/promptworks/tenantex. At the time we wrote this lib, the other libraries were a little bit out of date with their pull requests, didn't have everything we wanted and, worst of all for us, were a little bit intrusive (I'll talk about that later).

So we decided to write a simple one, and now here we are, another option for you. Here is a list of features we missed in the other libraries:

  • Mix tasks for migrating and rolling back all tenant schemas
  • Plugs to load the tenant from a param, session value or subdomain, and to ensure a tenant is loaded
  • A way to infer the name of the tenant from a given struct that represents the tenant in your application
  • Smarter configuration (see the sketch after this list), like:
    • Setting the default Repo on which to execute the tenant management tasks
    • Reserving tenant names, like api or www, if you're using subdomains to load your tenant
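
To make the configuration part concrete, this is the kind of thing we mean (a minimal sketch, assuming a repo called MyApp.Repo; check the docs for the exact keys and options):

# config/config.exs
config :triplex,
  repo: MyApp.Repo,                  # default Repo for the tenant management tasks
  reserved_tenants: ["www", "api"]   # names that can never be used as tenants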

There are also some differences in the way the libs handle queries and commands with the prefix. While tenantex and apartmentex are a little bit intrusive to your Repo or the way you use it, with triplex we tried to stay as unintrusive as we could.

For example, here is how to make a query applying a prefix from a tenant struct called org in raw ecto:

Repo.all(User, prefix: org.subdomain)

Now the same query using a properly configured triplex:

Repo.all(User, prefix: Triplex.to_prefix(org))

The same with tenantex:

# first you need to change your repo to use their repo not ecto's
defmodule Repo do
  use Tenantex.Repo, otp_app: :your_app
end

Repo.all(User, prefix: org)

Finally, with apartmentex:

Apartmentex.all(Repo, User, org)

The reason behind this decision is simple: being less intrusive makes triplex a lot easier to keep up to date with future versions of ecto.

And one last thing: we put some extra effort into our docs! That's a bonus for anyone who wants to use it! :wink:

Feel free to try it and send us feedback!

23 Likes

I just did a mix phx.new triplex_test using Phoenix 1.3 and got this error.

$ mix deps.get
Running dependency resolution...

Failed to use "plug" (version 1.4.3) because
phoenix (version 1.3.0) requires ~> 1.3.3 or ~> 1.4
phoenix_ecto (version 3.2.3) requires ~> 1.0
phoenix_html (version 2.10.3) requires ~> 1.0
triplex (version 1.1.2) requires ~> 1.3.5
mix.lock specifies 1.4.3

Can the required plug version be relaxed?

2 Likes

Yes! For sure, sorry about that! Version 1.1.3 released with a relaxed plug dependency.
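
For anyone curious, the problem was that a requirement like ~> 1.3.5 only allows plug >= 1.3.5 and < 1.4.0, while Phoenix 1.3 pulls in plug 1.4.x. The new requirement is broader, something along these lines:

# mix.exs (triplex), roughly:
{:plug, "~> 1.3.5"}   # old: >= 1.3.5 and < 1.4.0, so plug 1.4.3 was rejected
{:plug, "~> 1.3"}     # new: >= 1.3.0 and < 2.0.0, so plug 1.4.3 resolves fine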

And thanks for the report!

Sweet! I’ll give it a try… thanks! :slight_smile:

1 Like

This looks great!

If you have the time, would you be able to write a quick article with an example using it?

Thanks

1 Like

Sure! I will take some time this week for it. :slight_smile:

2 Likes

I just saw the package featured on Elixir Weekly and it seems running queries on a single tenant is super slick. How about adding some helpers to allow for cross-tenant queries? I haven't got much experience with Postgres, but the following SQL seems to allow for that:

SELECT id, 'tenant_01' AS tenant FROM tenant_01.some_table
UNION ALL
SELECT id, 'tenant_02' AS tenant FROM tenant_02.some_table

Especially if the tenant tables are simple clones of each other there shouldn't be any mismatch between the selects.
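
In the meantime, I guess you could drop down to raw SQL through the adapter for this kind of query. Something like this untested sketch (table name made up):

sql = """
SELECT id, 'tenant_01' AS tenant FROM tenant_01.some_table
UNION ALL
SELECT id, 'tenant_02' AS tenant FROM tenant_02.some_table
"""

# Returns a %Postgrex.Result{} with :columns and :rows
{:ok, result} = Ecto.Adapters.SQL.query(Repo, sql, [])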

Edit: Seems this would first need some work in ecto: https://github.com/elixir-ecto/ecto/issues/1549

2 Likes

Hi @kelvinst, did you think about other approaches to multi-tenancy? I totally bought into it until I read a critical blog post by the creators of a well-known Ruby gem that uses schema-based multi-tenancy, and the possible solution it suggests. It would be great to know your opinion, given the effort you put into developing Triplex.

3 Likes

Hi, @marciol!

Well, we’ve bought the idea on a previous project, but didn’t come to the point of having the article’s mentioned issues. But well, I guess that there’s no perfect solution. Everything will have its pros and cons and the post you mention is focused on the cons.

While all of the cons are true (I guess), there are also some pros, and one of them (which made me decide for this technique) is the productivity and code organization. I guess that the pros I had on the start of the project totally compensate some issues running migrations or having to pay some extra money when I have more customers (I guess with more customers I’d have more money too).

The other cons he list on the article are easy to workaround, and he mentions how to do it on the end of the article.

Coming to a conclusion: “there’s no silver bullet” and you must know that before using any tool. The tools are made to solve one or two problems, not all of them. I guess that, once you start having problem with the tool, it can be an indicator you’re using it wrongly or it’s the wrong tool for the job. Using PG schemas to separate your tenants is good in some cases, but for a lot of data in lot of tables and with a lot of changes on the db, there are better alternatives for sure.

5 Likes

Yep! It’s actually the only place that needs work, since that to execute queries inside the tenants we don’t use anything from Triplex, and yes a common option on Ecto queries and commands.

I’m pretty sold on the separate schema multi-tenancy solution. The per table tenant_id column approach (which I’m using in a python app) works well until one of your tenants does something requiring you to restore their data from a backup. I’ll leave that potential nightmare to your imagination to sort out. With separate schemas this problem doesn’t exist. Each schema can be independently restored and/or moved to a different PostgreSQL instance if needed.

Instead of UUIDs I decided to use a "Snowflake-style system" that uses a PostgreSQL function to generate a bigint which is unique and increments based on time, so sorting isn't an issue. Just use a different shard_id per tenant (or even per PostgreSQL instance) to be sure of uniqueness in case you need to relocate the schema later. The really cool thing is that, given any ID, you can extract the shard_id and determine which tenant it belongs to (if tenant_id == shard_id). This can potentially save some database queries in some cases.

If you’re building a JSON API and using JavaScript on your front-end (SPA etc.) you need to convert the bigint ID’s to strings before encoding to JSON as JavaScript can’t properly handle the bigint ID’s. It will round them causing you fits until you figure that bit out. You can use an Ecto custom type to do the conversion automatically allowing you to pass ID’s as strings or integers in your queries but always get the ID’s as strings in the query result so you can just pass it directly to Poison etc.

These are the "pros" that sold me. Admittedly I haven't had any experience with this solution at the scale of the blog author, but I would think that with a little planning you could utilize several PostgreSQL instances to solve most of those issues, since you can at least move the separate schemas between instances. I've also looked at table inheritance as a potential solution to some problems. Table inheritance is pretty cool on its own once you get to looking at it. It just depends on how much complexity you want to introduce in the long run.

@kelvinst - Thank you for Triplex! It’s simple, stays out of your way and gets the job done. :grinning:

4 Likes

@kelvinst does this library support using multiple MySQL databases instead of PG schemas?

1 Like

@emoragaf No, not yet. I haven't implemented it since we didn't use MySQL.

Maybe in the future we will have it, but it’s not in our plans to implement it now.

If you (or anyone else) know how to do it, and have time to contribute, please feel free to open a PR and it will be accepted for sure!

I'm curious to see an example of this approach. One part that's not clear is how to create the function inside tenant_migrations. In order to create the id generator function, I'd need to be able to do something like:

execute """
CREATE FUNCTION #{prefix}.id_generator ......
....
tenant_id := #{TenantHelper.tenant_handler(prefix).id};
....
"""

But I can't see how to find the prefix a migration is running with. Any ideas?

There is the Ecto.Migration.prefix/0 function, which is available in all your migrations and returns the prefix the current migration is running with.

Just curious, what is this? :grimacing:

Yes. As @kelvinst points out you can use the prefix() function in your migrations. Just change #{prefix} in your code to #{prefix()} and you’re set.

To set the shard_id I've been using a private function in the migration file that splits the prefix to grab the id, given a prefix something like "Tenant_1":

defp shard_id() do
    [_, id] = prefix() |> String.split("_")
    id
end

BTW - Here is the function to extract the shard_id from a generated id:

...

# ^^^/>>>/<<< come from Elixir's Bitwise module (use Bitwise)
def extract(item_id) when is_integer(item_id) do
  # strip the timestamp bits, then shift off the 10 sequence bits, leaving the shard_id
  (item_id ^^^ ((item_id >>> 23) <<< 23)) >>> 10
end

That function was ripped from Ecto.InstaShard. You may want to have a look around that repo as well; there are some really interesting things to look at there.

Here’s the entire migration I put together for the id generator for anyone that might be interested:

defmodule YourModule.Repo.Migrations.CreateSnowflake do
  use Ecto.Migration

  def up do
    execute """
    CREATE SEQUENCE #{prefix()}.next_id_seq;
    """

    execute """
    CREATE OR REPLACE FUNCTION #{prefix()}.next_id(OUT result bigint) AS $$
    DECLARE
      our_epoch bigint := 1000000000000;  -- <-Set this to your epoch
      seq_id bigint;
      now_millis bigint;
      shard_id int := #{shard_id()};  -- Set int value here per shard
      BEGIN
        SELECT nextval('#{prefix()}.next_id_seq') % 1024 INTO seq_id;

        SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
        result := (now_millis - our_epoch) << 23;
        result := result | (shard_id << 10);
        result := result | (seq_id);
      END;
    $$ LANGUAGE PLPGSQL;
    """
  end

  def down do
    execute """
    DROP SEQUENCE #{prefix()}.next_id_seq;
    """

    execute """
    DROP FUNCTION #{prefix()}.next_id;
    """
  end

  defp shard_id() do
    [_, id] = prefix() |> String.split("_")
    id
  end

end

4 Likes

Wow, this is very helpful, thank you!

@kelvinst this was my idea to get the tenant id from the database, but @l00ker’s solution of parsing the prefix string works for me.

2 Likes

That was what I suspected. Not a very good solution IMO, but I can't think of another one for it, so "let it be". :confused:

FYI – this should be

execute """
DROP FUNCTION #{prefix()}.next_id();
"""

2 Likes