Triplex - a complete solution to multi-tenancy with PG schemas

Hey everyone!

Well, we made this lib a while ago and now we've decided to finally go public with it! It's a tool for creating and managing multi-tenant applications using Postgres schemas.

We’ve recently released version 1.1.2 with support for ecto 2.2.0-rc. :slight_smile:

I know there are some other libraries for the same purpose, like https://github.com/Dania02525/apartmentex and https://github.com/promptworks/tenantex. At the time we wrote this lib, the other libraries were a little bit out of date with their pull requests, didn't have everything we wanted and, worst of all for us, were a little bit intrusive (I'll talk about that later).

So we decided to write a simple one, and now here we are, another option for you. Here is a list of features we missed in the other libraries:

  • Mix tasks for migrating and rolling back all tenant schemas
  • Plugs to load the tenant from a param, session value or subdomain, and to ensure a tenant is loaded
  • A way to infer the name of the tenant from a given struct that represents the tenant in your application
  • Smarter configuration (see the sketch after this list), like:
    • Setting the default Repo on which to execute the tenant management tasks
    • Reserving tenant names, like api or www, if you're using subdomains to load your tenant
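
To make the configuration part concrete, this is the kind of thing we mean (a minimal sketch, assuming a repo called MyApp.Repo; check the docs for the exact keys and options):

# config/config.exs
config :triplex,
  repo: MyApp.Repo,                  # default Repo for the tenant management tasks
  reserved_tenants: ["www", "api"]   # names that can never be used as tenants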

There are also some differences in the way the libs handle queries and commands with the prefix. While tenantex and apartmentex are a little bit intrusive to your Repo or the way you use it, with triplex we tried to stay as unintrusive as we could.

For example, here is how to make a query applying a prefix from a tenant struct called org in raw ecto:

Repo.all(User, prefix: org.subdomain)

Now the same query using a properly configured triplex:

Repo.all(User, prefix: Triplex.to_prefix(org))

The same with tenantex:

# first you need to change your repo to use their repo not ecto's
defmodule Repo do
  use Tenantex.Repo, otp_app: :your_app
end

Repo.all(User, prefix: org)

Finally, with apartmentex:

Apartmentex.all(Repo, User, org)

The reason behind this decision is simple: being less intrusive makes triplex a lot easier to keep up to date with future versions of ecto.

And one last thing: we put some extra effort into our docs! That's a bonus for anyone who wants to use it! :wink:

Feel free to try it and send us feedback!

23 Likes

I just did a mix phx.new triplex_test using Phoenix 1.3 and got this error.

$ mix deps.get
Running dependency resolution...

Failed to use "plug" (version 1.4.3) because
phoenix (version 1.3.0) requires ~> 1.3.3 or ~> 1.4
phoenix_ecto (version 3.2.3) requires ~> 1.0
phoenix_html (version 2.10.3) requires ~> 1.0
triplex (version 1.1.2) requires ~> 1.3.5
mix.lock specifies 1.4.3

Can the required plug version be relaxed?

2 Likes

Yes! For sure, sorry about that! Version 1.1.3 released with a relaxed plug dependency.
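
For anyone curious, the problem was that a requirement like ~> 1.3.5 only allows plug >= 1.3.5 and < 1.4.0, while Phoenix 1.3 pulls in plug 1.4.x. The new requirement is broader, something along these lines:

# mix.exs (triplex), roughly:
{:plug, "~> 1.3.5"}   # old: >= 1.3.5 and < 1.4.0, so plug 1.4.3 was rejected
{:plug, "~> 1.3"}     # new: >= 1.3.0 and < 2.0.0, so plug 1.4.3 resolves fine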

And thanks for the report!

Sweet! I’ll give it a try… thanks! :slight_smile:

1 Like

This looks great!

If you have the time, would you be able to write a quick article with an example using it?

Thanks

1 Like

Sure! I will take some time this week for it. :slight_smile:

2 Likes

I just saw the package featured on Elixir Weekly and it seems running queries on a single tenant is super slick. How about adding some helpers to allow for cross-tenant queries? I haven't got much experience with Postgres, but the following SQL seems to allow for that:

SELECT id, 'tenant_01' AS tenant FROM tenant_01.some_table
UNION ALL
SELECT id, 'tenant_02' AS tenant FROM tenant_02.some_table

Especially if the tenant tables are simple clones of each other there shouldn't be any mismatch between the selects.
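
In the meantime, I guess you could drop down to raw SQL through the adapter for this kind of query. Something like this untested sketch (table name made up):

sql = """
SELECT id, 'tenant_01' AS tenant FROM tenant_01.some_table
UNION ALL
SELECT id, 'tenant_02' AS tenant FROM tenant_02.some_table
"""

# Returns a %Postgrex.Result{} with :columns and :rows
{:ok, result} = Ecto.Adapters.SQL.query(Repo, sql, [])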

Edit: Seems this would first need some work in ecto: https://github.com/elixir-ecto/ecto/issues/1549

2 Likes

Hi @kelvinst, did you think about other approaches to multi-tenancy? I totally bought into it until I read a critical blog post by the creators of a well-known Ruby gem that uses schema-based multi-tenancy, and the possible solution it suggests. It would be great to know your opinion, given the effort you put into developing Triplex.

3 Likes

Hi, @marciol!

Well, we’ve bought the idea on a previous project, but didn’t come to the point of having the article’s mentioned issues. But well, I guess that there’s no perfect solution. Everything will have its pros and cons and the post you mention is focused on the cons.

While all of the cons are true (I guess), there are also some pros, and one of them (which made me decide for this technique) is the productivity and code organization. I guess that the pros I had on the start of the project totally compensate some issues running migrations or having to pay some extra money when I have more customers (I guess with more customers I’d have more money too).

The other cons he list on the article are easy to workaround, and he mentions how to do it on the end of the article.

Coming to a conclusion: “there’s no silver bullet” and you must know that before using any tool. The tools are made to solve one or two problems, not all of them. I guess that, once you start having problem with the tool, it can be an indicator you’re using it wrongly or it’s the wrong tool for the job. Using PG schemas to separate your tenants is good in some cases, but for a lot of data in lot of tables and with a lot of changes on the db, there are better alternatives for sure.

5 Likes

Yep! It’s actually the only place that needs work, since that to execute queries inside the tenants we don’t use anything from Triplex, and yes a common option on Ecto queries and commands.

I’m pretty sold on the separate schema multi-tenancy solution. The per table tenant_id column approach (which I’m using in a python app) works well until one of your tenants does something requiring you to restore their data from a backup. I’ll leave that potential nightmare to your imagination to sort out. With separate schemas this problem doesn’t exist. Each schema can be independently restored and/or moved to a different PostgreSQL instance if needed.

Instead of UUIDs I decided to use a "Snowflake-style system" that uses a PostgreSQL function to generate a bigint which is unique and increments based on time, so sorting isn't an issue. Just use a different shard_id per tenant (or even per PostgreSQL instance) to be sure of uniqueness in case you need to relocate the schema later. The really cool thing is that, given any ID, you can extract the shard_id and determine which tenant it belongs to (if tenant_id == shard_id). This can potentially save some database queries in some cases.

If you’re building a JSON API and using JavaScript on your front-end (SPA etc.) you need to convert the bigint ID’s to strings before encoding to JSON as JavaScript can’t properly handle the bigint ID’s. It will round them causing you fits until you figure that bit out. You can use an Ecto custom type to do the conversion automatically allowing you to pass ID’s as strings or integers in your queries but always get the ID’s as strings in the query result so you can just pass it directly to Poison etc.

These are the "pros" that sold me. Admittedly I haven't had any experience with this solution at the scale of the blog author, but I would think that with a little planning you could utilize several PostgreSQL instances to solve most of those issues, since you can at least move the separate schemas between instances. I've also looked at table inheritance as a potential solution to some problems. Table inheritance is pretty cool on its own once you get to looking at it. It just depends on how much complexity you want to introduce in the long run.

@kelvinst - Thank you for Triplex! It’s simple, stays out of your way and gets the job done. :grinning:

4 Likes

@kelvinst does this library support using multiple MySQL databases instead of PG schemas?

1 Like

@emoragaf No, not yet. I haven't implemented it since we didn't use MySQL.

Maybe in the future we will have it, but it’s not in our plans to implement it now.

If you (or anyone else) know how to do it, and have time to contribute, please feel free to open a PR and it will be accepted for sure!

I'm curious to see an example of this approach. One part that's not clear is how to create the function inside tenant_migrations. In order to create the id generator function, I'd need to be able to do something like:

execute """
CREATE FUNCTION #{prefix}.id_generator ......
....
tenant_id := #{TenantHelper.tenant_handler(prefix).id};
....
"""

But I can't see how to find the prefix a migration is running with. Any ideas?

There is the Ecto.Migration.prefix/0 function, which is available in all your migrations and returns the prefix the current migration is running with.

Just curious, what is this? :grimacing:

Yes. As @kelvinst points out you can use the prefix() function in your migrations. Just change #{prefix} in your code to #{prefix()} and you’re set.

To set the shard_id I've been using a private function in the migration file that splits the prefix to grab the id, given a prefix something like "Tenant_1":

defp shard_id() do
    [_, id] = prefix() |> String.split("_")
    id
end

BTW - Here is the function to extract the shard_id from a generated id:

...

# ^^^/>>>/<<< come from Elixir's Bitwise module (use Bitwise)
def extract(item_id) when is_integer(item_id) do
  # strip the timestamp bits, then shift off the 10 sequence bits, leaving the shard_id
  (item_id ^^^ ((item_id >>> 23) <<< 23)) >>> 10
end

That function was ripped from Ecto.InstaShard. You may want to have a look around that repo as well; there are some really interesting things to look at there.

Here’s the entire migration I put together for the id generator for anyone that might be interested:

defmodule YourModule.Repo.Migrations.CreateSnowflake do
  use Ecto.Migration

  def up do
    execute """
    CREATE SEQUENCE #{prefix()}.next_id_seq;
    """

    execute """
    CREATE OR REPLACE FUNCTION #{prefix()}.next_id(OUT result bigint) AS $$
    DECLARE
      our_epoch bigint := 1000000000000;  -- <-Set this to your epoch
      seq_id bigint;
      now_millis bigint;
      shard_id int := #{shard_id()};  -- Set int value here per shard
      BEGIN
        SELECT nextval('#{prefix()}.next_id_seq') % 1024 INTO seq_id;

        SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
        result := (now_millis - our_epoch) << 23;
        result := result | (shard_id << 10);
        result := result | (seq_id);
      END;
    $$ LANGUAGE PLPGSQL;
    """
  end

  def down do
    execute """
    DROP SEQUENCE #{prefix()}.next_id_seq;
    """

    execute """
    DROP FUNCTION #{prefix()}.next_id;
    """
  end

  defp shard_id() do
    [_, id] = prefix() |> String.split("_")
    id
  end

end

4 Likes

Wow, this is very helpful, thank you!

@kelvinst this was my idea to get the tenant id from the database, but @l00ker’s solution of parsing the prefix string works for me.

2 Likes

That was what I suspected. Not a very good solution IMO, but I can't think of another one for it, so "let it be". :confused:

FYI – this should be

execute """
DROP FUNCTION #{prefix()}.next_id();
"""

2 Likes