Using a Postgres schema per Umbrella Application

Manzanit0 · April 28, 2021, 1:01pm

Hey all!

I’m currently working on pushing a bunch of different applications I have running in the cloud into an Umbrella, mostly so I can manage them as a whole.

One of the things I’ve been mulling over is that I would also like to use the same Postgres instance: at the end of the day it’s not a lot of data and it would make my life infinitely easier. However, since they ARE different applications (for example, a cooking web app and a telegram todo bot), I would like to seggregate the data into different Postgres schemas.

With my current knowledge, how I would do this is configure an Ecto.Repo per application and in each repo set the after_connect property such as: after_connect: {Postgrex, :query!, ["SET search_path TO cooking_app", []]}. This allows me to not have to deal with prefixes on a daily basis and have it isolated there as well as it just simply working out of the box with migrations and everything.

However, the main drawbacks I see to this approach are that (1) as I create more Ecto.Repo instances, I’m creating more connection pools, while I might want to share the same pool between all repos, and also that (2) I still haven’t figured out how to create the schemas easily before running the migrations from each Application.

How would you folks frame this problem? I’m interested in any thought, even if it isn’t necessarily a solution

Cheers,
Javier

dmitrykleymenov · April 28, 2021, 1:11pm

Why do you want to use the same pools? Cache is the main goal of pooling, and with completely different data it doesn’t matter one connect handles two different requests of two connects do the same. If total pool size is ok, ofc. I think so.

Manzanit0 · April 28, 2021, 1:14pm

Good point — I haven’t really looked into how pooling works, the only thing I was concerned about was that if each Repo’s pool opens up 10 connections, I might end up having 80 connections open and Postgres might not like it too much. But other than that, I haven’t looked into that topic to much yet.

dmitrykleymenov · April 28, 2021, 1:19pm

I think it’s even worse since your apps will use their own connections, so cache hit will be 8 times more for 10 per app, than 80 for “share-pooling-umbrella”. Even for 40, or less.

Manzanit0 · April 28, 2021, 2:44pm

There is always the possibility of implementing a custom pool, like this, which seems fairly lightweight.

And of course, there’s the option to use a shared Repo and create wrappers for each umbrella application which simply leverage the prefix option… but I’d like something a but less verbose.

cjbottaro · May 2, 2021, 6:06pm

We have a multi-tenanted application which has a similar problem, but even a little worse (we have like 800 Postgres schemas across 5 servers).

What we do is have 5 normal Ecto.Repo modules, one for each server. Then we wrote our own MultitenantRepo that (nearly) implements the Ecto.Repo behavior. Then we send our queries to that.

Ecto.Query.from(...)
|> MultitenantRepo.all()

Each function in MultitenantRepo figures out which normal repo to delegate to and how to set the proper :prefix if applicable.

It’s kind of a PITA because there are a lot of functions in the Ecto.Repo behaviour to implement, but with a little bit of clever metaprogramming, it’s not too bad.

I tried the custom pool route to do this, and this way is much simpler IMO.