Aggregate Booleans in Postgres with Ecto

I have a Phoenix SaaS app. Each account can have a subscription to different plans, and there are also add ons. For example, a basic plan comes with x amount of users but they can also add additional users.

To manage this I have a Plan table and a PlanSubscription table. So an account has one or many plan subscriptions.

The schema for plans is defined like

field :num_users, :integer
field :this_feature, :boolean
field :other_feature, :boolean
...

For the number of users, this is easy to aggregate with an ecto query like

query = from(ps in PlanSubscription,
                 join: p in Plan, on: p.id == ps.plan_id,
                 select: %{
                   num_users: sum(p.num_users)
                 },
                 where: ps.account_id == ^account_id)

However, what I also want to have is if one of the plans has an feature boolean true, and another doesn’t, to use the true value. I was hoping to use

max(p.this_feature)

But found that Postgres does not aggregate booleans this way.

Any suggestions for an Elixir/Ecto friendly way to handle this?

Would a boolean ‘or’ work? I.e., where: p.this_feature || p.other_feature

From a SQL point of view the safest way I’ve found to do this in multiple engines over the years would be a sum after turning the bool values into ints and then checking if you have more than 1. You can roll this into a single long winded case-sum-case/cast statement. It’s a bit crappy but it’s an approach that reliably works and is easily understood by others, rather than relying on quirks, and actually accurately represents what you’re trying to do.

2 Likes

This is what I was hoping to avoid, wanted to use the native boolean datatype. Coming from a MSSQL background integer datatypes are probably the best choice to maintain compatibility with other DBMS.

I think I will convert my booleans to integers and then use a max aggregate.

Rather than convert the datatype you’re storing, you could just do it on the “view” that you’re pulling (sorry I’ve written this off the cuff, I know its not valid for Postgres, but should probably work with MSSQL if you’re using bit to represent your bools);

CAST((case when SUM(CAST(myboolfield as int)) > 1 then cast 1 else 0 end) as bit)

I guess it’s up to you to decide whats better/worse. The mangling on the query, altering your stored data type


Alternatively you could do something with a trigger that updates on the parent plan when someones subscription changes. I dunno if thats suitable for your use case either shrug

1 Like

I think for me, I would rather just store as an int. But I appreciate this response.

The solution would be to use the bool_and or bool_or aggregate functions from postres:

query = from ps in PlanSubscription,
                 join: p in Plan, on: p.id == ps.plan_id,
                 grouo_by: ps.id,
                 select: %{
                   num_users: sum(p.num_users),
                   foo_feature: fragment("bool_or(?)", p.foo_feature)
                 },
                 where: ps.account_id == ^account_id
4 Likes

Thanks! I wasn’t aware of these functions in postgres.

Here is the docs page for those curious about Postgres aggregates:

https://www.postgresql.org/docs/9.5/static/functions-aggregate.html

1 Like