Ecto (PostgreSQL) type recommendation

Hi,

So I’m looking at a project that involves setting special activity flags for users. I won’t go into the reasons why, bit suffice to say that efficiency is pretty important.

Now I have done something similar before where I created a bit-vector. For example:

add_activity = fn(id, activity) -> Bitwise.bor(activity, Bitwise.bsl(1, id)) end
id_active? = fn(id, activity) -> Bitwise.bor(Bitwise.bsl(1, id), activity) == activity end

The numbers do get quite large, but it’s still more efficient than lists or strings. The ability to do bitwise operations really helps performance when searching and updating.

However, I’d like to be able to save the data in PostgreSQL with ecto.

There will be a row per organization per day, but I only need to update the fact a user was active on a given day (so idempotent). It would be nice if we can keep the same efficiency here too (for inserts, updates and search).

Search: It would be nice to be able to say ‘get rows and position numbers that have bit set to 1’.
Update: It would be nice to be ably to apply a bitwise or to update an existing row.

I looked at the following Postgres types:

  • number: efficient, but no real way to do queries
  • bit / var bit: In theory awesome. I can probably write a fragment to do nice queries using bitwise operations. But, Postgres can’t seem to handle large numbers here. I need to do ugly chunking operations to, for example, cast from a numeric type to bit or to do boolean operations (it seems like 64 bits are some kind of internal limit).
  • bytea: possible, although searching could be a pain.
  • array: possibly the best in terms of updates and search, although casting could be awkward. Also, may not be that memory efficient ({:array, :boolean} maybe).

I’m leaning towards an array type (an array of booleans). I would, however, like casting to be efficient as well as memory efficient too.

Anyone have any recommendations?

Nothing immediately jumps to mind except maybe support your {:array, boolean} idea, combined with a library a la ExBin — ExBin v0.4.0.

Also that library supports chunking so maybe it won’t be so awkward.

Have you thought about using a custom ecto type? seems the perfect use-case to me.

Yes to both of you. Although it would be nice to be able to do:

update Metrics
             set user_activity[2] = 1
             where org_name='Bob's fish';

Wrap a few of those in fragments / custom macros:

https://hexdocs.pm/ecto/3.11.1/Ecto.Query.API.html#fragment/1-defining-custom-functions-using-macros-and-fragment

1 Like

Yeah…this is the problem I run into with bit. Same applies to bit shift operations.

Not sure I see the exact issue. Casting numbers to bit is documented on the linked page:

In addition, it is possible to cast integral values to and from type bit. Some examples:

44::bit(10) 0000101100
44::bit(3) 100
cast(-44 as bit(12)) 111111010100
‘1110’::bit(4)::integer 14

The problem is it doesn’t work with NUMERICS

=# select cast(2628571181105336077494795701369512280333343682331285668112708328669651138588157877092084196316261329556113852143036127852982354048364156779817408440315960 as bit(1000));
ERROR:  cannot cast type numeric to bit
LINE 1: select cast(262857118110533607749479570136951228033334368233...
               ^

An array of booleans seems to be fine.