Hello,
I’m working on an efficient way to represent user activity within an organization for sales analytics (emphasis on efficient). All sales want is the ability to see if a specific user in an org was doing a specific action on a given day and the percentage of users performing some action.
I was thinking of representing the information using a bit-vector where each user is a specific bit position. For example:
1> add_activity = fn(id, activity) -> Bitwise.bor(activity, Bitwise.bsl(1, id)) end
#Function<41.105768164/2 in :erl_eval.expr/6>
2> activity = add_activity.(10, 0)
1024
3> activity = add_activity.(17, activity)
132096
4> Integer.to_string(activity, 2)
"100000010000000000"
5> user_active? = fn(id, activity) -> Bitwise.bor(Bitwise.bsl(1, id), activity) == activity end
#Function<41.105768164/2 in :erl_eval.expr/6>
6> user_active?.(10, activity)
true
7> user_active?.(18, activity)
false
Big numbers for sure, but also pretty efficient.
Now I’d like to save this data, where a schema may look like:
schema "user_activity_metrics" do
field :entity_id, Types.UUID, primary_key: true, default: Types.UUID.null()
field :activity_date, :date, primary_key: true
field :uploads, ## Type???
field :downloads, ## Type???
field :emails, ## Type???
end
I’d love your input on how to represent bit-vectors Type
in Postgres that is Ecto
friendly and efficient. Efficient meaning in terms of space, but also being able to make simple queries. I know Postgresql supports types bit
and bit varying
, but last time I looked you couldn’t do bitwise searching if it was larger than a word IRC.
I could use {:array, :boolean}
, but not too sure on space.
Suggestions welcome