How to store multi-referencing user activity logs in one table?

databases

#1

Hello, I am having hardtimes to create a table schema for logging user activities.

the nearest example is right here, elixir forum’s notification:
33%20PM
In this picture, i can list all my activity logs in one place but it references different types of data.

my usecase example is like this:

  • you’ve liked Post: Good Morning -> link to that specific post
  • Selena started to follow you -> link to that user’s profile
  • new video clip Video: Apple Tricks from Channel: Fun & Fun -> link to that video & link to that channel
  • Sally commented Comment: I LOVE YOU!! on your Post: Hi There -> link to that user’s profile & link to that comment (or post)

My first approach is like this:

table: user_activities
id / activity_type / user_id(me) / user_id(others) / post_id / video_id / comment_id / channel_id …

I think i can achieve this by filling activity_type field and two referecning fields.
but also have many worries about having a lot of null fields as data_type grows.

any advice for this kind of concern?


#2

Seems reasonable to me.

The approach you’ve suggested is the one recommended by the Elixir Ecto docs., if that gives you more confidence :slight_smile: https://hexdocs.pm/ecto/Ecto.Schema.html#belongs_to/3-polymorphic-associations


#3

Just some general observations.

You seem to be arbitrarily “fixing” constraints because you’re rushing towards the finish line - by making arbitrary decisions you are trying to artificially narrow the decision space in order to eliminate potentially confusing options (i.e. “I don’t know where to start”).

So for example you simply decide that your “user activity log” has to be in a single table. Why? I’m not saying they shouldn’t be in a single table but it may be worth exploring how you came to that conclusion.

the nearest example is right here, elixir forum’s notification

So right there we’re looking at just one possible representation of your information. Ideally we would like to get this type of information in a single query and even that isn’t a hard and fast rule as there are other ways to aggregate information - but there really is no constraint here to keep it in a single table (but there may be other reasons why that might be a good idea).

my use-case example is like this:

Rather than heading straight for the database, start defining and categorizing the “bits and pieces” in terms of your problem space. Your examples seem to suggest:

  • “Liking another post” notification
  • (implying) “My post being liked” notification
  • “Being followed” notification
  • “New video” notification
  • “My post being commented on” notification
  • etc.

Then start looking at the various “bits and pieces” and start asking yourself some questions:

  • What do some of the pieces have in common
  • What are the actual differences between the pieces
  • Is there a way I can factor out the differences, unifying the containing commonalities
  • Is there a way I can extract out the contained commonalities, making the different containing types much clearer.

(The notion of exploring commonalities/variabilities was formalized as Commonality/Variability Analysis for multi-paradigm design and OO by James O. Coplien (1998) (who uses it now in Data, Context, and Interaction) and picked up by Alan Shalloway (2004). But commonality isn’t necessarily always expressed in inheritance heirarchies but can also be exploited through composition and containment.)

My first approach is like this:

Realize that there will be lots of times where your first attempt will not be quite right, possibly even totally wrong. Just be committed to making the necessary changes when the “pain points” of your current solution start to become obvious.

The trick is to find the “cheapest” way to exploring your particular problem space to find its “natural constraints” around which you want to build your application. So exploit low effort (non-coding) explorations whenever you can - which doesn’t mean that you should avoid spikes or prototypes, just know what they cost (and don’t get attached).


#4

thanks, the keyword polymorphic is the one i was looking for !
i’ve also find an example in the newly added section in Programming Ecto.

happy coding :slight_smile:


#5

I find that in my work lately what I do is I mix models a bit. I work primarily with PostgreSQL, which has good support for both relational data structures and document store data structures.

In the case not unlike that which you’re talking about, where I have a set of well defined values (eg. activity_type, user_id) and a set of values that are contextually defined (for example by activity_type), I’ll have have a JSON field which encompasses the options that vary across types. So a video link may be needed for for a video post, but a different set of data may need an array of user_id values…

This isn’t a silver bullet by any means, and I’ve only a few early efforts at how to best mix relational structured data and document store “unstructured” data. There are very much trade-offs and there are gating factors which would qualify/preclude this approach, but I do think there’s some value in having it in the toolbox.