Preloading top comments for posts in Ecto using window functions

Note: I am using MySQL.

I have two tables named Post, Comment

  schema "posts" do
    field: id, string
    field :title, :string
    field :slug, :string
    field :active, :boolean, default: true

  schema "comments" do
    field: id, string
    field: post_id, string
    field :title, :string
    field :body, :string
    field :email, :string
    field :meta, :map
    field :active, :boolean, default: false

I am using Ecto for the database queries in Elixir. Currently, I am having the issue of trying to get all the posts in the database, and the most recent comment for those posts, if available. E.g:

Post 1 ──── post1__last_comment_id 
Post 2 ──── post2__last_comment_id      

Post 3 ──── nil

I have tried join_left on ==, but that returns all the comments related to the post not just the most recent one.

I tried also grouping_by comment.post_id and get the max id, while that worked, that is not the desired solution as there is a requirement for it to be based on date rather than id of the comment.

One more thing I tried after some suggestions on StackOverflow is using a window function since I cannot use a lateral join in MySQL. I tried to write an Ecto query for that and ended up with this:

Repo.all from p in Post, as: :post,
           join: c in assoc(p, :comments),
           inner_join: latest in subquery(
            from c1 in Comment, 
			select: {c1.start_date, over(row_number(), :parent)},
			windows: [parent: [partition_by: c1.post_id]]	
           ), on: ==,
           preload: [comments: c]

Unfortunately, that did not work and I am receiving this MySQL error but I can’t understand how to fix it:

** (Ecto.SubQueryError) the following exception happened when compiling a subquery.

** (Ecto.QueryError) subquery/cte must select a source (t), a field (t.field) or a map, got: `{&0.start_date(), over(row_number(), :department)}` in query:

from c0 in MyProject.Comments.Comment,
  windows: [parent: [partition_by: [c0.post_id]]],
  select: {c0.start_date, over(row_number(), :parent)}

Is it possible to be done using Ecto?

The error message is complaining about this being a tuple; Ecto can’t use that for a subquery. Try a map instead: %{start_date: c1.start_date, ...}

1 Like

After doing this, I got the following error:

** (MyXQL.Error) (1064) (ER_PARSE_ERROR) You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(PARTITION BY post_id ORDER BY start_date DESC) AS row_numberFROMcomment’ at line 1`

I looked up what could cause this error, and I believe it might have to do with MySQL version being 5.7 (not supporting window functions).

If that is the case what other ways I could solve this problem without updating the MySQL version.

If you can’t upgrade to a version that supports window functions, other options:

  • write a query (probably with subqueries somehow?) that emulates what window functions do. This will probably be complicated and/or slow - that’s why window functions are useful.

  • reshape your data to make the query cheap. If your system has the same kind of expected usage pattern (lots of reads, relatively few writes, few/no deletes) as posts + comments, you could add a belongs_to :last_comment, Comment to Post. The code that creates a new comment would also update the post’s last_comment_id column.