Trouble doing subqueries with Ecto

I have a Blog app with a Newsfeed. It works like this:

  • Users can submit Posts.
  • Users can follow other Users.
  • When a user submits a Post, an item in the Newsfeed table is added.
  • Users can see a newsfeed of Posts that have been submitted by Users they are following.

I want to use Ecto.Query to get a list of newsfeed items from users that a given user is following.


Quick background. Here are the objects:

User

mix phx.gen.json Accounts User users email:string password:string

Post

mix phx.gen.json Content Post posts title:string content:string user_id:references:users

(users and posts have a has_many: and belongs_to: relationship.)

Follow

mix phx.gen.json Accounts Follow follows following_id:references:users followed_id:references:users

(When user A follows user B, a new Follow entry is created where following_id points to A, and followed_id points to B.)

Newsfeed

mix phx.gen.json Content Newsfeeditem newsfeeditems type:string, user_id:integer, content:string

Also, I am using Ecto v2.2.6, Phoenix 1.3


Now I want to query this stuff. For me to get a list of Newsfeeditems for a given user, it’s easy:

import Ecto.Query

query =
  from n in Newsfeeditem,
    where: n.user_id == ^user_id

Let’s say I’m user 1, and I am following users 2, 3 and 4. There are three entries in the follows table. To get all the corresponding newsfeeditems for those users, the query would look something like this:

query =
  from n in Newsfeeditem,
    where: n.user1_id in [2,3,4]

I want to make it dynamic. Here’s where I am lost. I would like to do something that resembles this:

subquery =
  from f in Follow,
    where: f.following_id == 1,
    select: f.follower_id

query =
  from n in Newsfeeditem,
    where: n.user_id in (Repo.all(subquery))

Obviously this doesn’t work, but I’m not sure how to structure these things correctly.

How can I select this via subquery? (Note I’m looking for the subquery solution specifically, but bonus points if there is a better way)

If using PostgreSQL I’d use an array aggregation function (as I recall it is literally called array_agg) then use the ‘test if in array’ call to test if the id is in that array aggregation. Though honestly in this case 2 query calls is not too bad unless they have a truly monstrous amount of callers (in which case the subquery would be better as a materialized view, perhaps split on some hash, but that is only if it does get truly monstrous in size, so not yet ^.^). :slight_smile:

You should be able to just join across them though I’d think?

1 Like

Yeah, I’m using joins to pull this off, but I want to know what the other ways to do it are :slight_smile:

This is what the join looks like:

query =
  from f in Follow,
    where: f.following_id == 1,
    join: n in Newsfeeditem,
    on: n.user_id == f.follower_id,
    select: n

Well any join can become a subquery, just slower that way. ^.^
Try for joins when at all possible, you can always refine them the majority of the time in their ON if needed.