Selecting fields in preloaded joins Ecto

alaister · July 1, 2016, 11:03pm

Hi,
I have the following ecto query:
post_query = from p in Post, where: p.id == ^id, join: c in assoc(p, :comments), join: u in assoc(c, :user), preload: [comments: {c, :user}]

How do I select the following fields: p.title, p.content, p.comments, c.comment, c.inserted_at and u.name?
I’m having trouble with this because of the wired joins I’m doing (Still learning). Also this query actually runs two queries on the database. Is there any way to do it all in one?

Thanks,
Alaister

Molly101 · July 1, 2016, 11:59pm

You can use a pattern matching here to get only those fields You’re interested in, i.e. %{title: title, content:content, comments: comments, user: %{name: name}} = Repo.get!(post_query) and then You can pass comments though Enum.Map. In Phoenix app You can also use View - “show.json” approach to convert result of Your ECTO query to a Map with the fields that You need… Because anyway You can’t just encode ECTO query result to JSON…

Linuus · July 2, 2016, 9:16am

This should make only one DB query.

post_query = 
  from p in Post,
    where: p.id == ^id,
    join: c in assoc(p, :comments),
    join: u in assoc(c, :user), 
    preload: [comments: {c, user: u}]

Regarding selecting association fields, you can do something like this (not tested)

post_query = 
  from p in Post,
    where: p.id == ^id,
    join: c in assoc(p, :comments),
    join: u in assoc(c, :user), 
    select: %{id: p.id, title: p.title, comments: p.comments, name: u.name}

Not sure exactly what you want to get back but that’s one way to select fields

alaister · July 2, 2016, 6:52pm

Thank you @Linuus! I was missing the user: u on the end there. Still not entirely sure what that is doing?

Also the data structure I’m trying to achieve looks like the following:

%{
  content: "This is some content",
  id: "46b7e047-a1f1-4fe1-b7f1-91c2aeb5910d",
  inserted_at: #Ecto.DateTime<2016-06-29 11:49:53>,
  title: "This is a new post"
  comments: [%{
    comment: "Some comment!",
    id: "a94fc738-69e0-44e1-9e39-3204ad7f3878",
    inserted_at: #Ecto.DateTime<2016-07-01 22:48:41>,
    user: %{
      name: "Alaister Young",
    },
    %{
      comment: "Some comment!",
      id: "a94fc738-69e0-44e1-9e39-3204ad7f3879",
      inserted_at: #Ecto.DateTime<2016-07-01 22:48:42>,
      user: %{
        name: "Alaister Young",
      }
    }
  ]
}

The problem I’m having with this structure is I’m not sure how to deal with the list of comments with the nested users.

Thanks again!

Linuus · July 3, 2016, 5:32pm

The user: u tells ecto to use that join thingy for the preload and not issue another request to fetch that association. (I think… I’m quite new to Ecto myself.)

So this query:

post_query = 
  from p in Post,
    where: p.id == ^id,
    join: c in assoc(p, :comments),
    join: u in assoc(c, :user), 
    preload: [comments: {c, user: u}]

returns that structure, right? So what you want to do now is limit the number of selected fields for performance reasons?

alaister · July 3, 2016, 6:20pm

That is completely correct!

Linuus · July 3, 2016, 6:58pm

I had a discussion about this on IRC and it seems to often be better to not join manually and just preload the data (and let it do multiple queries).

This would issue one request to the DB:

post_query = 
  from p in Post,
    where: p.id == ^id,
    join: c in assoc(p, :comments),
    join: u in assoc(c, :user), 
    preload: [comments: {c, user: u}]

But it would return a lot of excessive data because the join would cause the Post data to be sent for each comment (due to how SQL works). Ecto hides this though by throwing away the excessive data.
You can se this if you run something like the query I wrote before:

post_query = 
  from p in Post,
    where: p.id == ^id,
    join: c in assoc(p, :comments),
    select: %{id: p.id, comments: c}

It will return one struct for each Comment so if a Post has two comments:

[%{
  id: 1,
  comments: %{id: 123, text: "foo"}
}, %{
  id: 1, 
  comments: %{id: 124, text: "Next comment"}
}]`

You don’t see this behaviour when running without the select statement because Ecto takes care of mapping the data.

When just preloading:

post_query = 
  from p in Post,
    where: p.id == ^id,
    preload: [comments: [:user]]

This would issue three requests but no excessive data so in many cases it is actually more efficient (Ecto even does the requests in parallell when possible).
For example, if the same user has made all the comments, the user would only be fetched once. When joining, the user would be returned for each comment.

So, unless you know you have a bottleneck here it may not be worth it…

(I’m quite new to Ecto, so if someone sees something wrong here, please let me know! )

alaister · July 3, 2016, 8:27pm

Ahh okay that makes a lot of sense! Thanks again

My only question now is a much more general Ecto question and that’s how to order the comments in descending order. I understand that it has something to do with order_by: [desc: c.inserted_at] but I’m not sure where to put this!

Thanks,
Alaister

alaister · July 3, 2016, 8:49pm

I think I have solved this one:

comment_query = from c in Comment,
                order_by: [desc: c.inserted_at],
                preload: :user

post_query = from p in Post,
             where: p.id == ^id,
             preload: [comments: ^comment_query]

post = Repo.one(post_query)

Also would selecting only the fields I needed make my queries more performant? If so how might I do it with the preloaded users without the join?

Thanks,
Alaister

shahryarjb · May 29, 2019, 3:34pm

Hello, I have a problem when I want to use select in preload, please see this code

query = from u in UserSchema,
       where: u.id == ^id,
       left_join: c in assoc(u, :subscribers),
       preload: [subscribers: c],
       select: %{
         id: u.id,
         name: u.name,
         mobile: u.mobile,
         last_name: u.last_name,
         mobile: u.mobile,
         status: u.status,
         role: u.role,
         inserted_at: u.inserted_at,
         updated_at: u.updated_at,
         subscribers: c
       }
   Repo.one(query)
  end

and error:

** (Ecto.QueryError) the binding used in `from` must be selected in `select` when using `preload` in query:

I should use select on UserSchema, how can I fix this ?

benwilson512 · May 29, 2019, 3:49pm

query = from u in UserSchema,
       where: u.id == ^id,
       left_join: c in assoc(u, :subscribers),
       preload: [subscribers: c]

   Repo.one(query)

shahryarjb · May 29, 2019, 4:01pm

if I use this, it shows me user password I need to select fields I want not all of them

benwilson512 · May 29, 2019, 4:03pm

When you say the user password, do you mean the user password hash or the user’s actual password? The user’s actual password should never, EVER be stored in the database.

shahryarjb · May 29, 2019, 4:09pm

I mean the user hash password and shows the user status, I want to show my user the query selected.
if I call the u: u in my select this problem will be solved like this

   query = from u in UserSchema,
       where: u.id == ^id,
       left_join: c in assoc(u, :subscribers),
       preload: [subscribers: c],
       select: %{
         u: u,
         user_info: %{
           id: u.id,
           name: u.name,
           mobile: u.mobile,
           last_name: u.last_name,
           mobile: u.mobile,
           inserted_at: u.inserted_at,
           updated_at: u.updated_at,
         }
       }

   Repo.one(query)

but it shows me Users all fields again

benwilson512 · May 29, 2019, 4:13pm

query = from u in UserSchema,
       where: u.id == ^id,
       left_join: c in assoc(u, :subscribers),
       preload: [subscribers: c],
       select: struct(u, [
         :id,
         :name,
         :mobile,
         :last_name,
         :mobile,
         :status,
         :role,
         :inserted_at,
         :updated_at,
       ])

shahryarjb · May 29, 2019, 4:15pm

** (Ecto.QueryError) the binding used in `from` must be selected in `select` when using `preload` in query:

from u0 in BankError.ClientApi.Auth.UserSchema,
  left_join: s1 in BankError.ClientApi.Factor.SubscriberSchema,
  on: s1.user_id == u0.id,
  where: u0.id == ^...,
  select: struct(s1, [:id, :name, :mobile, :last_name, :mobile, :status, :role, :inserted_at, :updated_at, :subscribers]),
  preload: [subscribers: s1]

    (ecto) lib/ecto/repo/queryable.ex:132: Ecto.Repo.Queryable.execute/4
    (ecto) lib/ecto/repo/queryable.ex:18: Ecto.Repo.Queryable.all/3
    (ecto) lib/ecto/repo/queryable.ex:66: Ecto.Repo.Queryable.one/3

benwilson512 · May 29, 2019, 4:46pm

oops, I had a typo, fixed!

shahryarjb · May 29, 2019, 5:17pm

I have this error,

** (Postgrex.Error) ERROR 42703 (undefined_column) column u0.subscribers does not exist

    query: SELECT u0."id", u0."name", u0."mobile", u0."last_name", u0."mobile", u0."status", u0."role", u0."inserted_at", u0."updated_at", u0."subscribers", s1."id", s1."expire_time", s1."status", s1."user_id", s1."inserted_at", s1."updated_at" FROM "users" AS u0 LEFT OUTER JOIN "subscribers" AS s1 ON s1."user_id" = u0."id" WHERE (u0."id" = $1)
    (ecto_sql) lib/ecto/adapters/sql.ex:605: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto_sql) lib/ecto/adapters/sql.ex:538: Ecto.Adapters.SQL.execute/5
    (ecto) lib/ecto/repo/queryable.ex:147: Ecto.Repo.Queryable.execute/4
    (ecto) lib/ecto/repo/queryable.ex:18: Ecto.Repo.Queryable.all/3
    (ecto) lib/ecto/repo/queryable.ex:66: Ecto.Repo.Queryable.one/3

it should be noted I fixed my problem with long way but if it works I think my code will be Shorter

benwilson512 · May 29, 2019, 5:20pm

One more typo, try again.