Ecto Preloading Associations - Why two querys and not just one by default?

I am trying to learn the basics of Ecto from Elixir School and I see in their examples that using this syntax Repo.all(from m in Movie, preload: [:actors]) leads to 2 queries being done to the database while using Repo.all(from(m in Movie, join: a in assoc(m, :actors), preload: [actors: a])) leads to only one query being made to the database, and my question is why does not the first syntax only do 1 query to the database?

Is there another way to fetch the associations in only one database call without the need to be such verbose as in the last syntax?

Example:
I would like to be able query with something like Repo.all_with_associations(Movie) or similar, and have only one dtabase call being made to fetch everything…

IEX

First Syntax - Two database calls

iex(42)> Repo.all(from m in Movie, preload: [:actors]
...(42)> )

10:26:40.102 [debug] QUERY OK source="movies" db=0.8ms queue=0.1ms
SELECT m0."id", m0."title", m0."tagline" FROM "movies" AS m0 []

10:26:40.130 [debug] QUERY OK source="actors" db=4.5ms queue=0.1ms
SELECT a0."id", a0."name", m1."id" FROM "actors" AS a0 INNER JOIN "movies" AS m1 ON m1."id" = ANY($1) INNER JOIN "movies_actors" AS m2 ON m2."movie_id" = m1."id" WHERE (m2."actor_id" = a0."id") ORDER BY m1."id" [[1]]
[
  %Example.Movie{
    __meta__: #Ecto.Schema.Metadata<:loaded, "movies">,
    actors: [
      %Example.Actor{
        __meta__: #Ecto.Schema.Metadata<:loaded, "actors">,
        id: 1,
        movies: #Ecto.Association.NotLoaded<association :movies is not loaded>,
        name: "Tyler Sheridan"
      },
      %Example.Actor{
        __meta__: #Ecto.Schema.Metadata<:loaded, "actors">,
        id: 2,
        movies: #Ecto.Association.NotLoaded<association :movies is not loaded>,
        name: "Gary"
      }
    ],
    characters: #Ecto.Association.NotLoaded<association :characters is not loaded>,
    distributor: #Ecto.Association.NotLoaded<association :distributor is not loaded>,
    id: 1,
    tagline: "Something about video games",
    title: "Ready Player One"
  }
]

Second Syntax - One Database call

iex(43)> query = from(m in Movie, join: a in assoc(m, :actors), preload: [actors: a])
#Ecto.Query<from m in Example.Movie, join: a in assoc(m, :actors),
 preload: [actors: a]>
iex(44)> Repo.all(query)

10:28:27.329 [debug] QUERY OK source="movies" db=4.8ms decode=0.1ms queue=0.1ms
SELECT m0."id", m0."title", m0."tagline", a1."id", a1."name" FROM "movies" AS m0 INNER JOIN "movies_actors" AS m2 ON m2."movie_id" = m0."id" INNER JOIN "actors" AS a1 ON m2."actor_id" = a1."id" []
[
  %Example.Movie{
    __meta__: #Ecto.Schema.Metadata<:loaded, "movies">,
    actors: [
      %Example.Actor{
        __meta__: #Ecto.Schema.Metadata<:loaded, "actors">,
        id: 1,
        movies: #Ecto.Association.NotLoaded<association :movies is not loaded>,
        name: "Tyler Sheridan"
      },
      %Example.Actor{
        __meta__: #Ecto.Schema.Metadata<:loaded, "actors">,
        id: 2,
        movies: #Ecto.Association.NotLoaded<association :movies is not loaded>,
        name: "Gary"
      }
    ],
    characters: #Ecto.Association.NotLoaded<association :characters is not loaded>,
    distributor: #Ecto.Association.NotLoaded<association :distributor is not loaded>,
    id: 1,
    tagline: "Something about video games",
    title: "Ready Player One"
  }
]

It’s not being verbose it’s about being explicit. Ecto will only provide what you specifically ask for by design. I think it’s a great design choice and you always know what the query will do. As far as a second query vs join, if performance is an issue you could test writing the query with a join.

2 Likes

I love explicit code, but not necessarily verbose code… I mean I don’t care how is being done, I am happy that the semantics are explicit enough for me to know what will happen. So I think that having this method name or a similar one Repo.all_preload(Movie) will be explicit without the need to expose lower level details.

This is to much verbose and my brain needs to parse it each time I found it in the code… I would prefer just to have to use it in advanced use cases.

Not having an high level approach has of my suggestion, this would be explicit enough without being to much verbose… But I am curious to know why this syntax does not perform only one query.

It cannot perform the preloading in one query automatically, because it might change the results especially if you’re using offsets or limits and one-to-many associations, which join more rows than your primary selected table has.

4 Likes

Also in your “Second Syntax - One Database call” section another limitation is that if each movie has ten actors in it, you are returning a row for each movie/actor pair.

i.e.:

  1. Movie 1, Actor 1
  2. Movie 1, Actor 2
  3. Movie 1, Actor 3

and this includes all the field of the movie each time which is a lot of duplicated data over the wire. There might be a way around this but I do not currently know of it.

1 Like

It can be a whole lot of duplicate data, and the way around it is indeed 2 queries :slight_smile:

2 Likes