Where do you preload associations?

It’s not, and that’s a good thing. By default Ecto runs one query per association (in parallel), transferring the least amount of data from the database to Elixir. Using a join means fewer queries, but it also means more duplicated data is transferred (joins duplicate data in most cases). There’s no guarantee that making that switch will actually be more performant. Therefore, if you want associations to be preloaded from joins, Ecto requires you to state that explicitly in the query.
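For reference, here is a minimal sketch of the two usual ways to get the default, separate-query preload. The names `MyApp.Repo`, `MyApp.Post` and its `has_many :comments` association are hypothetical and only serve as an illustration.

```elixir
import Ecto.Query

# Hypothetical modules: MyApp.Repo and a MyApp.Post schema with has_many :comments.
alias MyApp.{Repo, Post}

# Option 1: declare the preload in the query.
# Ecto fetches the posts, then issues a second query for their comments.
posts =
  Post
  |> preload(:comments)
  |> Repo.all()

# Option 2: preload on structs that were already loaded.
# This also runs a separate query for the comments.
posts =
  Post
  |> Repo.all()
  |> Repo.preload(:comments)
```

Neither form adds a join, so no post row is repeated per comment.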


As LostKobrakai said, it’s actually a good feature that Ecto only does what you ask and performs no “guessing”.

You can avoid the extra query if you preload on a join. In this case Ecto is smart enough to know that the comments have already been retrieved, because you’ve been explicit about what to get and how to get it.

      # Assumes `query` already has a named binding `as: :post`.
      from([post: post] in query,
        left_join: comments in assoc(post, :comments),
        as: :comments,
        preload: [comments: comments]
      )

This has the side effect that you can, for example, filter the comments in the query and still get them “preloaded”.
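A sketch of that filtering case, assuming a hypothetical `MyApp.Post` schema whose comments have an `approved` boolean field (the field and module names are made up for illustration):

```elixir
import Ecto.Query

# Hypothetical: MyApp.Post has_many :comments, and comments have an `approved` field.
query =
  from(p in MyApp.Post,
    join: c in assoc(p, :comments),
    where: c.approved == true,
    preload: [comments: c]
  )

# MyApp.Repo.all(query) returns posts whose `comments` field holds only the
# approved comments, loaded by this single query.
```

Because this is an inner join with a `where` filter, posts without any approved comment are left out of the result entirely.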


I’d be curious: I have User has_many Enrollments. I’m loading Enrollments, but I need user.name with each of them.

Right now I do a preload, but that feels inefficient. Isn’t this a situation where a join would actually make more sense? It’s a join, but a single query.

How does one actually measure and compare these things, please?

You should run some benchmarks on your production setup to figure this out. The result depends on many factors, and each approach puts a different load on different parts of your infrastructure (app vs DB).
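A very rough way to get first numbers is to time both variants with Erlang’s `:timer.tc/1`. The `PreloadTiming` module below is a hypothetical helper, not part of Ecto, and a dedicated benchmarking tool would give more reliable results.

```elixir
defmodule PreloadTiming do
  # Hypothetical helper: runs `fun` `runs` times and returns the
  # average wall-clock time per run in microseconds.
  def average_us(fun, runs \\ 100) when is_function(fun, 0) and runs > 0 do
    total =
      Enum.reduce(1..runs, 0, fn _run, acc ->
        {elapsed_us, _result} = :timer.tc(fun)
        acc + elapsed_us
      end)

    total / runs
  end
end

# Usage sketch, assuming hypothetical Repo and Enrollment modules and `import Ecto.Query`:
#
#   separate = fn -> Repo.all(from e in Enrollment, preload: [:user]) end
#   joined = fn ->
#     Repo.all(from e in Enrollment, join: u in assoc(e, :user), preload: [user: u])
#   end
#
#   PreloadTiming.average_us(separate)
#   PreloadTiming.average_us(joined)
```

This only measures time as seen from the application, so it says nothing about the load each variant puts on the database itself.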

Feels like you’re right about this one, but without the actual numbers this is just guesswork.

Exactly. Loading the user name for each enrollment is probably optimal as a single join. However, loading a post body for each comment (same overall query structure) would be horribly inefficient as a join because the post body (a large string) would be copied dozens or even hundreds of times. Ecto has no a priori way of knowing the shape or size of your data. The default adds a fixed cost (latency) in order to avoid basically all cases of combinatorial explosion.
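To put rough numbers on the duplication, here is a back-of-the-envelope estimate. The `TransferEstimate` module and all sizes are illustrative assumptions; it ignores protocol overhead and assumes every parent has the same number of children.

```elixir
defmodule TransferEstimate do
  # Single joined query: every child row carries a full copy of its parent's columns.
  def joined(parents, children_per_parent, parent_bytes, child_bytes) do
    parents * children_per_parent * (parent_bytes + child_bytes)
  end

  # Two separate queries: each parent row and each child row is sent once.
  def separate(parents, children_per_parent, parent_bytes, child_bytes) do
    parents * parent_bytes + parents * children_per_parent * child_bytes
  end
end

# Illustrative numbers: 10 posts with 10 KB bodies, 100 comments of 200 bytes each.
TransferEstimate.joined(10, 100, 10_000, 200)
# => 10_200_000

TransferEstimate.separate(10, 100, 10_000, 200)
# => 300_000
```

With a small parent, such as a user name of a few dozen bytes, the two results end up close to each other, which is why the join looks attractive in the enrollment case.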
