Ecto preload for tag has_many tags

I have the following schema:

defmodule Tag do
  schema "tags" do
    has_many :tags, __MODULE__
    belongs_to :tag, __MODULE__
  end
end

If I want all of the relationships preloaded for serialization how would I ensure that? Doing

preload([:tag, :tags])

leaves me with Ecto.Association.NotLoaded. So doing:

preload([tag: [:tag, :tags], tags: [:tag, :tags]])

finds me again in a position with Ecto.Association.NotLoaded after the depth N + 1. How do I ensure all associations are loaded for this type of relationship?

1 Like

I don’t think that is possible. What would happen if you have a circular dependency?

A -> B -> C -> A

2 Likes

The problem with a structure like this (a tree-like structure encoded as a directed graph) is that expressing a precise query for this is not possible in standard SQL.

Two solutions available in standard SQL in this situation are:

  • nested, loopy queries until everything that needs to be loaded, is loaded - this is bad because we’ll do god knows how many queries;
  • load all the tags and resolve dependencies in memory - this is bad because we’re going to load a lot of unnecessary data.

That’s exactly why alternative encodings for tree-like structures in SQL exist - with Nested Set mode being probably the most popular one. Unfortunately, they also have some issues - e.g. nested set model is inefficient during writes since it requires updating a lot of records.

Postgres gives us one more solution to this problem - recursive queries as described here or here. This technique is used in the arbor package, but it has some architectural choices I don’t agree with (most notably, defining functions inside using).

4 Likes

Another option is a closure table, but it requires two tables and the use of insert/update/delete triggers. Here’s a link: https://gist.github.com/desfrenes/733a83ef82b03ee701caa761951767c9 (not my work)

I think it is a good compromise between the different methods.

1 Like