Preloading deeply nested datasets for object update

I have a data model that is deeply nested (3–5 levels deep).

Using Ecto, I have the schemas set up so that the associations are established among the relevant entities.

I have a query where I want to update this object. Normally I’d load the object, run it through a changeset, and then send the update. In this case, because the parent is so deeply nested, I’d prefer the parent query to stay ignorant of the nested tables within the children.

Is there a way to tell preloaded objects to preload all of their fields? The only examples I can find are where the parent explicitly declares all child preload items - instead, I’d prefer the children to know how to load themselves, and report that back to the parent. Is this a possibility?

Sample:


program_changeset =
  Program
  |> Repo.get(program_id)
  |> preload_all_program()
  |> IO.inspect(label: "Program Fetched")
  |> Program.changeset(attrs)

defp preload_all_program(p) do
  keys = [:events, :fields, :phases, :probes, :prompts, :target_groups]
  Enum.reduce(keys, p, fn k, acc -> Repo.preload(acc, k) end)
end

I believe the above is a shallow load for each of the items. Is there an idiomatic way to load all of this deeply?
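For what it's worth, if the shape of the nesting is known up front, Repo.preload accepts a nested keyword list that loads to any depth in one call. A sketch (the association names below are hypothetical; substitute your real ones):

```elixir
# Bare atoms preload one level; keyword pairs preload deeper.
program =
  Program
  |> Repo.get(program_id)
  |> Repo.preload([
    :prompts,
    :events,
    phases: [:probes],
    target_groups: [:fields]
  ])
```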

defmodule MyProject.Repo do
  use Ecto.Repo, otp_app: :my_project, adapter: Ecto.Adapters.Postgres

  # Preloads every association on the struct that is still
  # %Ecto.Association.NotLoaded{} -- note this only goes one level deep.
  def preload_all(%_{} = structure) do
    to_preload = for {key, %Ecto.Association.NotLoaded{}} <- structure, do: key
    preload(structure, to_preload)
  end

end
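Usage might look like this, assuming the function above lives in your app's Repo module:

```elixir
# Preloads all direct associations of the fetched Program.
# Nested associations inside the children remain NotLoaded.
program =
  Program
  |> MyProject.Repo.get(program_id)
  |> MyProject.Repo.preload_all()
```

Keep in mind this only resolves the first level; the children's own associations would need the same treatment.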

You can use the preload query expression to build a preloaded query that you then run with Repo.one.

nested_assocs = # your nested structure
Repo.one(from p in Program, where: p.id == ^program_id, preload: ^nested_assocs)

The nested structure is a keyword list where you can specify all the levels of nesting you require, e.g. [events: [child_1: [:sub_child2]], phases: [:other_nesting]] if your event has a nested child_1, which itself has a nested sub_child2, and so on.

From the documentation:

# Returns all posts, their associated comments, and the associated
# likes for those comments.
from(p in Post,
  preload: [comments: :likes],
  select: p
)

One caveat: if you have both belongs_to and has_many between the same schemas in a single query, you won’t be able to load every level automatically, since that is clearly a loop.

For example with

schema "posts" do
  belongs_to :user, User
end

# And
schema "users" do
  has_many :posts, Post
end

You won’t be able to load the complete depth automatically, since users will load posts, each post will load its user, and so on and so on.

That’s why you need to define fields in the code, and that’s why there is no way to automatically preload everything no matter how deep it is.
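That said, if you do want schema-driven deep preloading, one option is to walk Ecto's schema reflection recursively with a depth cap, so cycles like the belongs_to/has_many pair above terminate. This is only a sketch (module name, depth, and handling are assumptions, and :through associations would need extra care):

```elixir
defmodule MyProject.DeepPreload do
  # Recursively builds a nested preload spec from schema reflection.
  # The max_depth cap guarantees termination even when associations
  # form a cycle (e.g. posts <-> users).
  def preload_spec(_schema, 0), do: []

  def preload_spec(schema, max_depth) do
    for assoc <- schema.__schema__(:associations) do
      # Note: has_through associations expose :through, not :related,
      # and would need special handling here.
      related = schema.__schema__(:association, assoc).related
      {assoc, preload_spec(related, max_depth - 1)}
    end
  end
end

# Usage (a depth of 3 is arbitrary):
# Repo.preload(program, MyProject.DeepPreload.preload_spec(Program, 3))
```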

The real question is: why do you need to preload everything? From your explanation, it is not clear why you need everything preloaded just to perform an update.


That sounds like the XY problem, can you tell us what is your desired result? What you describe might not be the way to achieve it.


Nitpick: Repo.preload accepts a list, which does the same thing as this Enum.reduce (plus concurrent loading in some cases).
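For the snippet earlier in the thread, that collapses the reduce into a single call:

```elixir
# One Repo.preload call instead of six; Ecto can run these
# association queries concurrently in some cases.
keys = [:events, :fields, :phases, :probes, :prompts, :target_groups]
program = Repo.preload(program, keys)
```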


Back to your question: where are the deeply-nested attrs to go into this mega-Changeset coming from?

I think building up a nested key structure like this is probably the route I should take.

Context for my use case is that I am implementing a graphql API. There is a graphql mutation which is passing deeply nested form state. I need to load the pieces of the mutation that are in the update, so that I can run the object through a changeset and save.

People do that with Enum.reduce just fine, yeah, so feel free to go for it. Though I should point out that this needs to include the child associations as well.

I’m trying to implement like this:

def build_keys() do
  [
    :prompts,
    :events,
    target_groups: TargetGroup.build_keys(),
    phases: Phase.build_keys(),
    probes: Probe.build_keys()
  ]
end

Each of the build_keys functions returns a single key like so: :key

I’m getting an error:

`preloaded_keys` is not a valid preload expression. 
preload expects an atom, a list of atoms or a keyword list with more preloads as values.

What am I doing wrong? How do I differentiate a level which is one deep versus many deep?
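On the "one deep versus many deep" part: in a preload spec, a one-level association is a bare atom, deeper nesting is a keyword pair, and Elixir requires the keyword pairs to come last in the list. The association names here are hypothetical:

```elixir
# Mixed preload spec: bare atoms first, keyword pairs last.
[
  :prompts,                            # one level deep
  :events,                             # one level deep
  phases: [:probes],                   # phases, then their probes
  target_groups: [fields: [:options]]  # three levels deep
]
```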

Is that the full error message?

If it also says Use ^ on the outermost preload to interpolate a value then you should interpolate the preloaded_keys variable, like Repo.one(..., preload: ^preloaded_keys)

Please share the code and full error message, that will help to debug.

Otherwise try the other suggestions with reduce, but I’m in favour of the preload query expression, because Ecto should then only use the connection pool once instead of getting a new connection for every Repo.preload call.

The ^ fixed it - thanks!
