Is it "better" to crash when missing preloads or to always ensure preloads assocs?

BartOtten · May 19, 2025, 10:34am

Did option one once, did bite me so since I use option 2.

This one in particular became nasty as the implicit preload caused unknown query load as new ‘this should be loaded by default’ kept being added while some functions never used those (go track those to add the explicit override…).

The problem occurred when a function to fetch ‘categories’ caused performance problems due to (unused by the page) default preloads that increased over time. However, this was not to be seen in the codebase directly. You see a ‘fetch_categories’ which does not trigger any alarm bell.

Since I just keep it to the basic and write additional functions when needed.

get_posts
`get_posts_with_authors’

Auto complete will suggest all variants during development which is very convenient and xref/lsp can track their usage.

They all accept preload opt to support the one-off cases. Although I suggest to refactor them to a simple named function call once development stabilized.

If there are a lot of nesting once could build an explicit ‘get_posts_with_nested_data’. It suffers the same issues as default preload but the name makes it clear in the code base.

slouchpie · May 20, 2025, 11:14am

I would like to make the point that it is not reasonable to expect a function to never crash.

Even Kernel.+/2 can crash, e.g. 1 + nil, for the reason of “bad input”.

It is OK for a function to crash when given bad input.

We can “spec” functions to describe the necessary input.

BartOtten · May 20, 2025, 3:44pm

The whole BEAM is designed with unexpected crashes in mind. Even when your code is perfect, one log entry too many can crash the system (and a zillion other things you never thought of or are outside your scope)

It is OK for a function to crash when given bad input.

But I rather see a match error due to properly placed is_integer guards than an arithmetic error.

codyjroberts · May 21, 2025, 1:13am

Isn’t this just an argument for preloading automatically? No option like that exists. I wonder if those of you that prefer Option 2 come from Rails, because that’s basically the default behavior of ActiveRecord. In my experience it was a nightmare trying to trace down all of those queries. At least preload is greppable!

I obviously can’t speak for those who designed the Ecto API, but in my opinion that’s the point of making preloading manual. It’s essentially removing that footgun and making you think about when and where you should be loading your data.

I believe you can design around this problem. At least in my experience. I’m sure there’s exceptions.

I’ll give an example that’s similar to an app I work on. Ignore the odd fictional table names. I calculate points using dozens of tables all linked back to a single record. The data model is rather complex and deeply nested because the domain is complex. I could probably denormalize some things but that’s another topic. I have a function that looks something like this in one of my context modules.

  @spec points_for_entry(User.t(), String.t()) :: points()
  def points_for_entry(%User{id: user_id}, entry_id) do
    entry =
      Repo.one!(
        from(
          e in Entry,
          where: e.id == ^entry_id,
          where: e.user_id = ^user_id,
          preload: [
            game_powerups: [powerups: [:tokens]],
            match: [:ruleset, :timespan, league: [:league_modifiers],
            units: [:group, :position, items: [:item_modifiers],
            champion_slots: ^ChampionSlots.with_champion_type_ordered()
          ]
        )
      )

    Points.calculate(entry)
  end

Points.calculate/1 expects an %Entry{} struct with all of those associations but doesn’t preload a single one. There’s dozens of functions in there with highly specific calculation functions that match on those associations in the function header. The Points module, like my client (liveview), doesn’t know anything about Repo. I get some benefits from this:

I can freely test Points.calculate/1 without ever touching the database. The module is strictly pure functions. It’s predictable.
My client never needs to know what to preload to get points. I call points_for_entry/2 in a few areas.
All my database side effects stay in the context modules for easy tracing.
I have to validate the resource owner anyway.
I can easily use Points.calculate/1 for a stream of entries.

I think in many cases you can bubble up your queries to a single caller like this. The clean separation of concerns between your data layer and your business logic has real benefits.

mindreader · May 21, 2025, 2:38am

I do agree with a lot of that. I do also like clean functions that don’t read data if possible.

That said I ran into many difficult situations where maybe code would need to check whether a feature was enabled, but only sometimes. Or filter out things the user had no permission to see, but only most of the time.

Sometimes we would end up with pre-loads six levels deep and it got confusing at that top level to know that five functions away, a feature check would sometimes happen that necessitated some data and if it wasn’t there, you’d just crash. There were a lot of times where I suddenly realized half the preloads weren’t needed due anymore to a removed function call in another file.

While it was annoying to sometimes have an innocuous code change cause an n+1 it was rarely a problem severe enough to put off lunch for.

That said I never found a perfect solution I was completely happy with. I have had a lot of thoughts about it over the years and tried some things but nothing felt right.

Sorc96 · May 21, 2025, 7:44am

I think you’re actually arguing for option 2. Option 1 would be having the LiveView preload everything through the user (probably using a separate context function) and then pass the result to points_of_entry.

Coming from Rails, I have enough experience with preloading everything at the top level to prevent n+1 queries to know how difficult it is to track which preloads are needed where. There’s no such worry with Ecto, since will never lazy load anything.

Maybe the “correct” solution to this problem is actually mapping the data into a custom structure. Even if we perfectly preload everything, all the code that works with this deeply nested structure has detailed knowledge of the database structure. That’s guaranteed to cause problems sooner or later.

codyjroberts · May 22, 2025, 1:03am

Hmm, I see where you’re going with this but that’s not what I’m trying to say. It’s a difficult topic to discuss without concrete examples. Design is so particular to the full context of the problem. If I can find some time I’ll try to put something together that’s a bit more clear if I’m not making much sense here.

I’m definitely not arguing for designing public functions in context modules that expect preloaded records from the view layer. That’s true. If I were to rephrase the OP’s question it would be: “Do you preload defensively?” and I’m not arguing for that either.

I interpreted @mindreader’s comments as suggesting that a fetch once approach is onerous for deep schemas being used in business logic related computations and that any performance hit or issues with tracing the query are worth the guarantees that the data will be loaded. That’s what I would consider preloading defensively.

The main point I’m attempting to make is that where you preload matters and there are benefits beyond performance for avoiding a defensive preload. For me, this topic is mostly about boundaries and separating the pure from the impure.

So to go back to OP’s example. Would I argue for Option 2? Yes for impure public functions in your data layer, though not necessarily as written. You have to preload somewhere after all. I would argue for Option 1 elsewhere but refactor to utilize pattern matching, typespecs, and well defined apis to provide better guarantees around what data those functions expect. Boundary could help here as well.

An association with ecto is essentially a result/option type. But it’s easy to not think of it that way so it can be a footgun. It’s easy to just call chat.messages without thinking about handling the empty case as evidenced by those who claim to have been bitten by Option 1. So preload seems like the natural option.

I think types will ultimately help solve the problem of accidentally trying to use an unloaded assoc but not necessarily solve the impurity problem. Until then I believe we still have some tools to be defensive both at compile time and runtime to avoid crashing in our business logic layers without sacrificing purity.

I’m not sure I agree with that, though maybe just on the semantics of lazy loading. If by lazy loading you only mean through the map.key access syntax, sure. But you’re effectively lazy loading if you call preload in the body of an iterator. That is something you are more likely to do when preloading defensively. The only difference between Ecto and AR here is that with Ecto it’s explicit which does encourage you do the right thing. To take OP’s example further to illustrate:

defmodule Chats do
  def list_chats(%Channel(id: channel_id}) do
    Repo.all(from c in Chat, where: c.channel_id == channel_id)
  end

  def is_safe?(chat) do
    chat = Repo.preload(chat, :messages)
    Enum.all?(chat.messages, &safe_message?/1)
  end

  defp safe_message?(message) do
    %{metadata: metadata, content: content} = Repo.preload(:metadata)
    metadata.flagged == false && profanity_free?(content)
  end

  defp profanity_free?(content), do: String.includes?(content, "badword")
end

for %Chat{} = chat <- list_chats(channel), do: is_safe?(chat)

Rewritten without preload where I believe dialyzer should show a type check failure

defmodule Chats do
  def list_chats(%Channel{id: channel_id}) do
    Repo.all(from c in Chat, where: c.channel_id == ^channel_id)
  end
end

defmodule Chats.Validation do
  def is_safe?(%Chat{messages: [%Message{} | _] = messages}) do
    Enum.all?(chat.messages, &safe_message?/1)
  end

  defp safe_message?(%Message{content: content, metadata: %Metadata{flagged: flagged}}) do
    flagged == false && profanity_free?(content)
  end

  defp profanity_free?(content), do: String.includes?(content, "badword")
end

# I believe Dialyzer will fail here if your schema's are typed
for %Chat{} = chat <- list_chats(channel), do: Chats.Validation.is_safe?(chat)

Now it should be easier to test and reason about your validation bits without fear of a crash from an unloaded assoc.

In essence this is kind of what I’m on about, but I don’t think there needs to be at translation to a custom structure. The important part is choosing where you validate your input. Do you validate it at the edge or deep into your business logic? One can face similar difficulties with deeply nested result/option types. Only with preloading it’s worse given the side effects.

Boundaries might be relevant here but it’s been a very long time since I’ve watched it.

slouchpie · May 23, 2025, 11:31am

I should have asked “is it better to keep data-level operations and domain-level logic separate or to never crash?”.

I also think I should have provided a more complicated example in OP. I was trying to give an example of a function that captures domain logic, a function whose implementation a developer would read when they want to talk to a business person asking “how do we know when an entity is (x)?”.

Speaking only for myself and my own taste, after reading all of the above I am a little more convinced that it is better to not couple database operations to domain logic. It mixes application layer concerns and can lead to performance problems.

codyjroberts · May 24, 2025, 6:20pm

Thanks for kicking off the discussion. It’s an interesting one.

For what it’s worth I’m now much more in doubt of dialyzer’s capability of detecting whether a record has preloaded values. The first example I gave it was definitely working but on the latter less trivial one I couldn’t get it to detect a type issue. The latter I tried in a fresh project, the former I added to an existing one to quickly test it. I’m not sure if it’s environment related but I believe I set them up the same.

If that’s the case other strategies like testing and branching on the missing collection will be needed to avoid a crash. I like al2o3cr’s strategy of alerting in prod if you have zero tolerance for a mistake for the product you’re working with, but would make it a last resort based on a risk assessment.

cjbottaro · May 28, 2025, 10:03pm

I was suggesting that you never call do_something(foo, preload: true) outside of a REPL, i.e. it’s just a convenience for when you’re messing around in a REPL.

We try to separate data loading from business logic. That latter just assumes all your data has been properly loaded. It makes testing way easier too.

BartOtten · May 29, 2025, 9:53am

I suggest adding helpers via
IEx — IEx v1.12.3 as the ‘convenience’ is maybe to convenient for your future workers to not abuse.

So then you van have a ‘preload kwlist’ argument in your code and a default kwlist in your .iex.exs

REPL: fetch_posts(preload: fp_preloads)