Hook function in Ecto.Schema to set virtual fields when querying a Schema struct from the data store

In this talk on Ecto, the speaker recommends calculating persisted fields from virtual Ecto.Schema fields that come in from user input via a changeset.

In his example there is an Ecto.Schema like this:

defmodule MusicDb.Track do
  use Ecto.Schema
  ...

  schema "tracks" do
    field :title, :string
    field :duration, :integer
    field :duration_string, :string, virtual: true

    timestamps()
  end
end

The duration field persistently holds the duration in seconds in the database. The duration_string field is used to hand in the duration from user input as a string. For example, duration_string comes in as "3:25" in the changeset, which is then piped through a function that sets the duration field to 205.
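The parsing function from the talk isn't shown here, but I imagine it looks something like this sketch (put_duration and the error message are my own names, assuming import Ecto.Changeset in the schema module):

def changeset(track, attrs) do
  track
  |> cast(attrs, [:title, :duration_string])
  |> put_duration()
end

# Parse the virtual "m:ss" string and set the persisted :duration field.
defp put_duration(changeset) do
  with string when is_binary(string) <- get_change(changeset, :duration_string),
       [m, s] <- String.split(string, ":"),
       {minutes, ""} <- Integer.parse(m),
       {seconds, ""} <- Integer.parse(s) do
    put_change(changeset, :duration, minutes * 60 + seconds)
  else
    nil -> changeset
    _ -> add_error(changeset, :duration_string, "must be in m:ss format")
  end
end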

So far so good. But how do I go the other way? When I query a Track, my duration_string field is nil. Is there any way to install a hook function in the schema that the Track struct is piped through whenever I query a Track, so that I can calculate and set duration_string?

2 Likes

There is a list of possible solutions:

  • persist the field
  • create a database view with the value computed
  • create a custom Ecto type (sketched below)
  • create a function to transform one into the other
  • create a comparison function in the DB
  • create a custom type in the DB

These are a few of the possibilities. Which one is best for you depends on what you need.
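For example, the custom Ecto type option could look like this sketch (MusicDb.Duration is a hypothetical name; the database keeps storing plain integer seconds, while cast accepts the user-facing string):

defmodule MusicDb.Duration do
  use Ecto.Type

  # Stored in the database as integer seconds.
  def type, do: :integer

  # Cast user input: accept an "m:ss" string or raw seconds.
  def cast(string) when is_binary(string) do
    with [m, s] <- String.split(string, ":"),
         {minutes, ""} <- Integer.parse(m),
         {seconds, ""} <- Integer.parse(s) do
      {:ok, minutes * 60 + seconds}
    else
      _ -> :error
    end
  end

  def cast(seconds) when is_integer(seconds), do: {:ok, seconds}
  def cast(_), do: :error

  # Loading from and dumping to the database are identity passes.
  def load(seconds) when is_integer(seconds), do: {:ok, seconds}
  def dump(seconds) when is_integer(seconds), do: {:ok, seconds}
  def dump(_), do: :error
end

With that, the schema could declare field :duration, MusicDb.Duration and drop the virtual field entirely, since the cast itself understands the string form.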

4 Likes

I had the same need and created a small utility function that I call after each call to a Repo function, in the context.

some_query
|> Repo.one!()
|> fill_virtual_fields()

It’s not a hook as you asked for, though. I have to write it after each Repo call for entities that have virtual fields.
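For the Track example, such a function could be a pure function on the schema module, something like this sketch (the formatting details are illustrative):

# In MusicDb.Track: derive the virtual "m:ss" string from the seconds.
def fill_virtual_fields(%__MODULE__{duration: seconds} = track) when is_integer(seconds) do
  minutes = div(seconds, 60)
  rest = seconds |> rem(60) |> Integer.to_string() |> String.pad_leading(2, "0")
  %{track | duration_string: "#{minutes}:#{rest}"}
end

def fill_virtual_fields(track), do: track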

I’d be interested to know which of the possible solutions you personally consider best for your example. Let me know! :)

2 Likes

I found this thread interesting on the matter:

An after_load callback hook was at some point implemented.
I don't understand, though, why this issue was not re-opened.

The solution I suggested fills all the virtual fields recursively, after fetching an entity and its preloads, through a fill_virtual_fields function. The limitation is that you should not preload more entities afterwards; or rather, the caller must not forget to call the function again after preloading more entities. In my design the caller should never need to fetch preloads after the initial fetch, but that's not something I can prevent if it does happen.

Because this community does not like implicit / magic behaviour. We like things explicit. :slight_smile:

Plus, this seems like a perfect candidate for a Phoenix context: make a function like Tracks.load which takes care of the duration shenanigans and you’re in a good place.
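A sketch of what that could look like, reusing the fill_virtual_fields/1 helper from earlier in the thread (all names here are illustrative):

defmodule MusicDb.Tracks do
  alias MusicDb.{Repo, Track}

  # The one sanctioned way to load a track; callers never touch Repo.
  def load(id) do
    Track
    |> Repo.get!(id)
    |> Track.fill_virtual_fields()
  end
end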

4 Likes

There’s even the option of using preload functions:
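Roughly: a preload function receives the parent ids and returns the associated structs, so you can fill virtual fields on the way in. A sketch, assuming an Album schema with has_many :tracks and the fill_virtual_fields/1 helper from above:

import Ecto.Query

# Ecto calls this with the ids of the parent albums and maps the
# returned tracks back onto them by the foreign key.
track_preloader = fn album_ids ->
  MusicDb.Track
  |> where([t], t.album_id in ^album_ids)
  |> MusicDb.Repo.all()
  |> Enum.map(&MusicDb.Track.fill_virtual_fields/1)
end

albums = MusicDb.Repo.all(from a in MusicDb.Album, preload: [tracks: ^track_preloader])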

3 Likes

Fair enough.

I defined a function in the schema that adds the virtual fields, and now I am piping/mapping all the instances of the schema struct through it when one of the Phoenix contexts queries/preloads them.

Thanks to all

1 Like

Not sure we’re talking the same thing.

Usually people indeed do Repo.get(MySchema, 123) |> MyBusinessLogic.load_more_stuff() everywhere they need to load their schema.

What I’m saying is: do the above in one function and only use that function when you need to load an instance of your schema. Don’t use Repo directly.

That’s basically one of the intended use-cases of Phoenix contexts.

2 Likes

Exactly what I’m doing. I was just mentioning that you can't prevent a developer from preloading more data after the context's function call. So it's just something to be aware of, even if it's obvious to you that there should be no calls to Repo after a call to the context function.

However, with a hook that Ecto would somehow provide, you could make sure that any fetched entity has its derived fields filled. So it would provide that guarantee.

You can’t, but is your team so huge that this policy can’t be enforced by code reviews?

I still disagree on hooks. It’s the lesser evil to mandate only using contexts (and not Repo functions) compared to other team members getting confused about how exactly the return value of a plain Repo.get ends up with a lot of extra stuff preloaded inside it.

Implicit behaviour is the cheap item that ends up costing you more than the expensive one in the long term, because you have to repair and take care of the cheap item many times.

2 Likes

True, it's probably better to notice when someone makes the mistake of not following the agreed conventions and design.

Then there are some other issues:

  • Your context function, say get_user, receives what to preload as an option (I know more than a few devs here who design their context functions that way). A caller wants to preload the articles of the user and their categories. Say you have a virtual field that you wish to fill for categories: it's not so trivial to fill, as it is nested (and the context function doesn't even know whether it will be asked to preload it or not). That's why I developed the small library mentioned above, which fills the virtual fields recursively by calling pure schema functions that implement a behaviour (see the sketch after this list).

  • Do the context functions really need to explicitly fill virtual fields, and to know which schema has which virtual fields? I understand that we favor explicit over implicit, but the virtual fields can be abstracted away at the schema level, so that to the context function they really are just regular fields. (At least that will do for fields that can be derived in a simple, pure way, as mentioned in the original post.)
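A rough sketch of the recursive idea (simplified; not the library's exact code): schemas that opt in implement a behaviour with a pure fill_virtual_fields/1 function, and a walker recurses through whatever associations happen to be loaded:

defmodule MusicDb.VirtualFields do
  # Schemas with derivable virtual fields implement this as a pure function.
  @callback fill_virtual_fields(struct()) :: struct()

  # Fill the struct's own virtual fields, then recurse into every
  # association that has actually been loaded.
  def fill(%{__struct__: schema} = struct) do
    struct =
      if function_exported?(schema, :fill_virtual_fields, 1),
        do: schema.fill_virtual_fields(struct),
        else: struct

    Enum.reduce(schema.__schema__(:associations), struct, fn assoc, acc ->
      case Map.get(acc, assoc) do
        %Ecto.Association.NotLoaded{} -> acc
        nil -> acc
        items when is_list(items) -> Map.put(acc, assoc, Enum.map(items, &fill/1))
        %{} = nested -> Map.put(acc, assoc, fill(nested))
      end
    end)
  end

  def fill(other), do: other
end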

I’d consider looking into Boundary (boundary v0.10.1). It can allow you to enforce programmatically where you allow things like Repo calls.
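Very roughly, the idea (check Boundary's docs for real configuration; module names here are illustrative):

defmodule MusicDb do
  # The MusicDb boundary only exports its context modules, so code in
  # other boundaries (e.g. the web layer) cannot call MusicDb.Repo.
  use Boundary, deps: [], exports: [Tracks]
end

defmodule MusicDbWeb do
  use Boundary, deps: [MusicDb], exports: []
end

Violations are then reported at compile time.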

This general problem is part of why I'd like to take some time to make Dataloader more ergonomic outside of Absinthe. Over-fetching / under-fetching is really quite a hard problem to tackle.

3 Likes