Feedback on designing a polymorphic association in Ecto

linusdm · November 20, 2023, 7:50am

I’d like to get some feedback on a design decision I made last week, that doesn’t feel great. It involves relationships between some Ecto schemas. These are the relevant parts the schemas:

defmodule Uitlening do
  use Ecto.Schema
  schema "uitleningen" do
    belongs_to :lezer, Lezer
    belongs_to :organisatie, Organisatie
  end

  def new(%Lezer{} = lezer), do: %__MODULE__{lezer: lezer}
  def new(%Organisatie{} = organisatie), do: %__MODULE__{organisatie: organisatie}

  def voor_lezer?(%__MODULE__{lezer_id: lezer_id}), do: lezer_id != nil
  def voor_organisatie?(%__MODULE__{} = uitlening), do: not voor_lezer?(uitlening)

  # get the "abstract" type for usage by the protocol (see later)
  def get_uitlener(%__MODULE__{lezer: %Lezer{} = lezer}), do: lezer
  def get_uitlener(%__MODULE__{organisatie: %Organisatie{} = organisatie}), do: organisatie
end

defmodule Lezer do
  use Ecto.Schema
  schema "lezers" do
    field :naam, :string
    field :voornaam, :string
    # some other fields here, but no reference to Uitlening required (yet)
  end
end

defmodule Organisatie do
  use Ecto.Schema
  schema "organisaties" do
    field :naam, :string
    # some other fields here, but no reference to Uitlening required (yet)
  end
end

The migration file looks something like this:

defmodule MyApp.Repo.Migrations.Uitleningen do
  use Ecto.Migration

  def change do
    create table(:uitleningen) do
      add :lezer_id, references(:lezers), null: true
      add :organisatie_id, references(:organisaties), null: true

      timestamps()
    end

    # check one of two references is not null (but not both "not null" or null at the same time)
    create constraint(:uitleningen, "check_lezer_org_exclusive_not_null",
             check:
               "(organisatie_id IS NOT NULL AND lezer_id IS NULL) OR (organisatie_id IS NULL AND lezer_id IS NOT NULL)"
           )
  end
end

So, a lot of code to express this simple idea: the central entity “Uitlening” can have exactly one reference to either “Lezer” or “Organisatie”. It’s kind of a polymorphic relation (although not the same kind of polymorphism as expressed in the ecto documentation, I think). The check constraint in the database gives me confidence that I don’t mess up with the rule that only one reference should be made. In addition, the constructor functions on the Uitlening module make it easier (but not impossible) to create a valid struct.

To actually work with the reference, I’ve defined a protocol that captures an interface that both referenced entities can adhere to, like this:

defprotocol Uitlener do
  @spec name(t) :: String.t()
  def name(value)

  @spec is_important?(t) :: boolean()
  def is_important?(value)
end

# the implementations, embedded in the original modules
defmodule Lezer do
  # ...
  
  def is_important?(lezer) do
    # some implementation details here
  end

  defimpl Uitlener do
    def naam(lezer), do: "#{lezer.voornaam} #{String.upcase(lezer.naam)}"
    def is_important?(lezer), do: @for.is_important?(lezer)
  end
end

defmodule Organisatie do
  # ...
  defimpl Uitlener do
    def naam(organisatie), do: organisatie.naam
    def is_important?(_organisatie), do: false
  end
end

The protocol is used mostly in heex templates, or place where I have to get a representation of the abstract type. The problem with this setup is that in order to get to the abstract struct I have to to this involved succession of calls, every time again:

<%= Uitlening.get_uitlener(uitlening) |> Uitlener.naam() %>

It would be a lot nicer if I could reference the Uitlener as a field, like this:

<%=  Uitlener.naam(uitlening.uitlener) %>

but I don’t think Ecto can “coerce” one of two references into a single field (and I don’t want to resort to doing that manually every time a struct comes back from a repository). Or are there options to solve this?

Also, the use of @for in the protocol implementation feels a bit odd. It’s only by accident that I found out that this is a shortcut notation to referencing the full module name (I had never seen it before).

More in general I’m curious if this is a common requirement, and whether I’ve chosen a good path to solve this requirement? If there are better choices, I’d love to learn about them. It seems like a lot of code to solve this problem, which also makes me doubt a bit.

There are still places where I pattern match on the actual implementation structs or use the is_x?/1 functions, after getting the more abstract type first. Mostly in cases where the code paths to process both are very different (I’d have to extend the protocol to get all the required data for one implemenntation, and deny a lot of it in the other implementation). But I know that is not a flexible design, and will cause problems when I add a third reference to the mix (I’ll have to chase down all those places, and make sure to pattern match on that new struct too, and handle it).
Maybe a protocol isn’t the answer after all, to this design problem of having this abstract reference, which would also be valuable feedback

Thanks for reading this far, I appreciate any feedback.

LostKobrakai · November 20, 2023, 8:20am

To me this sounds like an opportunity to split read and write model (structs). You can return data, which is not a ecto struct to all the places that read the values. No need to let all the complexity of the db level storage leak all over the codebase. Deal with it close to reading the values from the db, tranfsform into a format more useful to all readers and return that. Readers don‘t need to care.

linusdm · November 21, 2023, 4:35pm

Had to let that in, but you’re probably right. I’m still used to passing every Ecto struct straight to the front-end. This might be an example of where that breaks apart. Thanks for that suggestion!

hlx · November 21, 2023, 7:34pm

This is off-topic but I have to ask because this tripped me up : why do you mix English with Dutch/Flemish? You have new/1 (EN), voor_lezer?/1 (NL) and get_uitlener/1 (EN+NL) which cover all possibilities when it comes to mixing the two languages.

As a native Dutch speaker and someone who works in a German code base I would suggest you write everything in English or stick to either language. Even asking questions and showing you code like this would make it easier for other people to understand if it was in English.

If you have a strong reason for writing it this way, please let me know because I’m really curious about the thought behind this.

I also learned that there is something called @for so thanks for that!

linusdm · November 21, 2023, 7:51pm

I just extracted the parts relevant to this question. I am aware that native English speakers wouldn’t understand the nouns, but I figured that wasn’t important. I admit I was a bit lazy not translating everything This is not an exact extract from our codebase. Although we do keep all domain specific jargon in Dutch to avoid needless translation.
Keep in mind that’s a team decision, and it depends on who will work on the code (we’re a two-man team). I’ve never felt that mixing languages is necessarily a bad thing.