How to handle schemas polymorphism in Phoenix?

tmattio · March 23, 2018, 2:42pm

Hi,

I’ve already posted this question on Stack Overflow, but it didn’t receive much attention: https://stackoverflow.com/questions/49328342

Basically, I see a few ways to handle schemas polymorphism (w/ a belongs_to association), and I was wondering which one was recommended, and why.

Here is the question:

The recommended way to handle polymorphic associations in Phoenix seems to be adding an intermediate schema that contains the references to the other schemas:

So if I would like to create schemas with different kinds of animals, I would do:

defmodule Animal do
  use Ecto.Model

  schema "animals" do
    belongs_to(:dog, Dog)
    belongs_to(:cat, Cat)
    belongs_to(:owner, PetOwner)
  end
end

defmodule Dog do
  use Ecto.Model

  schema "dogs" do
  end
end

defmodule Cat do
  use Ecto.Model

  schema "cats" do
  end
end

defmodule PetOwner do
  use Ecto.Model

  schema "pet_owners" do
    has_one(:pet, Animal)
  end
end

But I could also have the PetOwner schemas containing a binary field and the type:

defmodule Dog do
  use Ecto.Model

  schema "dogs" do
  end
end

defmodule Cat do
  use Ecto.Model

  schema "cats" do
  end
end

defmodule PetOwner do
  use Ecto.Model

  schema "pet_owners" do
    field(:pet, :binary)
    field(:pet_type, :integer)
  end
end

Or even just having a nullable reference to all the animals in the owner schema:

defmodule Dog do
  use Ecto.Model

  schema "dogs" do
    belongs_to(:owner, PetOwner)
  end
end

defmodule Cat do
  use Ecto.Model

  schema "cats" do
    belongs_to(:owner, PetOwner)
  end
end

defmodule PetOwner do
  use Ecto.Model

  schema "pet_owners" do
    has_one(:cat, Cat)
    has_one(:dog, Dog)
  end
end

The first method seems to add complexity to the schemas. What are the pros and cons of the different methods?

EDIT: Let’s assume that a pet owner can own only one pet, if the schema allows multiple pet, the verification is done in the changeset.

kokolegorille · March 23, 2018, 2:56pm

Are You sure You want to use polymorphism? It looks like Single Table Inheritance,

A PetOwner can have multiples animals, animals belongs to PetOwner. Animals can be cats, dogs, whatever.

Animals would have… id, pet_owner_id, type etc.

dogs is a subset of animals, as cats…

Also, it seems to be Phoenix 1.2, new version use Schema instead of Model?

There is a post about STI it here.

OvermindDL1 · March 23, 2018, 2:57pm

Also, anything related to database work is not a phoenix thing, it’s a database thing, so you should actually be asking How would I model <this> in PostgreSQL?, as it is entirely unrelated to Phoenix anyway.

tmattio · March 23, 2018, 3:10pm

Hi @kokolegorille,

This is exactly the same problem indeed! Thank you for referencing it.

The reason I didn’t consider this solution is because in my specific case, I have dozens of subtypes, and I figured that I would prefer adding some logic than having a table with hundreds of fields, almost all of them being empty.

So that’s where this question comes in: which specific implementation would be recommended to have separate tables for each subtypes?

tmattio · March 23, 2018, 3:14pm

Hi @OvermindDL1

You’re entirely right, it’s more a question of database design than a Phoenix problem.

Admittedly, I am new to Elixir and Phoenix and I thought asking this question in the context of Phoenix and using Elixir code would help be to build the bridge between the “theoretical” design of the database and how to implement it with Ecto.

kokolegorille · March 23, 2018, 3:16pm

I would have a look at ecto abstract table.

Here is a post of @LostKobrakai

https://medium.com/@lostkobrakai/phoenix-contexts-and-ecto-abstract-tables-e6b4de93edf

Here is docs about it

https://hexdocs.pm/ecto/2.2.6/Ecto.Schema.html#belongs_to/3-polymorphic-associations

From docs

defmodule Comment do
  use Ecto.Schema

  schema "abstract table: comments" do
    # This will be used by associations on each "concrete" table
    field :assoc_id, :integer
  end
end

Alternatively, because Ecto does not tie a schema to a given table, we can use separate tables for each association.

OvermindDL1 · March 23, 2018, 3:23pm

Nah, Ecto tries to be as near to SQL as possible while doing some type checking and validation and such, so seeing how to do it in SQL is precisely what you want to do.

tmattio · March 23, 2018, 3:27pm

@kokolegorille Thanks again for the references!

The Ecto’s documentation link has been really helpful to understand what I was trying to do (I actually linked it in my question), and if I have to rephrase my question, it could be:

How to do polymorphic association as described in Ecto’s documentation, but with a reversed association?

In the documentation, the comment table references polymorphic table (so a polymorphic table has many comment), what I want to do is exactly the opposite: a concrete table has many polymorphic table.

Do you think Ecto’s abstract table concept applies in this case?

kokolegorille · March 23, 2018, 3:39pm

I would mix both.

animals
:id
:pet_owner_id
:type

type => :gorilla, :cats, :dogs

animal_characteristics => abstract

gorilla_characteristics
dog_characteristics
cat_characteristcs

pet_owner has_many animals
gorilla is an animal
gorilla has_one gorilla_characteristics
etc.

That way, each animal characteristics has its own table. And animals is just a STI link between pet_owner and animal characteristics.

Or even simpler… use a jsonb field for storing animal characteristics.

tmattio · March 23, 2018, 4:09pm

That’s great, thank you!

I am trying to implement what you describe with Ecto, but I have a hard time transferring the associations to actual code: how would you implement gorilla is an animal?

Also, using an id and a type does not seem to be recommended. From Ecto’s documentation:

The problem with this approach is that it breaks references in the database. You can’t use foreign keys and it is very inefficient, both in terms of query time and storage.

Using JSON would be great, I actually started the project using MongoDB. Thing is, I am going with Elm and GraphQL for my stack and I would loose a lot of benefits by not having typed schemas

kokolegorille · March 23, 2018, 4:12pm

It is not a polymorphic association, it is a sti table. There is no link done like polymorphic Rails association on it. The link is on pet_owner_id, and the other is done with animal_id on each characteristics table.

OvermindDL1 · March 23, 2018, 4:14pm

Uh, does that even make sense in a SQL world? o.O
That sounds very OOP’y. You should instead favor a composition style as that is how SQL works.

tmattio · March 23, 2018, 4:15pm

I understand!

I feel like what you’re suggesting is really close to the first example in the initial question:

defmodule Animal do
  use Ecto.Schema

  schema "animals" do
    belongs_to(:dog, Dog)
    belongs_to(:cat, Cat)
    belongs_to(:owner, PetOwner)
  end
end

defmodule Dog do
  use Ecto.Schema

  schema "dogs" do
  end
end

defmodule Cat do
  use Ecto.Schema

  schema "cats" do
  end
end

defmodule PetOwner do
  use Ecto.Schema

  schema "pet_owners" do
    has_one(:pet, Animal)
  end
end

What would be the difference between this and your approach?

kokolegorille · March 23, 2018, 4:22pm

In Your example, animals belongs to many things. So it would need to have dog_id, cat_id etc.

But in my example, each concrete characteristics table is holding an animal_id

tmattio · March 23, 2018, 4:22pm

Might be why I couldn’t figure out how to implement it haha

Given the last few posts, I think the recommended way to do it in SQL would be:

create table animal(
  id serial primary key,
  dog_id integer references dog,
  cat_id integer references cat,
  check(
    (
      (dog_id is not null)::integer +
      (cat_id is not null)::integer +
    ) = 1
  )
);

create unique index on animal (dog_id) where dog_id is not null;
create unique index on animal (cat_id) where cat_id is not null;

See this post for reference: https://hashrocket.com/blog/posts/modeling-polymorphic-associations-in-a-relational-database

Does it make sense to you, or would you recommend another approach?

OvermindDL1 · March 23, 2018, 4:28pm

That is composition so that entirely makes sense.

Just have foreign references on the extra data tables that point back to the main ‘animal’ table and it is easy to query/left_join across them all. I’d personally not put the foreign data tables on the animal table as that is less extensible unless the extra data table has a lot of duplication, in which case then yes it can be worth it (or a middle join table).

tmattio · March 23, 2018, 4:55pm

Thank you for your input!

I’ve just read about this approach. Looks like it’s called Exclusive Arc, and it does not seem to be recommended (Database development mistakes made by application developers - Stack Overflow):

An exclusive arc is a common mistake where a table is created with two or more foreign keys where one and only one of them can be non-null. Big mistake. For one thing it becomes that much harder to maintain data integrity. After all, even with referential integrity, nothing is preventing two or more of these foreign keys from being set (complex check constraints notwithstanding).

Any opinion on this? I feel like the “rightness” of this approach depends on implementation detail, and I am not sure the implementation I gave as an example could be considered “right”.

tmattio · March 23, 2018, 5:42pm

Just to reframe the discussion a little bit:

I am not looking for one way to implement inverse polymorphic associations, I already provided some implementations in the initial question. Instead, I’d like to get a sense of the pros and cons of each implementation, and eventually have a proposal for a recommended implementation with a list of reasons why it is recommended.

I feel like it’s a question that comes back frequently in Ecto’s community:

And so far, I haven’t been able to find a good answer to this question. I hope we can bring a definitive answer with this post

EDIT: @kokolegorille suggestion to use an abstract table seemed like a good one. What would an actual implementation look like and what would be the advantages compared to other approaches?

LostKobrakai · March 23, 2018, 5:59pm

Abstract tables are nice if you want to use the same schema but have it belong to different parents with an own table for each parent. There’s not really much to it in terms of implementation. Put “abstract table: element” in your schema’s source string and use the has_one :no_longer_abstract_element, {"real_table", Element}, foreign_key: :assoc_id notation to link it to the parent. So Element is the abstract schema, which is saved for this parent in the table real_table. A different parent would have a different table.

I feel like those questions often pop up because people expect ecto to solve the involved complexity for them (happened to me as well when starting with ecto), which it just doesn’t do.

tmattio · March 23, 2018, 6:14pm

Hi @LostKobrakai,

Thank you very much for your explanation!

It’s indeed very nice and simple. The example in the documentation is straight forward.
I’m trying to do the opposite of what is described in the documentation though, that is, have the same “parent” belonging to different polymorphic tables - maybe we can call it has_many polymorphic association, as opposed to “belong_to polymorphic association”?

I am really open to a complex solution! I am just trying to:
1 - Have a description or example of this solution
2 - Know why this is a good solution

Hope it helps to understand my question better