How do you model different ‘types’ with totally different fields in Ecto?

I’m trying to model something in Ecto where records share a few common fields, but depending on a type, they have completely different additional fields.

Roughly:

  • there’s an items table with the common stuff (type, same_field, etc.)
  • and then separate tables for each type (items_a, items_b)
  • these subtype tables use the same id as the parent (so 1:1, PK = FK)

So everything is still one logical collection, just split across tables to avoid a lot of nullable columns and to keep proper DB constraints.

In Ecto I’m using has_one + cast_assoc, and dispatching based on the type.


I’m wondering:

  • does this approach fit well with Ecto, or is it something people generally avoid?
  • is there a more idiomatic way to solve this without collapsing everything into one table or using JSON?
  • are there any gotchas with this setup (especially around inserts or preloading)?

Example changeset:

def changeset(item, attrs) do
item
|> cast(attrs, [:type, :category, :owner_id])
|> validate_required([:type, :category, :owner_id])
|> cast_subtype()
end

The whole idea is to keep validation at the database level as much as possible, not just in changesets.

1 Like

I’m slightly confused by your wording… items_a and items_b, the plural makes me think it’s a has_many, but you then you said hash_one so I’m assuming that. SO, I actually have a mostly working on a library to handle this more gracefully as we had a few tables like that at my old work. But since I stopped working there I lost motivation. I could post what I have, though. The API is basically:

schema "items" do
  field :foo, :string

  one :item do
    belongs_to :item_one, Item_1
    belongs_to :item_two, Item_2
    belongs_to :item_three, Item_3
  end

  def changeset(item, attrs) do
    item
    |> cast(attrs, [:foo])
    |> cast_one_assoc(:item)
  end
end

It basically adds sister versions of a few functions with _one added (kinda like _embeds versions), so cast_one, put_one_)assoc, and get_one_assoc (for each you can refer to them by the first argument to the one macro). There is also MyRepo.preload_one (which is a stupid name and I think that function is unfinished for nested preloads). I can post what I have if you’re interested.

Otherwise, as far as preloading goes, you can do this and it’s efficient:

Repo.preload(items, [:item_a, :item_b, :item_c)

When preloading it will simply ignore the associations whose ids are nil. I would definitely hide this in a function, though, as they get unweildly if you are preloading a lot.

2 Likes

Yeah, good point about the naming — that’s on me. it’s has_one in practice.

Your one :item abstraction is actually very close to what I’m doing manually.

I’d definitely be interested in seeing it, especially how you handled:

  • dispatching based on type
  • and preloading (that part gets messy fast)

Also curious: did you end up using this pattern long-term, or did you move away from it?


I’m definitely interested, mostly from a learning perspective.

This approach feels “right” to me so far — especially compared to pushing everything into a JSON field. Even if it means a few extra tables and joins, I like that it keeps things well-structured and avoids nullable fields.

I should also mention that I haven’t really worked with macros yet, so seeing how you approached that part would be especially interesting.

On the UI side, I’ve also tried modeling this in LiveView forms using function components for each subtype, so each type has its own set of inputs. It works, but after a while it starts to feel a bit cumbersome to maintain.

1 Like

Sure thing! I’ll clean it up and push it a bit later and let you know.

Note I made a mistake in the API I showed which I’ve edited. I just forgot the belongs_tos so the one macro is doesn’t do anything too magical. It’s has a lot in common with how test works which is outlined in the official guides.

And yes, preloading gets hairy if they are deeply nested, but it’s just a wrapper around Repo.preload that looks for :item and replaces it with the proper preload. Of course, it’s not that simple because it actually has to reflect on the schema to ensure that :item in a preload really does mean the one :item since it could refer to something else in a nested preload. That’s where it gets messy.

I’m sure my old team has not moved from this pattern (I only left a few weeks ago, lol) and ya, I don’t think there is anything wrong with it. Ecto very specifically does not support polymorphic associations with a type column.

On speaking of that, my lib allows for adding either a virtual or concrete type column which can come in handy in pattern matching and doing all queries for a specific type respectively.

I’ll let you know when I’ve posted it.

1 Like

I recently read a good blog post on this:

This is about the plain db level. Once you have that you can add the schemas for that.

You likely want to have another layer of this eventually, which can map between [ItemA, ItemB] to those lower level nested schemas.

10 Likes

I really like this, thanks for sharing. It’s quite clever and a pretty clean solution.

As promised here is the repo. I’m a bit embarrassed by the preload_one as I was in the middle of working on it (this was made with 0 vibes) but I keep hopping around on projects and I need to stop doing that so here ya go!

I think the approach linked by LK is a much more interesting way of doing this, although the same Ecto API could still apply.

Also the README is not fleshed out at all but the all the docstrings are.

This is a non-answer but something I learned to do early on in a project is not even worry about it and just have all the columns on the one table and then document the schema which fields belong to which subtype and then have separate changesets for creating the different subtypes.

This is usually good enough until a better understanding of requirements show themselves.

This is the strategy my library was trying to solve. We did just this at my last job but it ballooned very quickly. EDIT: By “solve” I mean make easier, but same idea: it’s when you put all the sub types right on the parent table and only allow one of them to be present.

1 Like

Whats easy in SQL are hard in Ecto.

Thats why I’m building GitHub - elixir-dbvisor/sql: SQL provides state-of-the-art, high-performance SQL integration for Elixir, built to handle extreme concurrency with unmatched expressiveness and ergonomic query composition. Write safe, composable, parameterized queries directly, without translating to Ecto or any ORM. · GitHub to make it easy both places.

3 Likes

Oh ya, I keep meaning to check this out. It looks really nice I just haven’t played with it yet as there are a bunch of things I wanna do (and someone recently starting to use AI a bit has made that worse). I’ve also been heavy into an Ash project and I’m ass-u-me’ing it wouldn’t make much sense to use in Ash?

Then there’s no time to refactor the database schema. I always encourage people to not half-ass the database design. Do it properly from the get go. It’s like in sports. If you lack the fundamentals, it’s very hard to compensate afterwards.

5 Likes

It depends Ash can’t do everything and you will eventually have to write SQL, I never worked on a project that never had to do that at somepoint. I’m looking forward to tackle migrations next as thats when are going to take a fresh look at what a Ecto.Schema or Ash.Resource is in the world of a relational database is, as we can do things they have a harder time doing.

Most DSLs ends up being tightly coupled. In SQL we flip it upside down. You have access to the whole row when transforming in Elixir and SQL queries stays in the database.

Although we still got some ways to reduce verbosity, but hopefully thats where the community will chime in with better ideas then I have, I’m more of a lets get the engineering fundamentals right, then everything else becomes easy to build on top of.

3 Likes

I definitely agree with the sentiment but I’d be careful not to confuse design w/ implementation. The design decision was sub-typing, single table vs many tables is implementation. “No time to refactor” holds true in the other direction; if you over complicate things, you’ll be stuck with it. That’s why I said “when starting new projects”, alluding to not knowing exactly what’s necessary. May end up with only 3 sub-type specific columns and no sub-type specific relations. I don’t want to hijack the thread though as this is off-topic from the original post because I agree with what you said

Edit: some clarifications

This is one way I’ve tried that seems convenient in ecto. In the todo list example, it would look something like this:

  schema "items" do
    field :description, :string
    # join with either a join schema or just a raw join table
    many_to_many :lists, Todo.List, join_through: Todo.ListItem
    many_to_many :projects, Todo.Project, join_through: "projects_items"

    field :itemable, :map, virtual: true

    timestamps(type: :utc_datetime)
  end

  schema "lists" do
    field :title, :string
    many_to_many :items, Todo.Items.Item, join_through: Todo.ListItem

    timestamps(type: :utc_datetime)
  end

  schema "projects" do
    field :name, :string
    many_to_many :items, Todo.Items.Item, join_through: "projects_items"

    timestamps(type: :utc_datetime)
  end

In a page that lists all items, I do this:

  # context functions
  def get_items_with_itemable do
    Item
    |> Repo.all()
    |> Repo.preload([:lists, :projects])
    |> Enum.map(&add_itemable/1)
  end

  defp add_itemable(nil), do: nil

  defp add_itemable(%Item{} = item) do
    Map.put(item, :itemable, get_itemable(item))
  end

  defp get_itemable(%Item{} = item) do
    list = List.first(item.lists)
    project = List.first(item.projects)

    list || project
  end

  # live view
  def render(assigns) do
    ~H"""
    <Layouts.app flash={@flash}>
      <ul>
        <li :for={i <- @items}>
          {"#{display_itemable(i.itemable)} - #{i.description}"}
        </li>
      </ul>
    </Layouts.app>
    """
  end

  defp display_itemable(%Todo.List{title: title}), do: title
  defp display_itemable(%Todo.Project{name: name}), do: name
  defp display_itemable(_), do: ""

Thanks everyone for the responses!


Context / why I asked

While learning Ecto / Phoenix / LiveView, I ran into this modeling dilemma and was trying to understand the different layers at the same time.

Before the full mental model was really in place, I was jumping between different layers due to the learning process.


Why I didn’t immediately propose my own solution

I also explored the direction linked by LostKobrakai earlier (separate types, separate structures), but what held me back was:

  • at the database level, I modeled this using composite-key / multi-field identity approaches
  • in Ecto, this is not really supported in a native or idiomatic way
  • so it wasn’t clear to me whether this approach is actually considered idiomatic in an Ecto context

Ecto vs data modeling

What I found interesting is that Ecto + LiveView changesets work really well together, especially for dynamic forms.

If I completely drop Ecto, I don’t just lose an ORM, but also a very useful validation + UI integration layer.

At the same time, it became increasingly clear to me that the database model and the Ecto schema are not the same thing.

At the database level, you can have a perfectly valid relational model (e.g. composite-key / subtype-style structure), which Ecto doesn’t necessarily map in a natural way.


Concrete example (simplified)

schema "items" do
  field :type, Ecto.Enum, values: [:item_one, :item_two]

 has_one :item_one
 has_one :item_two

end

And the more specific schemas:

@primary_key {:id, :id, autogenerate: false}
schema "item_one" do
 field :type, Ecto.Enum, values: [:item_one], default: :item_one

 belongs_to :item, Item
end
@primary_key {:id, :id, autogenerate: false}
schema "item_two" do
 field :type, Ecto.Enum, values: [:item_two], default: :item_two

 belongs_to :item, Item
end

@sodapopcan

This is a really interesting approach — I find the idea of abstracting the association into a single “slot” quite compelling, especially as the number of possible underlying tables grows.

In my case the domain already tends toward multiple item types, so avoiding a growing number of empty or optional has_one associations is exactly the kind of problem I was running into.