How do you model different ‘types’ with totally different fields in Ecto?

I’m trying to model something in Ecto where records share a few common fields, but depending on a type, they have completely different additional fields.

Roughly:

  • there’s an items table with the common stuff (type, same_field, etc.)
  • and then separate tables for each type (items_a, items_b)
  • these subtype tables use the same id as the parent (so 1:1, PK = FK)

So everything is still one logical collection, just split across tables to avoid a lot of nullable columns and to keep proper DB constraints.

In Ecto I’m using has_one + cast_assoc, and dispatching based on the type.


I’m wondering:

  • does this approach fit well with Ecto, or is it something people generally avoid?
  • is there a more idiomatic way to solve this without collapsing everything into one table or using JSON?
  • are there any gotchas with this setup (especially around inserts or preloading)?

Example changeset:

def changeset(item, attrs) do
  item
  |> cast(attrs, [:type, :category, :owner_id])
  |> validate_required([:type, :category, :owner_id])
  |> cast_subtype()
end

The whole idea is to keep validation at the database level as much as possible, not just in changesets.
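For completeness, cast_subtype/1 is roughly this (simplified; the :a/:b type values and the :item_a/:item_b association names are placeholders):

```elixir
# Simplified sketch: dispatch on the :type field and delegate to the
# matching has_one association via cast_assoc/3.
defp cast_subtype(changeset) do
  case get_field(changeset, :type) do
    :a -> cast_assoc(changeset, :item_a, required: true)
    :b -> cast_assoc(changeset, :item_b, required: true)
    _ -> changeset
  end
end
```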

1 Like

I’m slightly confused by your wording… items_a and items_b — the plural makes me think it’s a has_many, but you then said has_one, so I’m assuming that. So, I actually have a mostly-working library to handle this more gracefully, as we had a few tables like that at my old work. But since I stopped working there I lost motivation. I could post what I have, though. The API is basically:

schema "items" do
  field :foo, :string

  one :item do
    belongs_to :item_one, Item_1
    belongs_to :item_two, Item_2
    belongs_to :item_three, Item_3
  end
end

def changeset(item, attrs) do
  item
  |> cast(attrs, [:foo])
  |> cast_one_assoc(:item)
end

It basically adds sister versions of a few functions with _one added (kinda like the _embed versions), so cast_one, put_one_assoc, and get_one_assoc (for each, you refer to the association by the first argument to the one macro). There is also MyRepo.preload_one (which is a stupid name, and I think that function is unfinished for nested preloads). I can post what I have if you’re interested.

Otherwise, as far as preloading goes, you can do this and it’s efficient:

Repo.preload(items, [:item_a, :item_b, :item_c])

When preloading, it will simply ignore the associations whose ids are nil. I would definitely hide this in a function, though, as the lists get unwieldy if you are preloading a lot.
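E.g. something like this (association names assumed, and Repo stands for your app’s repo; just a thin wrapper):

```elixir
# Hypothetical helper that hides the full list of subtype associations.
# Associations whose foreign keys are nil are skipped during preload,
# so this stays cheap even when only one subtype is ever set per row.
def preload_subtypes(item_or_items) do
  Repo.preload(item_or_items, [:item_a, :item_b, :item_c])
end
```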

2 Likes

Yeah, good point about the naming — that’s on me. It’s has_one in practice.

Your one :item abstraction is actually very close to what I’m doing manually.

I’d definitely be interested in seeing it, especially how you handled:

  • dispatching based on type
  • and preloading (that part gets messy fast)

Also curious: did you end up using this pattern long-term, or did you move away from it?


I’m definitely interested, mostly from a learning perspective.

This approach feels “right” to me so far — especially compared to pushing everything into a JSON field. Even if it means a few extra tables and joins, I like that it keeps things well-structured and avoids nullable fields.

I should also mention that I haven’t really worked with macros yet, so seeing how you approached that part would be especially interesting.

On the UI side, I’ve also tried modeling this in LiveView forms using function components for each subtype, so each type has its own set of inputs. It works, but after a while it starts to feel a bit cumbersome to maintain.

1 Like

Sure thing! I’ll clean it up and push it a bit later and let you know.

Note I made a mistake in the API I showed, which I’ve since edited. I just forgot the belongs_tos, so the one macro doesn’t do anything too magical. It has a lot in common with how test works, which is outlined in the official guides.

And yes, preloading gets hairy if they are deeply nested, but it’s just a wrapper around Repo.preload that looks for :item and replaces it with the proper preload. Of course, it’s not that simple because it actually has to reflect on the schema to ensure that :item in a preload really does mean the one :item since it could refer to something else in a nested preload. That’s where it gets messy.
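The flat (non-nested) case is basically just this (not the actual library code, only the idea; the association names are placeholders):

```elixir
# Simplified idea: rewrite :item in a preload list into the concrete
# belongs_to associations before handing off to Repo.preload. Nested
# preloads need schema reflection and are not handled here.
def preload_one(struct_or_structs, preloads) do
  preloads =
    Enum.flat_map(List.wrap(preloads), fn
      :item -> [:item_one, :item_two, :item_three]
      other -> [other]
    end)

  Repo.preload(struct_or_structs, preloads)
end
```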

I’m sure my old team has not moved away from this pattern (I only left a few weeks ago, lol) and, yeah, I don’t think there is anything wrong with it. Ecto very specifically does not support polymorphic associations with a type column.

Speaking of that, my lib allows adding either a virtual or a concrete type column, which comes in handy for pattern matching and for querying all records of a specific type, respectively.

I’ll let you know when I’ve posted it.

1 Like

I recently read a good blog post on this:

This is about the plain db level. Once you have that you can add the schemas for that.

You likely want to have another layer of this eventually, which can map between [ItemA, ItemB] to those lower level nested schemas.

10 Likes

I really like this, thanks for sharing. It’s quite clever and a pretty clean solution.

As promised, here is the repo. I’m a bit embarrassed by preload_one as I was in the middle of working on it (this was made with zero vibes), but I keep hopping around on projects and I need to stop doing that, so here ya go!

I think the approach linked by LK is a much more interesting way of doing this, although the same Ecto API could still apply.

Also, the README is not fleshed out at all, but all the docstrings are.

This is a non-answer, but something I learned to do early on in a project is to not even worry about it: just put all the columns on the one table, document which fields belong to which subtype, and have separate changesets for creating the different subtypes.

This is usually good enough until a better understanding of the requirements shows itself.

This is the strategy my library was trying to solve. We did just this at my last job, but it ballooned very quickly. EDIT: By “solve” I mean make easier, but same idea: it’s when you put all the subtypes right on the parent table and only allow one of them to be present.
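For anyone reading along, the single-table shape being described looks roughly like this (all field names are made up):

```elixir
defmodule Item do
  use Ecto.Schema
  import Ecto.Changeset

  schema "items" do
    field :type, Ecto.Enum, values: [:a, :b]
    field :category, :string
    # Subtype A fields (nullable for other types)
    field :a_only_field, :string
    # Subtype B fields
    field :b_only_field, :integer
  end

  # One changeset per subtype: each casts and requires only its own
  # fields, and forces the matching :type value.
  def a_changeset(item, attrs) do
    item
    |> cast(attrs, [:category, :a_only_field])
    |> validate_required([:category, :a_only_field])
    |> put_change(:type, :a)
  end

  def b_changeset(item, attrs) do
    item
    |> cast(attrs, [:category, :b_only_field])
    |> validate_required([:category, :b_only_field])
    |> put_change(:type, :b)
  end
end
```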

1 Like

What’s easy in SQL is hard in Ecto.

That’s why I’m building GitHub - elixir-dbvisor/sql, to make it easy in both places.

3 Likes

Oh ya, I keep meaning to check this out. It looks really nice; I just haven’t played with it yet, as there are a bunch of things I wanna do (and having recently started to use AI a bit has made that worse). I’ve also been heavy into an Ash project, and I’m ass-u-me’ing it wouldn’t make much sense to use in Ash?

Then there’s no time to refactor the database schema. I always encourage people not to half-ass the database design. Do it properly from the get-go. It’s like in sports: if you lack the fundamentals, it’s very hard to compensate afterwards.

5 Likes

It depends. Ash can’t do everything and you will eventually have to write SQL; I’ve never worked on a project that didn’t have to at some point. I’m looking forward to tackling migrations next, as that’s when we’re going to take a fresh look at what an Ecto.Schema or Ash.Resource is in the world of a relational database, since we can do things they have a harder time doing.

Most DSLs end up being tightly coupled. In SQL we flip it upside down: you have access to the whole row when transforming in Elixir, and SQL queries stay in the database.

We still have some way to go in reducing verbosity, but hopefully that’s where the community will chime in with better ideas than I have. I’m more of a “let’s get the engineering fundamentals right, then everything else becomes easy to build on top of” person.

3 Likes

I definitely agree with the sentiment, but I’d be careful not to confuse design with implementation. The design decision was sub-typing; single table vs. many tables is implementation. “No time to refactor” holds true in the other direction: if you over-complicate things, you’ll be stuck with it. That’s why I said “when starting new projects”, alluding to not knowing exactly what’s necessary. You may end up with only 3 subtype-specific columns and no subtype-specific relations. I don’t want to hijack the thread, though, as this is off-topic from the original post, because I agree with what you said.

Edit: some clarifications

This is one way I’ve tried that seems convenient in Ecto. In the todo list example, it would look something like this:

  schema "items" do
    field :description, :string
    # join with either a join schema or just a raw join table
    many_to_many :lists, Todo.List, join_through: Todo.ListItem
    many_to_many :projects, Todo.Project, join_through: "projects_items"

    field :itemable, :map, virtual: true

    timestamps(type: :utc_datetime)
  end

  schema "lists" do
    field :title, :string
    many_to_many :items, Todo.Items.Item, join_through: Todo.ListItem

    timestamps(type: :utc_datetime)
  end

  schema "projects" do
    field :name, :string
    many_to_many :items, Todo.Items.Item, join_through: "projects_items"

    timestamps(type: :utc_datetime)
  end

In a page that lists all items, I do this:

  # context functions
  def get_items_with_itemable do
    Item
    |> Repo.all()
    |> Repo.preload([:lists, :projects])
    |> Enum.map(&add_itemable/1)
  end

  defp add_itemable(nil), do: nil

  defp add_itemable(%Item{} = item) do
    Map.put(item, :itemable, get_itemable(item))
  end

  defp get_itemable(%Item{} = item) do
    list = List.first(item.lists)
    project = List.first(item.projects)

    list || project
  end

  # live view
  def render(assigns) do
    ~H"""
    <Layouts.app flash={@flash}>
      <ul>
        <li :for={i <- @items}>
          {"#{display_itemable(i.itemable)} - #{i.description}"}
        </li>
      </ul>
    </Layouts.app>
    """
  end

  defp display_itemable(%Todo.List{title: title}), do: title
  defp display_itemable(%Todo.Project{name: name}), do: name
  defp display_itemable(_), do: ""

Thanks everyone for the responses!


Context / why I asked

While learning Ecto / Phoenix / LiveView, I ran into this modeling dilemma and was trying to understand the different layers at the same time.

Before the full mental model was really in place, I was jumping between different layers due to the learning process.


Why I didn’t immediately propose my own solution

I also explored the direction linked by LostKobrakai earlier (separate types, separate structures), but what held me back was:

  • at the database level, I modeled this using composite-key / multi-field identity approaches
  • in Ecto, this is not really supported in a native or idiomatic way
  • so it wasn’t clear to me whether this approach is actually considered idiomatic in an Ecto context

Ecto vs data modeling

What I found interesting is that Ecto + LiveView changesets work really well together, especially for dynamic forms.

If I completely drop Ecto, I don’t just lose an ORM, but also a very useful validation + UI integration layer.

At the same time, it became increasingly clear to me that the database model and the Ecto schema are not the same thing.

At the database level, you can have a perfectly valid relational model (e.g. composite-key / subtype-style structure), which Ecto doesn’t necessarily map in a natural way.


Concrete example (simplified)

schema "items" do
  field :type, Ecto.Enum, values: [:item_one, :item_two]

  # ItemOne/ItemTwo being the subtype schema modules below
  has_one :item_one, ItemOne
  has_one :item_two, ItemTwo
end

And the more specific schemas:

@primary_key {:id, :id, autogenerate: false}
schema "item_one" do
  field :type, Ecto.Enum, values: [:item_one], default: :item_one

  belongs_to :item, Item
end

@primary_key {:id, :id, autogenerate: false}
schema "item_two" do
  field :type, Ecto.Enum, values: [:item_two], default: :item_two

  belongs_to :item, Item
end
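For reference, one way to express the shared key directly in Ecto is to make the belongs_to column itself the primary key (a sketch with placeholder module names; whether this counts as idiomatic is exactly my question):

```elixir
defmodule ItemOne do
  use Ecto.Schema

  @primary_key false
  schema "item_one" do
    # The foreign key column doubles as the primary key, giving a
    # strict PK = FK 1:1 with the parent row at the schema level.
    belongs_to :item, Item, primary_key: true
  end
end
```

With this, has_one :item_one, ItemOne plus cast_assoc on the parent should fill in the shared id on insert; in the migration, the column would be declared as both the primary key and a references(:items).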

@sodapopcan

This is a really interesting approach — I find the idea of abstracting the association into a single “slot” quite compelling, especially as the number of possible underlying tables grows.

In my case the domain already tends toward multiple item types, so avoiding a growing number of empty or optional has_one associations is exactly the kind of problem I was running into.

Curious to know what kind of thing you have in mind here.

1 Like

Most of my thinking has been that we have always tried to create a DSL to support multiple datastores, instead of embracing the language that is already there. SQL can be first-class, adaptable, and specialized at the same time. Elixir is uniquely positioned, as it is extensible and you can implement foreign language integration with ease (something I’ve never seen done in other languages), which might be why we’ve been dealing with ORMs for over thirty years. Maybe it’s time to think differently?

1 Like

Ash isn’t strictly speaking an ORM although it does quack like that duck for most applications using it :laughing:

If all we’re talking about is running queries and getting their results as Elixir structs, and that’s the most you want out of a given tool, then it’s a different discussion. But with Ash we’re much more concerned with generalized modeling problems, or application-building primitives, and running SQL queries is just “one way” you might do that.

Some examples:

policies do
  policy action_type(:read) do
    authorize_if expr(public == true)
  end
end

If you had a policy like this, Ash will lower it into SQL when running the query:

Ash.read!(Resource)
#=> SELECT * FROM "table" WHERE public = TRUE

But you can also leverage that policy for a lot more things. For example, you can ask for a policy flow chart to hand over for a security audit. Or you can evaluate the policy in memory:

Ash.can?(Resource, :read, %User{}, 
  data: %Resource{public: false}, 
  reuse_values?: true # use the values we have in memory
)

In which case Ash will run the given expression in memory.

Or, for example, the following validation:

update :increment_score do
  argument :amount, :integer, default: 1

  change atomic_update(:score, expr(score + ^arg(:amount)))
  validate compare(:score, less_than: 100)
end

When you run this action, you will get SQL like this:

UPDATE things
SET
  id = CASE WHEN
    score + {amount} > 100 THEN
      ash_raise_error(...) -- We raise an equivalent error
    ELSE
      id
    END,
  score = score + {amount}
WHERE id = {id}
AND public = true -- policies embedded into queries

Ash is far more than an ORM; these are just some examples of that (and likely not the best ones).
In every non-trivial application I’ve ever worked on, your business logic ultimately crosses multiple systems, or multiple queries, etc., that must be operated on in concert.

If composition is limited to statements to a data layer, you’re constrained by the capabilities of that data layer. Likewise what Ash brings to the table primarily has to do with a whole suite of intelligent composition that may result in many queries, or even zero queries when running actions.


Even with that in mind, my question was more about what concrete limitations you see in Ash that could be addressed by something more SQL-native, as those could easily represent gaps in Ash. With that said, Ash isn’t intended to replace all interaction with a database; it’s about your domain logic. This is perfectly valid Ash code:

action :list_users, {:array, :struct} do
  constraints items: [instance_of: __MODULE__]

  run fn input, _ -> 
    result =
      ~SQL[from users select *]
      |> SQL.map(fn row -> struct(User, row) end)
      |> Enum.to_list()
  
    {:ok, result} 
  end
end
1 Like

No doubt that Ash is more than an ORM; the reason I mentioned Ash was mostly the work around automated migrations, which Ecto is lacking.

The biggest issue I find with traditional ORMs and DSLs in general is lack of precision. I believe the abstraction is done at the wrong level, which ends up causing more complexity and issues over time as the abstraction leaks.