Modeling join table with attributes

Hi everyone, having some trouble understanding how to model a particular problem with Ecto. I just finished reading the first part of Programming Ecto, and skimmed a few chapters of the second part looking for relevant info but came up short. Ditto for the Ecto docs and the cheatsheet, which is really good, but still can’t figure out the answer myself.

I’ve got the following initial migration (a port of another project of mine in another language):

defmodule Repo.Migrations.Genesis do
  use Ecto.Migration

  def change do
    create table("persons") do
      add :code, :string, null: false
      add :name, :string, null: false
      timestamps()
    end
    create unique_index("persons", [:code])

    create table("expenses") do
      add :date, :date, null: false
      add :description, :string, null: false
      timestamps()
    end

    create table("persons_expenses") do
      add :person_id, references("persons"), null: false
      add :expense_id, references("persons"), null: false
      add :amount, :decimal, null: false
      timestamps()
    end
    create unique_index("persons_expenses", [:person_id, :expense_id])
  end
end

Now in terms of schemas, I’m unsure of what to do. In the Person schema I’d like to have the list of its corresponding expenses+amount; and in the Expense schema I’d like to have the list of its corresponding persons+amount.

defmodule Person do
  use Ecto.Schema

  schema "persons" do
    field :code, :string
    field :name, :string
    timestamps()

    has_many :expenses, Expense # though: [...] maybe?
  end
end

defmodule Expense do
  use Ecto.Schema

  schema "expense" do
    field :date, :date
    field :description, :string
    timestamps()

    has_many :person_expense, PersonExpense
  end
end

defmodule PersonExpense do
  use Ecto.Schema

  schema "person_expense" do
    field :amount, :decimal
    timestamps()

    belongs_to :person, Person
    belongs_to :expense, Expense
  end
end

Is the above the best I can do? It somehow doesn’t feel right. How you model this instead? Any help greatly appreciated!

So this is a case where multiple people can be tied to the same expense with different amounts?

If so, this looks about right, yes.

You would generally use :through if person_expense was a pure join table, but it’s not because it includes the amount. So you would want to load the PersonExpenses for each person or expense and then preload/join the other side of the operation. So you would end up with:

# Let Ecto preload (multiple round-trips)
from p in Person,
  preload: [person_expenses: [:expense]]

# Preload manually with joins (one round-trip)
from p in Person,
  join: pe in assoc(p, :person_expenses),
  join: e in assoc(pe, :expense),
  preload: [person_expenses: {pe, [expense: e]}]

# Result
%Person{expenses: [
  %PersonExpense{amount: 100, expense: %Expense{}},
  # ...
]}

See the docs for preload/3.

There are ways you could put the Expense’s fields onto the PersonExpense as virtual fields and then manually add them in the select, and that is also okay depending on the use-case. See the docs for select/3.

1 Like

OP should note that preloading in this way can be very expensive. Sure it only does it in one query but that means it gets normalized in Elixir which can be incredibly slow on larger dataset with more joins.

2 Likes

Exactly.

That’s the only option I could imagine implementing with what I know. I kind of didn’t want to do that, but I can probably live with it. Let’s see how the rest of project goes.

1 Like

Thanks for the heads up! With that in mind I can probably work around it or avoid it completely.

The two queries I showed have different performance tradeoffs which are documented here.

Keep in mind that he is talking about a query that returns a large number of rows. I don’t know exactly what your application does but it sounds like the sort of thing that would have maybe 10-20 “people” per “expense”, and if your numbers are in that ballpark it just does not matter at all.

If you have e.g. 1000 rows returned then you can start to think about optimizing your queries more carefully, like aggregating in the DB for example.

Of course idk how much experience you have with RDBMS, but in general this is just kinda “how things are done”. It’s not particularly specific to Ecto.

You can always write more specific queries for specific things, like “what is the sum of amount for this Expense”. But whether that’s worth another round-trip depends on context.