Ecto - Category tree

Hi all,

I’m trying to build a category tree based on this schema:

schema "categories" do
  field :list_order, :integer
  field :name, :string

  has_many :children, Category, foreign_key: :parent_id
  belongs_to :parent, Category
  
  timestamps()
end

Both parent and children are optional so here are my changeset functions:

def changeset(%Category{} = category, %{parent: _} = attrs) do
  category
  |> cast(attrs, [:name, :list_order])
  |> validate_required([:name])
  |> unique_constraint(:name)
  |> put_assoc(:parent, attrs.parent)
end

def changeset(%Category{} = category, attrs) do
  category
  |> cast(attrs, [:name, :list_order])
  |> validate_required([:name])
  |> unique_constraint(:name)
end  

There must be a better way to write this, I’d like to avoid code duplication.
How would you do it? Should I manage the association myself and write something like this (keeping only one function):

def changeset(%Category{} = category, attrs) do
  category
  |> cast(attrs, [:name, :list_order, :parent_id])
  |> validate_required([:name])
  |> unique_constraint(:name)
end

Thanks

I’m not sure why you need two changesets; if the parent ID is nullable in the database, it should just work. Here is a category tree schema that we had:

defmodule Category do
  @moduledoc false
  use Ecto.Schema
  import Ecto.Changeset
  
  alias Helper.Changeset, as: ChangesetHelper

  schema "categories" do
    field :name, :string
    field :order, :integer
    field :slug, :string

    belongs_to :parent, Category, [foreign_key: :pid]
    has_many :children, Category, [foreign_key: :pid]
    has_many :articles, Article, [foreign_key: :category_id]

    timestamps()
  end

  @doc false
  def changeset(%Category{} = category, attrs) do
    category
    |> cast(attrs, [:pid, :slug, :name, :order])
    |> ChangesetHelper.normalize_slug(:slug)
    |> validate_required([:slug, :name, :order])
    |> ChangesetHelper.validate_255([:slug, :name])
    |> unique_constraint(:slug)
  end
end

Thanks for the code. But where do you manage the parent association?
Ideally, my changeset function should handle changeset creation with or without a parent struct, using the same function signature.
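
For reference, one way to keep a single public changeset/2 while still accepting an optional parent struct is to move the association step into a private helper. This is only a minimal sketch: it assumes attrs uses atom keys, that the parent (when given) is an already loaded Category struct, and maybe_put_parent/2 is a hypothetical name, not something from Ecto.

```elixir
defmodule Category do
  use Ecto.Schema
  import Ecto.Changeset

  schema "categories" do
    field :list_order, :integer
    field :name, :string

    has_many :children, Category, foreign_key: :parent_id
    belongs_to :parent, Category

    timestamps()
  end

  # Single entry point: the shared pipeline is written once.
  def changeset(%Category{} = category, attrs) do
    category
    |> cast(attrs, [:name, :list_order])
    |> validate_required([:name])
    |> unique_constraint(:name)
    |> maybe_put_parent(attrs)
  end

  # Hypothetical helper: only touches the association when a :parent key is present.
  defp maybe_put_parent(changeset, %{parent: parent}), do: put_assoc(changeset, :parent, parent)
  defp maybe_put_parent(changeset, _attrs), do: changeset
end
```

The two clauses of the helper replace the two clauses of changeset/2, so the cast and validation steps are no longer duplicated.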

The parent is the pid field (the same as your parent_id), which references the same table at the database level in the migration:

add :pid, references(:categories, on_delete: :delete_all)

I just pass it (an integer) when creating a category so that the entry has a parent, or omit it so that it is nil and the category is a root. If I try to pass an invalid value, I get an error.

The categories can then be created under a parent (with its id) or at the root level without it.
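
Applied to the schema from the first post, this could look roughly like the sketch below. It assumes the column is called parent_id and that the migration defines a foreign key on it. The assoc_constraint/2 call is an addition for illustration: without it, an id that does not exist raises an exception on insert, while with it the same failure comes back as a changeset error.

```elixir
defmodule Category do
  use Ecto.Schema
  import Ecto.Changeset

  schema "categories" do
    field :list_order, :integer
    field :name, :string

    has_many :children, Category, foreign_key: :parent_id
    belongs_to :parent, Category

    timestamps()
  end

  # One changeset for both cases: parent_id is simply optional.
  def changeset(%Category{} = category, attrs) do
    category
    |> cast(attrs, [:name, :list_order, :parent_id])
    |> validate_required([:name])
    |> unique_constraint(:name)
    # Relies on the database foreign key; checked when the row is written.
    |> assoc_constraint(:parent)
  end
end
```

A root category is then built with %{name: "Electronics"} and a child with %{name: "Phones", parent_id: parent.id}, both through the same function.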

Indeed, you’re using the parent category id as a parameter; that would be my second solution.
But doesn’t it introduce some tight coupling between the category pid and whatever uses it?
Isn’t it better to use a Category struct and “hide” the implementation details?

For instance when seeding data I can do that:

CMgr.set_category_parent(CMgr.get_category!(%{"name" => "Business"}), CMgr.get_category!(%{"name" => "Electronics"}))

without referencing any (possibly auto-generated) id, just using something like %{parent: secondParam}

Am I being too OOP-oriented? :slight_smile:

I understand your concern. I’m not sure it’s OOP related; I’d rather think of it as a conscious trade-off. If you wanted to do the same with your seeds and an id, you’d do something along these lines:

category_data = %{parent_id: CMgr.get_category!(%{"name" => "Electronics"}).id, ...}
%Category{} |> Category.changeset(category_data) |> Repo.insert()

or, if you wish, almost identically:

CMgr.set_category_parent_id(CMgr.get_category!(%{"name" => "Business"}), CMgr.get_category!(%{"name" => "Electronics"}).id)

Also, if your get_category! makes a DB request, your seeds will have an N+1 problem. I avoid it by truncating the tables on each seed:

Ecto.Adapters.SQL.query(Repo, "TRUNCATE categories RESTART IDENTITY CASCADE;")

Then I can seed a fixed number of top-level categories, taken from the num_categories variable, and use Enum.random(1..num_categories) in the pid field to seed a second level. I do that with a little randomization to get a two- to three-level tree in the seeds.
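
A rough sketch of that seeding idea, limited to two levels, could look like this. SeedSketch and category_attrs/2 are hypothetical names, the field names follow the schema with pid shown earlier, and the whole thing assumes the table was just truncated with RESTART IDENTITY so that the roots, inserted first, receive ids 1..num_categories.

```elixir
defmodule SeedSketch do
  # Hypothetical helper: returns plain attribute maps and never touches the database.
  # Roots come first in the list so they get ids 1..num_categories when inserted in order.
  def category_attrs(num_categories, num_children)
      when num_categories > 0 and num_children > 0 do
    roots =
      for i <- 1..num_categories do
        %{name: "Category #{i}", slug: "category-#{i}", order: i, pid: nil}
      end

    children =
      for i <- 1..num_children do
        %{
          name: "Subcategory #{i}",
          slug: "subcategory-#{i}",
          order: i,
          # Pick a random root as the parent, by id only.
          pid: Enum.random(1..num_categories)
        }
      end

    roots ++ children
  end
end
```

Each map can then be passed through Category.changeset/2 and inserted in list order, with no lookup query per category.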

So using integers is obviously somewhat less fancy but simple, which is ok in my opinion :slight_smile:

Thanks :smile:

This trade-off is OK for me too; it is easier to manage.
