Schema setup for two-way, many-to-many relationship with itself

For the tasks schema below, I’m trying to add linking tasks to one another with a many_to_many relationship joined through a task_links table.

schema "tasks" do
  field :title, :integer
  field :description, :string
  belongs_to :assignee, User
  timestamps()
end

The task_links table is setup as follows:

def change do
  create table(:task_links) do
    add :left_task_id, references(:tasks)
    add :right_task_id, references(:tasks)
    add :creator_id, references(:users)
    timestamps()
  end

  create constraint(:task_links, :left_less_than_right_check, check: "left_task_id < right_task_id")
  create unique_index(:task_links, [:left_task_id, :right_task_id])
end

The check constraint and unique index are used to ensure links aren’t duplicated.

My question is, what is the best way to alter the tasks schema (or my approach to linking tasks) so queries preloading the task_links association return a single list of linked tasks?

The issues is a task’s id can be in the :left_task_id or :right_task_id column of the task_links table. I’ve almost achieved my goal with two many-to-many relationships:

schema "tasks" do
  ...
  many_to_many :right_links, Task, join_through: TaskLinks, join_keys: [left_task_id: :id, right_task_id: :id]
  many_to_many :left_links, Task, join_through: TaskLinks, join_keys: [right_task_id: :id, left_task_id: :id]
end

This setup is close to what I’m aiming for, but the query below returns links separately depending on whether the target task’s id was in the :left_task_id or right_task_id column of the task_links table:

def get_task_and_links(task_id) do
  from(t in Task,
    left_join: right in assoc(t, :right_links),
    left_join: left in assoc(t, :left_links),
    preload [right_links: right, left_links: left]
    where: t.id == ^task_id
  ) |> Repo.one
-----------------
iex(1)> Tasks.get_task_and_links(6)
%Task{
  id: 6,
  ...,
  left_links: [
    %Task{id: 4, ...},
    %Task{id: 5, ...}
  ],
  right_links: [
    %Task{id: 32, ...}
  ]

Is there a way I could set up my tasks schema (or change my method of linking tasks) so a query would return a list of every task linked to a task under a single field?

If I’m understanding you correctly, I would just designate one column in the task_links table as the one you care about and another column as the “linked” task. Then you just join on that one column. So I would have task_id and linked_task_id instead of left/right.

There’s a trade off though this will require duplicating data. It might be a big deal or it might not depending on your case. If you don’t want duplication then what you are doing now is the best way afaik.

1 Like

Inserting to these associations seems like it would be tricky, since one or the other won’t pass the constraint. IMO that’s a signal that they aren’t the right model.

+1 to @joey_the_snake’s suggestion - represent an undirected link between A and B as two directed links, A → B and B → A.

This will also come in handy when the customer who has ASSURED you that links are “absolutely positively definitely surely always bidirectional” chimes in with “sooooooooo, we asked the actual users and sometimes they aren’t” :joy:

2 Likes

“sooooooooo, we asked the actual users and sometimes they aren’t”

That’s a very good point :joy:! I’m sure users would appreciate directed links, A → B, so they could be used to represent a hierarchy between tasks (linking from parent task A to child tasks B, C, D…).

I’ll wait for customers to actually request that ability before I let them define A → B and B → A seperately, but I’ll set up the DB and schema as you and @joey_the_snake suggested.

I appreciate the help!