Ecto has_many. Read/Write at once vs using recursive functions?

ndac_todoroki · March 21, 2018, 5:57pm

Hi. I’ve been building several Ecto projects, and this question always comes up. I couldn’t decide on my own, so I would like to hear what you all think.

Say I have a Postgres database with several tables. There’s a many_to_many like this:

[Student] N::1 [ClassStudent] 1::M [Class]

If you already have a known Student and a known Class, there are several ways to add the student to the class.
One is to use put_assoc.

# Approach 1.1
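def add_student(%Class{} = class, %Student{} = student) do
  # Preload first and rebind, so class.students below is a loaded list
  # rather than %Ecto.Association.NotLoaded{}
  class = Repo.preload(class, :students)

  class
  |> Class.update_changeset()
  |> put_assoc(:students, [student | class.students])
  |> Repo.update()
end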

You can also create a join record directly (since we luckily have a join table schema defined this time):

# Approach 1.2
def add_student(%Class{} = class, %Student{} = student) do
  %ClassStudent{}
  |> ClassStudent.create_changeset(%{class: class, student: student})
  |> Repo.insert()
end

Now the problem is which approach to take when adding multiple associations.
Using either Approach 1.1 or Approach 1.2, you can do

# Approach 2.1
# add_student/2 is already defined (Approach 1.1 or 1.2)

def add_students(%Class{} = class, [%Student{} = student]),
  do: add_student(class, student)

def add_students(%Class{} = class, [%Student{} = student | students]) do
  add_student(class, student)
  add_students(class, students)
end

There is another approach using put_assoc:

# Approach 2.2
def add_students(%Class{} = class, students) when is_list(students) do
  # Preload first and rebind, so class.students below is a loaded list
  class = Repo.preload(class, :students)

  class
  |> Class.update_changeset()
  |> put_assoc(:students, class.students ++ students)
  |> Repo.update()
end

The same question comes up when getting nested associated data.

Class N::M Student N::M ExamResults

When I want all exam results from all students of a class (for example, to calculate an average for each student),

# Approach 3.1
def get_students_with_results(%Class{} = class) do
  # Nested preload: load the students, and the exam results of each student
  class
  |> Repo.preload(students: :exam_results)
  |> Map.fetch!(:students)
end

# other places
class
|> get_students_with_results()  # Calls Ecto
|> Enum.map(fn student ->
     total =
       student.exam_results
       |> Enum.map(&Map.get(&1, :point, 0))
       |> Enum.sum()

     {student.name, total}
   end)
|> Enum.into(%{})

# Approach 3.2
class
|> list_students()  # Calls Ecto
|> Enum.map(fn student -> {student, student |> list_exam_results()} end)  # Calls Ecto once per student
|> Enum.map(fn {student, result_list} ->
     points = result_list |> Enum.map(&ExamResult.get_point/1) |> Enum.sum()
     {student |> Student.get_name(), points}
   end)
|> Enum.into(%{})

Both will return a map of type %{String.t() => non_neg_integer()}.

I think Approaches 2.1 and 3.2 are more functional-ish, keeping each piece of data compact and basic, although 2.2 and 3.1 call Postgres more efficiently, since they JOIN or pass the ids as a list only once.
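To make that trade-off concrete, here is a rough sketch with no Ecto involved at all. FakeRepo, its functions, and the sample points are all made up for illustration: it is just an in-memory stand-in that counts how many "queries" each style would issue for three students.

```elixir
defmodule FakeRepo do
  # Hypothetical in-memory stand-in for the database (not Ecto).
  # Keys are student ids, values are that student's exam points.
  @results %{1 => [80, 90], 2 => [70], 3 => []}

  def start, do: Agent.start_link(fn -> 0 end, name: __MODULE__)
  def query_count, do: Agent.get(__MODULE__, & &1)
  def reset, do: Agent.update(__MODULE__, fn _ -> 0 end)

  # One "query" per student, like calling Ecto inside Enum.map (Approach 3.2)
  def results_for(student_id) do
    tick()
    Map.get(@results, student_id, [])
  end

  # One "query" for the whole list of ids, like a preload (Approach 3.1)
  def results_for_all(student_ids) do
    tick()
    Map.take(@results, student_ids)
  end

  defp tick, do: Agent.update(__MODULE__, &(&1 + 1))
end

{:ok, _pid} = FakeRepo.start()
ids = [1, 2, 3]

# Style of Approach 3.2: one round trip per student
per_student = Map.new(ids, fn id -> {id, Enum.sum(FakeRepo.results_for(id))} end)
per_student_queries = FakeRepo.query_count()

FakeRepo.reset()

# Style of Approach 3.1: a single round trip, then sum in memory
batched =
  ids
  |> FakeRepo.results_for_all()
  |> Map.new(fn {id, points} -> {id, Enum.sum(points)} end)

batched_queries = FakeRepo.query_count()

IO.inspect({per_student == batched, per_student_queries, batched_queries})
# prints {true, 3, 1}
```

Both styles produce the same totals; only the number of round trips differs, and it grows with the number of students in the per-student style.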

In Rails, I would’ve definitely done the latter, because ActiveRecord passes objects around, which allows includes and anything else anywhere. (And that’s why I hate Rails. I find joins, includes, preloads, etc. everywhere in our project!)
Because Ecto passes only data, in Phoenix I would have to create a function for each use case to do the latter. But then I think I will end up writing many, many functions to suit each possible use case with different preloads, and that seems like a horrible mess (think of having 50+ tables). This is why I would like to do the former.

I’ve read somewhere that Ecto is super-fast, beyond comparison with ActiveRecord [citation needed], so I thought that maybe querying multiple times using recursive functions or Enum.map doesn’t have much impact.

What do you think? Or should I take another approach? Any comment would help. Cheers!