Hello.
I want to mass-dump orders in CSV reports. Simply put, when I do left_join
and preload
clauses in a single Ecto.Query
for nested associations, I get less results compared to when I only preload direct associations. My test is with offset of 0 and limit of 1000. Tinkering with disabling various nested associations changes the number of received records anywhere from 10 to 300. Simplified example follows below.
Ignore the anti-pattern of using offset and limit for now, please. I am asking my question here because I can’t figure out how to properly use Repo.stream
to mass-encode orders as CSV rows but that takes a backseat for now. The problem at hand is that my first simplified code (that doesn’t use Repo.stream
) can’t even rely on receiving 1000 records per batch (all except the last one, of course).
(NOTE: I am using PostgreSQL 9.6)
defmodule Context.Order do
def eager(query) do
from (x in query,
left_join: user in assoc(x, :user),
left_join: verification_details in assoc(user, :verification_details)
preload: [
user: {user, verification_details: verification_details}
]
)
end
def crippled_eager(query) do
from (x in query,
left_join: user in assoc(x, :user),
preload: [
user: user
]
)
end
def payment_captured(query) do
from(x in query, where: x.payment_state in ^~w(paid credit_owed balance_due))
end
end
The structure is extremely boringly classic: Order
has an user_id
column and VerificationDetail
has an user_id
column as well, so it goes like this: an order has one user which in turn has many verification details. I want to preload all that with a single query and I followed the common wisdom. However, in iex:
iex> records1 = Order |> Context.Order.payment_captured |> Context.Order.eager |> offset(0) |> limit(1000) |> Repo.all
iex> IO.puts length(records1)
268
:ok
iex> records2 = Order |> Context.Order.payment_captured |> Context.Order.crippled_eager |> offset(0) |> limit(1000) |> Repo.all
iex> IO.puts length(records2)
1000
:ok
(One observation: since this Elixir app I am creating is aimed to be a partial clone of a Rails Spree mastodont, I can compare SQL queries between Ecto and ActiveRecord. I noticed that AR does inner joins on all immediate associations and left outer joins on all nested associations, while Ecto only does left outer joins.)
Any idea why the difference of record counts above might happen at all? I am out of ideas. Also, in the eager
query I commented out the preload part and it will still yield less than 1000 records (apparently the problem is the left_join
on a nested association?). So, really, what gives?
Any hints are greatly appreciated!
// Dimi.