I have a slow query as my data has grown. I wanted to test joining before preloads to see if i could speed it up.
I did a test on some mock data, first with the join and the second without. It seems like the speed gain is minimal.
Are there any other benefits to joining first before preloads?
Is it more efficient for the databases memory usage if we join before preloading?
If I did some additional nested preloads would that start to affect things?
With join
iex(1)> query = from tc in Timecard, join: tc_d in Timecards.TimecardData, on: tc.id == tc_d.timecard_id, preload: [timecard_data: tc_d]
#Ecto.Query<from t0 in Ev2.Timecards.Timecard,
join: t1 in Ev2.Timecards.TimecardData, on: t0.id == t1.timecard_id,
preload: [timecard_data: t1]>
Has the following output:
iex(2)> Repo.all(query)
[debug] QUERY OK source="timecards" db=1242.5ms decode=760.5ms
SELECT t0."id", t0."date", ..., t0."offer_id", t0."timecard_setting_id", t0."inserted_at", t0."updated_at", t1."id", t1."approver_id", ... , t1."updated_at" FROM "timecards" AS t0 INNER JOIN "timecard_data" AS t1 ON t0."id" = t1."timecard_id" []
Which looks like 1243ms.
Without join
Whereas the following with no joins and super simple syntax:
iex(3)> q2 = from tc in Timecard, preload: :timecard_data
#Ecto.Query<from t in Ev2.Timecards.Timecard, preload: [:timecard_data]>
Has the following output:
iex(4)> Repo.all(q2)
[debug] QUERY OK source="timecards" db=321.0ms decode=116.9ms
SELECT t0."id", t0."date", t0."week_ending_date", t0."datetime_submitted", t0."auto_submitted?", t0."auto_generated?", t0."employee_id", t0."proxy_creator_id", t0."offer_id", t0."timecard_setting_id", t0."inserted_at", t0."updated_at" FROM "timecards" AS t0 []
[debug] QUERY OK source="timecard_data" db=965.8ms decode=322.6ms
SELECT t0."id", t0."approver_id", ... , t0."timecard_id" FROM "timecard_data" AS t0 WHERE (t0."timecard_id" = ANY($1)) ORDER BY t0."timecard_id" [[15716, 15693, ...]]
Which to me looks like a total of 321 + 965 = 1286ms
Thanks for any input!