Recently I created a migration that adds a new association to a table. So to ensure that all the data I already have(without that association) has that column correctly filled, I want to do a bulk update.
This is the query I have so far:
from(f in Folder,
join: w in assoc(f, :workspace),
where:
is_nil(f.workspace_id) == true and f.organization_id == w.organization_id and w.default == true,
update: [set: [workspace_id: w.id]]
)
So what I want to do, is update all my rows in the folder table, where its workspace_id column is nil. And I want to update it with the id from the matching row in the workspace table, where both have the same organization_id and the column default in the workspace table is set to true.
I have looked through the documentation and a bunch of posts but I don’t know what I’m missing.
I’m not sure either - the query you posted looks like it would do what you’re describing. What happens when you run it? Does Ecto generate the correct SQL?
shouldn’t we find a left_join instead of join on your code?
Also,assoc won’t work (I mean the join condition - f.workspace_id == w.id - will be false - because f.workspace_id are currently null) if you try to set the very same foreign key on which it relies on. You’ll need to explicit the join with:
from(
f in Folder,
join: w in Workspace,
on: f.organization_id == w.organization_id,
on: w.default,
where: is_nil(f.workspace_id),
update: [
set: [
workspace_id: w.id,
]
]
)
Thanks a lot for your answer. My query was wrong. But now if I do it that way, it won’t actually update the data I want.
Repo.update_all(from(
f in Folder,
left_join: w in Workspace,
on: f.organization_id == w.organization_id and w.default,
where: is_nil(f.workspace_id)), [set: [workspace_id: w.id]])
If I try it that way, it will complain that w does not exist and is being expanded to w()
Re-punctuating your query can help clarify the situation:
query =
from(
f in Folder,
left_join: w in Workspace,
on: f.organization_id == w.organization_id and w.default,
where: is_nil(f.workspace_id)
)
Repo.update_all(query, [set: [workspace_id: w.id]])
This makes the error message about w clearer: the binding is out-of-scope.
This is a situation where Ecto.Query.update helps:
query =
from(
f in Folder,
left_join: w in Workspace,
on: f.organization_id == w.organization_id and w.default,
where: is_nil(f.workspace_id),
update: [set: [workspace_id: w.id]]
)
Repo.update_all(query, [])
Thanks again for your answer. I got this error with your solution: ** (Ecto.QueryError) PostgreSQL supports only inner joins on update_all, got: left in query
I’ll study to see how I can make this query with inner join and post the solution here.
You can do the inner join on organization_id and add the condition and is_nil(f.workspace_id).
If I understand your requirement correctly then you actually want an inner join because you’re only updating when there is a match on organization_id.
al2o3cr, cmo, joey_the_snake, t0t0
Thanks a lot for all your help. The code bellow solved my problem:
query =
from(
f in Folder,
inner_join: w in Workspace,
on: f.organization_id == w.organization_id and w.default,
where: is_nil(f.workspace_id),
update: [set: [workspace_id: w.id]]
)
Repo.update_all(query, [])
I still have one doubt, is there any difference between where and on? Because the query I got back from Postgresql was the same for both.
There’s no major difference between where and on for inner joins - the DB might do different optimizations, but the resulting rows are ultimately the same.
This isn’t the case for left/outer joins. For instance, imagine you have a schema where Post has many Comments and each comment has an author_id. This query:
from(
p in Post,
left_join: c in assoc(p, :comments),
where: c.author_id = 123
)
will have a result row for every Post that has a comment with the specified author_id. Posts that don’t have a comment from that author will be filtered out.
from(
p in Post,
left_join: c in assoc(p, :comments),
on: c.author_id = 123
)
This produces an ON clause with an AND, as in your example. This produces at least row in the results for everyPost row; posts that don’t have comments by the specified author have all NULLs in the columns from comments.
Combining a left join, on and where can be useful to produce “nonexistence” queries. To get only posts that don’t have a comment from a specified author, you could use this query:
from(
p in Post,
left_join: c in assoc(p, :comments),
on: c.author_id = 123
where: is_nil(c.id)
)
I’m just a bit more than 5 years late to the party, but I wanted to share that ecto_sql 3.9.1 has been released two days ago. This version permits using outer joins (e.g. left_join) with Repo.update_all.