Hello guys, I stumbled upon this problem today and I was seeking some help…
I have this routine where I need to insert or update a bunch of records in the database in a transaction.
Right now, I’m using Ecto.Multi
with multiple steps for each operation.
Just as an example, the structure goes something like this:
- I have categories, posts and tags
- Each tag can have a color (red, blue, green) and a unique name
- Each post can have many tags (post_tags)
- Each category also can have many tags (category_tags)
- And finally, a post belongs to a category
Every time I add a new tag to a category, I can choose a name and a color. Then, this recently created tag will be propagated to every post inside the category. Now, tow things can happen:
- When a post does not have a tag with that name, the tag is created
- When a post already has a tag with that name, the tag is updated/overridden.
Pretty simple, right? So what I’m already doing and why it’s not working:
When adding a tag to a category, post_tags may not exist yet but posts already do (because a post can only be created belonging to a category).
Because of that, when I’m querying the post_tags that needs to be inserted or updated, I do a right_join
, which will return a list of PostTag
that exists with that name and need to be updated or (because of the right join), a list of PostTag
structs with empty (nil) values that I can patch and then insert (for that I use select_merge
)
Repo.all(from pt in PostTag,
right_join: p in Posts,
on: pt.post_id == p.id,
select_merge: %{color: ...})
Thing is, this query returns a list of PostTag
with #Ecto.Schema.Metadata<:loaded>
, and an struct that’s already loaded returns a changeset with #Ecto.Changeset<action: :update>
instead of #Ecto.Changeset<action: :insert>
. Because of that, when I call Ecto.Multi.insert_or_update/3
, I get an Ecto.NoPrimaryKeyValueError
message.
Since insert_or_update/2
uses the :state
field from Ecto.Schema.Metadata
to check if it should insert or update the struct, I can’t find a way to work around that.
Of course, the example I gave is overly simplistic, but the whole reason I’m trying to do this in one go is that otherwise, I’d have to hit the database at least a couple of times more to do the inserts and updates that I need and since this table will have a lot of records, being economic would help a lot.