We have two schemas:
Feed. They have a many to many relationship between them. When creating a Batch it will often be associated with thousands of feeds. We’re using this code to create the Batch and associate it with existing feeds:
feeds = Feed |> where([f], f.feed_id in ^feed_ids) |> Repo.all() changeset(batch_id_prefix, window) |> put_assoc(:feeds, feeds)
We’re also doing all of this in Redshift using the redshift_ecto adapter.
Our issue is that the join records between the batch and the feeds are inserted into the database one by one. This takes a long time (~100ms per record) because Redshift isn’t optimised for single inserts. I assume Ecto does this to allow granular handling of primary key conflicts. Redshift doesn’t enforce database constraints including uniqueness of primary keys so we’re not getting any benefit from inserting these join records one by one.
Is there a way to tell Ecto to do a bulk insert here? If there is, I couldn’t find it by looking at the docs and code. Also, do you think this would be something useful to add to Ecto, or should we just handle the creation of the relationship manually with