Building a reordering library: Ecto.Multi vs Ecto.Changeset.prepare_changes?

Qqwy · April 30, 2017, 8:40pm

I have a question: I want to create a small library for Ecto that allows you to specify an ‘order’ column (type integer), which, together with some belongs_to relation determines the ordering of the ‘children’ in the relational database

Think order of cards in a card game, order of images in a photo gallery, order of songs in a playlist, etc.

The functionality I want to add is:

When creating a new item, add it to the end of its siblings.
When updating an item, if the ‘order’ (the actual column name should be specifyable of course) value has changed, update the other siblings accordingly, so the final order will stay correct.
(when the new order is lower, increment the order of all items between the new (inclusive) and the old order, when the new order is higher, decrement the order of all items between the old and the new (inclusive) order.
When deleting an item, ensure that no holes in the ordering appear

It seems to me that the proper place to add this behaviour is as functions that someone might call in their changeset pipelines.

Should I use Ecto.Multi for this? (And if so, how?)

Or should I use Ecto.Changeset.prepare_changes/2 or something similar?

Or something else entirely?

smpallen99 · April 30, 2017, 10:35pm

I wouldn’t try to do too much with prepare_changes. Its an older API. I talked with José a while back about prepare_changes vs Ecto.Multi and he recommended using Ecto.Multi for most cases.

In the ExAdmin ContactDemo project,
I proivde the ability to drag and drop categories in the index view.
I use a custom action to accomplish this. Not really what your asking for, but thought I would provide the example.

In the chat room application I’m building, I have a timestamp field with microsecond resolution which I use for sorting the order of posts. I have a function in my changeset that generates the timestamp for inserts. I know this does not satisfy your “no holes” requirement.

However, I would challenge that requirement. Do you really need no holes? Is that something required in your business domain? By using holes, you would reduce the need to update every row in the table on a order change. If you used the timestamp approach, you could easily pick create a new timestamp value half way between the two rows your inserting between. Given the microsecond resolution, this might provide lots of opportunity for single row updates. And if you do get a collision, you may be able to resolve it with a two row update.

I was going to provide an answer for the “no holes” solution, but then realized I don’t know if Repo.transaction is concurrency safe. Perhaps I’ll post that. I’ll update my answer soon.

Qqwy · May 1, 2017, 6:36am

Thank you, @smpallen99!

As for Repo.transaction, I believe it is concurrently safe (as long as the database adapter actually supports transactions, at least). I believe Postgres uses old-school pessimistic mutually exclusive locking to accomplish this.

In the ContactDemo project (thank you for the example!), it looks to me like you might be suffering from a N+1 queries problem, as a get and an update seems to be executed once per category.

As for the ‘no holes’: Until now, I was envisioning the order to be an integer column, for speed, conciseness, and because I am unsure how the user would otherwise indicate how to re-order something (That is, if I have a collection, and the user wants to move the seventh item to between the fourth and the fifth item, it makes sense to send reorder(7,5). I was envisioning this to be two queries:

One that increments/decrements all siblings’ orders between the old and new position.
One to insert/update the model itself, including its new order value.

The only rows that need to be updated are the ones that belong_to the same ‘parent’ object (i.e. only the photos in photo album #5), but of course this could still be many rows.

In your chat room application, do you ever reorder posts? Or are they fixed as soon as inserted?
Using timestamps with microsecond precision (or decimals with enough precision, which might be more appropriate) could work, but of course this takes a lot more room at some point the numbers will become too small and we’ll need to fix rounding problems by performing extra tricks.

As most applications (at least the one I am building) are read-heavy and not update-heavy, it seemed more appropriate to use a no-holes integer column.

seanwash · July 14, 2017, 5:21pm

Hey @Qqwy, have you made any progress on this? I’m looking to do pretty much the same thing!

Qqwy · July 14, 2017, 5:56pm

Hello @seanwash Yes, I made some progress on this.

To quickly answer the question I asked earlier: Ecto.Multi is the way to go.
I had quite a bit of discussion in the Elixir IRC group, because I tried to make this as perfect as possible. Turns out that because of the way transactions work in SQL, unless you set the transaction mode to ‘sequential’ (which means that your database slows down a lot), it was still possible for weird edge cases to happen.

So I am now using a different approach, which runs a group of three queries after one another that perform the required movement queries.

OrderedSiblings is not yet available in Hex (maybe I’ll have some time this weekend to smoothen the rough edges, but I cannot promise anything), but you can find the prerelease version on GitHub, if you want to check it out.

seanwash · July 14, 2017, 6:10pm

No rush on adding to hex unless you really want to. I’ll read through the code, I appreciate the response!