Delete my file before deleting the Ecto entry on association

augnustin · November 26, 2020, 9:51am

So I have this Resume module, which stores infos about my attached files, stored in my object storage. This module belongs_to the User module.

In my admin, I let my team delete Users, but if there is a Resume attached, it will fail.

I know I could set the on_delete: :delete_all option, which would delete all Resume DB entries, but it won’t delete those files in my object storage.

How can I make sure a specific method is called when Repo.delete(%User{}) is called?

It seems like Ecto.Model.Callbacks used to exist, but doesn’t any more, as well as on_delete: :fetch_and_delete.

But I can’t believe there is no solution for my case.

Thanks

sfusato · November 26, 2020, 11:14am

Have a look at Ecto.Multi or Repo.transaction. Sounds like what you’re looking for.

benwilson512 · November 26, 2020, 1:11pm

Ecto model callbacks were removed in no small part because people like to use them for this sort of situation, but they are woefully bad at handling this situation. Suppose a user has uploaded 100 files, and it takes ~200ms to delete each file on S3 (pretty normal S3 response time on a bad day). If you did the file deletion call to S3 in a callback deleting the user would take > 20 seconds which is a long time to hog a database connection and could easily end up timing out the transaction, resulting in inconsistencies between the database and S3.

What we do in these cases is:

Step 1) Mark the user as “deleted” via a deleted_at, allowing you to filter them out.
Step 2) Enqueue a job to actually go delete the user.
Step 3) This job goes through each file one at a time, deletes the record on S3, then deletes the database row. Doing this asynchronously allows for S3 outages or other issues along the way to be OK, since you can just re-run the job.
Step 4) At the end of the job, actually delete the user record.

This avoids hogging database connections, and takes on deleting external files in an idempotent way so that any errors along the way don’t result in problematic, inconsistent state.

As a final note:

But I can’t believe there is no solution for my case.

These sorts of remarks are… unhelpful. Everyone here is helping you for free, all of the code you’re using is free. Few things suck the fun out of helping people like exasperation that the solution didn’t fall into your lap.

LostKobrakai · November 26, 2020, 2:47pm

This is imo an especially important point. If things fail for this step you want people to be notified and then to be able to debug the issue.

augnustin · November 26, 2020, 4:31pm

Thanks @benwilson512 for your answer. Apologies if you felt like my tone sounded exasperated: I simply was surprised not to find any previous topic on this while it feels to me like a pretty common pattern.

That said, I have been using the before_destroy callbacks from rails and did experience how painful it could be. I would definitely not recommend it for a front user action for sure.

But in this case, to further the discussion, it is an admin one, so I know what’s happening, delay is not a big deal (admins don’t delete users every day), and not allowing a feature because it may take 20s+ sounds to me like an early optimization.

I’d have more been in favor of considering the feature devil, to use with care and warnings instead of completely removing it.

What you suggest would definitely work, but feels like it is way too much work for my use case. Especially the whole deleted_at scoping logic which has also brought me troubles in the past. And sending tasks to the background gives me less feedback on what’s going on, which I’m not comfortable with.

Regarding:

I can’t see how making the task background makes it more consistent: there still is a risk that something wrong happens between S3 deletion and DB deletion, which brings inconsistent sate.

In the end, I think I will go for a purging routine: loop on all the stored files, and check if they are stored in my DB, otherwise remove it. This is certainly not the most optimized solution, but at least it guaranties me state consistency between storage and DB and anyway, I’ll have to write it as my current data is already inconsistent.

Thanks, don’t hesitate to further the discussion, I promise there is no negative feeling in my writing.

al2o3cr · November 26, 2020, 4:53pm

Timing issues aside, the key mental shift to building this in Ecto is to stop thinking of Repo.delete(%User{}) as the same thing as “deleting a user”.

That deletes a row from the database, but there is more involved in performing the logical / domain action “delete this user” than that.

Your context should implement that - so for instance you’d call Users.delete(%User{}) and that function would handle both removing the database entries and cleaning up the files.

tfwright · November 26, 2020, 5:43pm

This is really well put, I agree one hundred percent.

It might feel like it is “extra work” at first, but it’s extra work you only have to do if your app has extra complexity, and it promotes a design that makes that complexity explicit, easy to see, test and debug instead of hiding it away as if a piece of core business logic is a mere side effect.

benwilson512 · November 26, 2020, 6:29pm

Yes I second @tfwright’s comment that this is a very key insight. In Ecto, the a particular ecto schema is NOT supposed to encode all workflows relevant to that schema. Callbacks are almost always about managing workflows, and it was decided that the right way to do work flows is dedicated functions that layout that workflow.

However I’m not really trying tor rehash that question, it is long since settled at least as far as Ecto is concerned. @tfwright’s comment highlights how to move forward: have your Accounts.admin_delete(user) function simply do what you want to do. That’s the way to encode workflows, not with callbacks.

@augnustin as for your concerns about jobs, I suppose that boils down to whether your app is already making use of background work and how comfortable you are with them. We have very solid reporting around job failures so, in the event that one occurs, it isn’t going to be missed and we can compensate accordingly. The advantage of the approach I articulated is that, regardless of whether the files are actually deleted yet or not, the system can carry on as if the user is actually deleted, which is nice. Then the “cleanup” so to speak can carry on asynchronously. Jobs create an easy to re-run operation. However if you aren’t currently using them I agree it isn’t worth it to add them just for this.

guaranties me state consistency between storage and DB and anyway,

To be clear, there is no possible way to guarantee state consistency since you can’t create a transaction between external storage and your database. This is the value of the “deleted_at” flag. It at least lets you know “the file that this record points to may or may not exist”. If your app is already written to not guarantee that the file actually exists for a row, then you’re good to go. I’m just pointing out that no matter which order you delete things in, storage first, or database first, there are scenarios where they get out of sync, and without a flag or job or w/e it’s up to your admins to remember to try again manually.