Using Repo inside schemas

tjdam · April 24, 2020, 3:36am

I noticed I was repeating a lot of code by checking the repo for an existing entry before doing update operations, so instead I created this “update_changeset” that starts with a case instruction to look the repo for that entry, return it if it does or return an empty struct if it doesn’t and pipe this to the “standard” changeset that casts and validates the inputs:

def update_changeset(attrs) do
  case Repo.get_by(__MODULE__, attrs.id) do
    nil -> <do something>
   thing -> thing
  end
  |> changeset(attrs)
end

I would like to know if there’s any objection to do it like this. I’m asking because every example I saw before, people do that check in functions outside the schema. Bringing it to the a changeset seemed to me like an obvious solution to no repeat this code too often but because I’ve never seen it before I’m thinking that there might be something bad about it…

Thanks!

polygonpusher · April 24, 2020, 3:55am

I think you only end up hiding this logic as a result and its really the responsibility of the controller to first figure out if the entity in question is even real. Also there is a distinction between Changeset and Schema. In your example you are in the context of the changeset not the schema, though they both happen to typically live in the same module.

IMO you are better off doing the look ups closer at the boundaries and failing there so that the code is closer in proximity to the routers to which they relate. The changeset on the other hand assumes you are only ever changing a thing. It really should not be concerned with finding that thing too. You only end up having to bubble back up the response to the controller any way given it was not found which creates more abstractions.

Personally I’m not a fan, but to each their own.

tjdam · April 24, 2020, 5:16am

This is a great point, I’m still thinking of things in terms of files and modules as opposed to a more abstract idea of context, thanks for pointing it out!

My reasoning is that this kinda fits in the casting/validating part of a changeset. The fact this check conforms seeks to conform to a particular standard giving you one thing if the entry exists and another (a blank slate) if it doesn’t is a change in the sense that a cast is, no?

polygonpusher · April 24, 2020, 6:05am

Think in terms of single concerns. How many concerns does the controller have?

Is this request correct? What validates this request as correct?

Now, think in terms of the changeset.

Is this changeset correct, what validates this changeset as correct?

If you make looking up the record in question you rob the controller of probably the only thing its good for and move that concern to something that is already concerned with something else.

domvas · April 24, 2020, 8:14am

Let’s say your schema is User
You can split for functions in 2:

# in user.ex
def update_changeset(struct, params) do
  changeset(struct, params)
end

# in another module
def save(%User{} = user, params) do
  case Repo.get_by(User, params.id) do
    nil -> <do something>
    thing -> thing
  end
  |> User.changeset(params)
  |> Repo.update()
end

Then you have everything:

a clean separation of concerns (schema module is just about schema and data validation + a module dedicated to database operation)
you don’t repeat yourself as the new module will be call by any module/controller who wants to update a User

al2o3cr · April 24, 2020, 3:51pm

Most applications I’ve worked on wouldn’t find this useful, because before they can update a record they need to check that the current user is authorized to do so - and that usually requires loading the record from the database.

polygonpusher · April 24, 2020, 7:33pm

This is why it’s useful to keep this logic closer to your boundary. This is really the main issue I take with that approach, not in that trying to abstract the solution is an issue but where it happens. My mental model would have me think to look in the controller first because it’s really the gate keeper of the request. I would assume by the time I’m in deeper in the stack that the question of if the request was valid should have already been established.

polygonpusher · April 24, 2020, 7:37pm

But again with this you still have to bubble back up the results to the place that really cares about this and that is the controller, so why not just do it there first rather than hide this logic deeper in the stact? Its not like the controller is overloaded if you are using context correctly any how.

domvas · April 25, 2020, 8:13am

I think we all agree but that using User as module name misleads you on my intention:

schema -> data description and validation
context -> data persistence
authorization module: authorization process
controller -> calls authorization module to authorize user
-> calls context to persist user

But imo this also means that there is absolutely not a single Repo.? call from controller and yes, most of the time, controllers are very light.

djm · April 25, 2020, 9:12am

I noticed I was repeating a lot of code by checking the repo for an existing entry before doing update operations.

I don’t know enough about your specific situation to say whether it would be helpful but are you aware of PostgreSQL’s UPSERT syntax? If it could help here then Ecto has nice support for it. If you’re doing anything more complex than “update record if it exists, else create a new one” then it won’t fit - but if it does, then pushing this check to the database level can be quite a nice way to solve the problem.

tjdam · April 28, 2020, 2:00pm

I think I probably went this road cause the application I’m workin on is API only, no controllers. I’m using GraphQL/Absinthe, and It doesn’t seem like it’s that the resolvers should be dealing with this kind of logic… but you do have a good point.

So, ideally, only the context should have access to the repo?
I can actually see the benefit of that…
In your previous answer, that save function, where would you put it in a Phoenix application?

polygonpusher · April 28, 2020, 5:33pm

Ah ok, that’s different. With Absinthe it is better off moving that to the context and in which case the context becomes the boundary.

And yeah only the context should have access to the repo. But, there are cases where the data loader breaks some of these conventions. That’s the tradeoff. Most of all of this stuff is really just about thinking about the domain and boundaries. “What if I swap out ecto with something else?”, “what if I swap out Absinthe with something else?”. Should that change your apps domain and API?