Chaining changeset function calls causes error duplication. Shouldn't this be improved?

I was just trying to re-use a changeset between LiveView handle_event/3 calls and stumbled on behavior that was, from my point of view, a bit unexpected when chaining changeset function calls.

It seems that validation errors are not set/reset consistently. There are at least the following two cases that are not what I would expect:

A) When a changeset function is run with an invalid value, an error is added. When the changeset function is run again, another error is added that is equal to the previous one, a duplicate. When the changeset function is run yet again, no further error is added; the number of duplicate errors is somehow capped at two.

B) When the changeset function is run with an invalid value and then with a valid one, the error is not cleared. And when it is run yet again with an invalid value, a duplicate is added.

Take this example schema with a changeset function:

defmodule Foo do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key false
  embedded_schema do
    field :foo, :string
  end

  def changeset(data, attrs) do
    data
    |> cast(attrs, [:foo])
    |> validate_inclusion(:foo, ["foo"])
  end
end

When (for case A) I run the following:

%Foo{}
|> Foo.changeset(%{"foo" => "not foo"})
|> tap(&IO.inspect(%{changes: &1.changes, errors: &1.errors}, label: "Should now have one error on foo"))
|> Foo.changeset(%{"foo" => "not foo"})
|> tap(&IO.inspect(%{changes: &1.changes, errors: &1.errors}, label: "Should still have one error on foo?"))
|> Foo.changeset(%{"foo" => "not foo"})
|> tap(&IO.inspect(%{changes: &1.changes, errors: &1.errors}, label: "Should still have one error on foo?"))

I get this behavior:

Should now have one error on foo: %{
  errors: [foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]}],
  changes: %{foo: "not foo"}
}
Should still have one error on foo?: %{
  errors: [
    foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]},
    foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]}
  ],
  changes: %{foo: "not foo"}
}
Should still have one error on foo?: %{
  errors: [
    foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]},
    foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]}
  ],
  changes: %{foo: "not foo"}
}

When (for case B) I run:

%Foo{}
|> Foo.changeset(%{"foo" => "not foo"})
|> tap(&IO.inspect(%{changes: &1.changes, errors: &1.errors}, label: "Should now have one error on foo"))
|> Foo.changeset(%{"foo" => "foo"})
|> tap(&IO.inspect(%{changes: &1.changes, errors: &1.errors}, label: "Should now have no error on foo?"))
|> Foo.changeset(%{"foo" => "not foo"})
|> tap(&IO.inspect(%{changes: &1.changes, errors: &1.errors}, label: "Should now have one error on foo?"))

I get:

Should now have one error on foo: %{
  errors: [foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]}],
  changes: %{foo: "not foo"}
}
Should now have no error on foo?: %{
  errors: [foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]}],
  changes: %{foo: "foo"}
}
Should now have one error on foo?: %{
  errors: [
    foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]},
    foo: {"is invalid", [validation: :inclusion, enum: ["foo"]]}
  ],
  changes: %{foo: "not foo"}
}

After looking around here on the forum and on GitHub, I came across some related pages.

The GitHub issue even has a commenter suggesting a solution in the form of a helper function that is to be called in your changeset and removes any errors for fields that changed, prior to running other validations.

José mentions in the ticket that it would be a breaking change to remove errors when a validation is run again.

It would seem to me, though, that in the context of chaining/piping the changeset function (e.g. in a LiveView where the changeset is assigned to the socket), it can be intuitive for the errors to be cleaned up along the way somehow (much like the helper function in the ticket attempts to do).

Maybe it could be useful to adjust cast/4 somehow so that, given a changeset, the errors pertaining to the changed fields are cleared?

Is this a strange train of thought I am having? Am I (over) thinking this in a (maybe completely wrong) way? :sweat_smile:

The way I think of changesets is that they represent a single action to be performed at some point in the future. After that action has occurred—that is, as soon as the :action key is no longer nil—that changeset has served its purpose. I do agree that the act of validation is not on the same plane as insert/update/delete, but it does stay consistent with the idea that you are left with a data structure that represents exactly what happened at a single point in time. I believe it’s a simpler way to think about things.

While not quite the same, since it was objects, I remember feeling the opposite way in Rails when I learned that calling record.valid? would cause errors to be removed! It felt wrong in more ways than one, but it’s the best counterexample I have. Like, if we could explicitly call remove_errors, that would be one thing, but it’s really cheap to create a new changeset.

6 Likes

I think the viewpoint that errors should be removed is a “mutable state” mindset, which you might get from someone used to OOP. The idea being that there is one “blessed” object representing our record, and one “blessed” changeset representing whether it’s valid, and you have to mutate those objects accordingly.

This is not how things work in Elixir - we have immutability. Every time you change anything you get a new “thing” back. So each “thing” (here, each changeset) is no longer blessed.

If you want a new changeset, make a new changeset! It is not any more special than the last changeset, or the next changeset. Your validators are referentially transparent; if you run the same things through them, you get back an identical changeset!

The only thing which is blessed is the row in your database.

3 Likes

The problem is: when does a “new” validation begin?

cs
|> validate_length(:a, is: 1)
|> validate_length(:b, min: 2, max: 5)

If you’d reset the errors on each validation, then you would never see any errors from the validation of :a…

I would agree though that it is a pity that Ecto provides no reset_errors/1, and that each user has to implement it on their own.
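A minimal sketch of what such a helper might look like (reset_errors/1 is hypothetical, not part of Ecto; this version simply drops all accumulated errors):

```elixir
# Hypothetical helper -- not provided by Ecto. Drops all accumulated errors
# and marks the changeset valid again, so a fresh validate_* pipeline can
# start from a clean slate.
def reset_errors(%Ecto.Changeset{} = changeset) do
  %{changeset | errors: [], valid?: true}
end
```

You would call it at the top of your changeset function, before any validate_* calls, so that errors from earlier runs don’t accumulate.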

1 Like

Ever proposed it?

1 Like

That’s why I thought it might be feasible to do this in cast/4, one of the usual entry points for creating changesets, right where most changeset-function-based validation flows ought to start, at the top of the changeset function (often in a schema module).

Cleaning errors there would mean that one still has full freedom to layer errors within the changeset as one wishes. It wouldn’t change how any validate_* functions are used or what the end result of the whole schema validation would be, right?

It seems that this way it would only affect the case where a changeset is piped into the same changeset function that created it (or another one with overlapping field validations, of course), possibly improving the ergonomics around changeset (re)use. (I know… pedantically speaking it would not literally be re-using the changeset, as we have immutability by default here :joy:)

This is not that simple. Changeset functions don’t just create errors. They also create validations, required statuses, prepare callbacks, and so on. None of these link back to the code that added them. Say you have a validation requiring a :name and a subsequent one not requiring a :name. The intention could be either to no longer require a :name, or to merge requirements and retain the required status.

In the end, the changeset abstraction was never built for “restarting” the validation process. You can always start from scratch with MyData.changeset(changeset.data, new_params)
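In a LiveView, starting from scratch could look roughly like this (the event name and assign are assumptions for illustration, and Foo is the example schema from above):

```elixir
# Sketch, assuming the previous changeset is assigned to the socket.
# Only changeset.data (the original struct) is reused -- not the changeset
# itself -- so no stale errors carry over between events.
def handle_event("validate", params, socket) do
  changeset =
    socket.assigns.changeset.data
    |> Foo.changeset(params)
    |> Map.put(:action, :validate)

  {:noreply, assign(socket, changeset: changeset)}
end
```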

4 Likes

And the fact that cast could be seen as an entry point for a consistent validation and changeset outcome does not change that picture for you?

I mean, the proposed helper function from the linked ticket:

defp maybe_delete_previous_errors(%Ecto.Changeset{valid?: true} = changeset), do: changeset
defp maybe_delete_previous_errors(%Ecto.Changeset{changes: changes, errors: errors} = changeset) do
  changed_values_keys = Map.keys(changes)
  remaining_errors = Enum.reject(errors, fn {key, _} -> key in changed_values_keys end)
  %{changeset | errors: remaining_errors, valid?: remaining_errors == []}
end

when used after cast, or directly before calling changeset/2 again, does result in an arguably somewhat more consistent end result (e.g. when taking the changeset from a LiveView socket and passing it into its changeset function again in handle_event/3). Though I think that valid? maybe shouldn’t be changed if there are any cases where a changeset is ever meant to be invalid without errors.

I know that all those functions can also be called outside a changeset function in a schema module… they are all just functions, after all.
But only cleaning up errors on the fields that cast determined to have changed, given the provided data, seems like a good middle ground to me.

About your examples where you think it could be troublesome,

They also create validation, required statuses, prepare callbacks and so on

would you mind sharing some examples, as I am not sure what you are referring to there?

As for

Say you have a validation requiring a :name and a next one not requiring a :name. The intention could be both to no longer require a :name or to merge requirements and retain the required status.

When those validations happen after the cleanup (proposed to maybe be handled by cast/4), this would still work, wouldn’t it?
Of course, with multiple changeset functions that have overlapping field validations but different intents, each only handling part of a schema, and both using cast, only the last error pertaining to an overlapping changed field would be kept.

On the other hand, cast explicitly allows for passing changesets and is mentioned in the docs as one of the usual entry points for creating changesets, so it still feels sound somehow. And then again there are also things like add_error that could be used for validating interdependent fields, but that could maybe cause problems/inconsistencies with this approach (though only whenever the changeset function flow is called again on the changeset, I guess).

But of course I could be missing some other obvious cases where this would be a problem…?

Regarding starting from scratch with MyData.changeset(changeset.data, new_params): it is what I usually do, and in LiveView it works.

Fiddling around with handling partial updates in separate events is what got me trying to pipe the previous changeset into the changeset function again (taking an embedded schema on mount or handle_params, deriving a changeset, and then trying to get a new changeset from the new form params on the handle_event change for the form, in that particular case).
So basically the LiveView would represent the UI for the user’s latent attempt to change the data. Piping the current changeset into the changeset function again, to get a new changeset whose validation state matches the new form params, just felt like another reasonable thing to try, instead of creating a new changeset at every step by reaching into the changeset data and making sure the params are handled correctly. But this then led to duplicated errors in the form, and inconsistent ones at that (for example, why does it top out at 2 errors per field for a given validation?). And there was no way to clear them by providing valid input, which led to the OP.
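For illustration, the piping attempt looked roughly like this (event and assign names simplified; Foo is the example schema from the OP):

```elixir
# Roughly what I was doing: piping the previous changeset back into the
# changeset function on every form change. cast/4 accepts a changeset as
# its first argument, so this compiles and runs...
def handle_event("change", params, socket) do
  changeset =
    socket.assigns.changeset
    |> Foo.changeset(params)
    |> Map.put(:action, :validate)

  # ...but as shown in the OP, errors accumulate across events (up to two
  # duplicates per field) instead of being cleared on valid input.
  {:noreply, assign(socket, changeset: changeset)}
end
```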

No, just came to me while reading (and commenting on) this thread… :wink:

1 Like

What’s wrong with simply re-creating it?

This allows you to compose changesets as you build them up.

I wouldn’t be opposed to an explicit remove_errors in theory, but I’ll say again that I don’t quite understand the pain point of just recreating them from scratch each time. Just with the amount of discussion here, that is so much easier to reason about.

There’s also what @garrison was getting at: if you keep working on a changeset, you’re just fuelling the fire of “always working on stale data”, which is a general problem on the web.

I’m not saying I’m right here, but just curious as I’ve never felt this pain and having trouble understanding the real pain point here.

EDIT: wouldN’T be opposed to an explicit remove_errors

3 Likes

No, because it’s not an entry point. You could be calling cast/4 multiple times, or you could never call it – e.g. I often use change/2 with internal data. I’ve even used a combination of both.

https://hexdocs.pm/ecto/3.12.5/Ecto.Changeset.html#validations/1

Opposite to other validations, calling this function does not store the validation under the changeset.validations key. Instead, it stores all required fields under changeset.required.

https://hexdocs.pm/ecto/3.12.5/Ecto.Changeset.html#validate_required/3

I’ve written a whole blog post about this at length: One-to-Many LiveView Form | Benjamin Milde

I don’t think it makes sense to shoehorn this use case into changesets tbh. For LV I think we’d want a completely new format implementing Phoenix.HTML.FormData in a way where updating changes piecemeal and over time is an explicit feature. Heck, it could even just “merge” the known params with new sets of params and, in the background, call the changeset function again with the existing data and the merged map of params:

my_changeset_function(
  last_changeset.data,
  Map.merge(last_changeset.params, to_change)
)

The tricky bit is that at the params level you also need to deal with invalid data (like “abc” coming in on an integer field), and we really want a more high-level API for adjusting the “state” of the form.

3 Likes

@sodapopcan, @garrison, @NobbZ, @BartOtten, and @LostKobrakai, thank you all for chiming in! Quite validating (excuse for the pun) and insightful to hear your opinions on this one!

I really liked hearing your viewpoints, and I agree that explicitly constructing a new changeset over and over again is not that big of a deal (and is how I would normally do it), though I think a few lines could be saved, and it still feels like just another transformation that would otherwise happen somewhere else. Then again, there are a lot of complexities that would have to be considered and taken into account, and in the end it is maybe not worth the hassle.

So thank you again for your time and thoughts :slight_smile:

And with that, I hereby announce that I will return to my cave :smile:

4 Likes