Why do we call the Schema's changeset/2 function with empty attributes to render a new form?

From the book Programming Ecto:

def new(conn, _params) do
  changeset = User.changeset(%User{}, %{})
  render(conn, changeset: changeset)
end

Why is the changeset/2 function of schema User called here to render a form the first time?
If I inspect the changeset, it will print something like:

#Ecto.Changeset<
  action: nil,
  changes: %{},
  errors: [
    name: {"can't be blank", [validation: :required]},
    age: {"can't be blank", [validation: :required]},
    # etcetera
  ],
  data: #MyApp.User<>,
  valid?: false
>

It looks like we tried to validate a changeset, but that doesn’t seem to make sense: nothing has been submitted yet, this is just the first render of the form.

=> These errors have been computed (and maybe even translated) for nothing.

So I thought there should be a better way: maybe add a function in the context that just calls Ecto.Changeset.change(%User{}) rather than the schema’s changeset/2 function with all its rules and validations.

Maybe it’s so that we can pre-populate the new form with some values, like this?

def new(conn, _params) do
  changeset = User.changeset(%User{}, %{"langage" => "fr"})
  render(conn, changeset: changeset)
end

You can always use whichever changeset you prefer to render the form. As long as the action is nil, the errors won’t show on the page.
Also, the time taken to cast and validate those fields is negligible.
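
To see the point about the action in isolation, here is a minimal sketch. It assumes Ecto is pulled in with Mix.install (the version requirement is only illustrative) and uses a hypothetical embedded schema, Sketch.ActionUser, so no database is needed. Ecto.Changeset.apply_action/2 stands in here for a failed Repo.insert/2, since both set the action on an invalid changeset.

```elixir
# Assumption: Ecto is fetched on the fly; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Sketch.ActionUser do
  # Hypothetical stand-in for the User schema (embedded, so no repo is required).
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :name, :string
  end

  def changeset(user, attrs) do
    user
    |> cast(attrs, [:name])
    |> validate_required([:name])
  end
end

# First render: validations ran, but nothing has set an action yet.
first_render = Sketch.ActionUser.changeset(%Sketch.ActionUser{}, %{})
IO.inspect(first_render.action, label: "action on first render")
IO.inspect(first_render.valid?, label: "valid? on first render")

# After a failed submit the action is set, which is what the form
# integration looks at before it displays errors.
{:error, submitted} = Ecto.Changeset.apply_action(first_render, :insert)
IO.inspect(submitted.action, label: "action after failed submit")
```

The errors list is identical in both changesets; only the action differs.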

The schema’s changeset function will typically build a changeset via cast/4. cast/4 is meant for casting and filtering external, untrusted data. A manually added default value like the one you have shown is trusted, so change/2 can be used instead.

So I still don’t see the reason for using the schema’s changeset function to add a default value and going through all the validations. You can just use change/2 and pass the default values as the second argument.
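
A small sketch of the difference, under stated assumptions: Ecto comes from Mix.install (illustrative version requirement), and Sketch.User with its name, age and language fields is a hypothetical stand-in for the User schema in the thread.

```elixir
# Assumption: Ecto is fetched on the fly; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Sketch.User do
  # Hypothetical schema; embedded so that no database is needed.
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :name, :string
    field :age, :integer
    field :language, :string
  end

  def changeset(user, attrs) do
    user
    |> cast(attrs, [:name, :age, :language])
    |> validate_required([:name, :age])
  end
end

alias Sketch.User

# Goes through cast/4 and validate_required/2, so errors are computed.
validated = User.changeset(%User{}, %{})

# Trusted default: no casting, no validation.
plain = Ecto.Changeset.change(%User{}, %{language: "fr"})

IO.inspect(validated.valid?, label: "changeset/2 valid?")
IO.inspect(Keyword.keys(validated.errors), label: "changeset/2 error keys")
IO.inspect(plain.errors, label: "change/2 errors")
IO.inspect(plain.changes, label: "change/2 changes")
```

Note that change/2 takes atom keys and does no type casting, which is exactly why it is only suitable for trusted values.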

I’m not worried about performance. It just makes little sense to me to do it that way, even though it seems to be common practice in codebases.

I was thinking about optional data that we want the user to see in the form but that they can replace or clear before submitting: in that case we cannot really trust its content in the end. It’s not really a default value like one set with change/2, just a suggested value.

I think the purpose is probably to point out that we can do something like that. Of course, one could default the attrs param to %{} to avoid passing it each time.

The point here is mostly that it’s easier to build the form integration on a single type of data, the changeset, instead of having to differentiate on every second line between an empty form, a form where data was submitted but had errors, and a form that already has values meant to be edited. The performance cost of running validations is probably negligible for most changesets, but you could also just use Ecto.Changeset.change(%User{}) if you want to skip validation.


I’ll share with you why I want to differentiate the different changeset functions according to their use (render empty form vs new submitted data vs update data).

I have schemas for which I need to compute data before inserting. If I just write a typical changeset/2 function:

def changeset(foo, attrs) do
  foo
  |> cast(attrs, @required_fields ++ @optional_fields)
  |> validate_required(@required_fields)
end

it will not be sufficient for inserting data, because some fields need to be computed using other schemas and the repo, and those values need to be added to the changes. Of course I can do that in the context:

%Foo{}
|> Foo.changeset(attrs)
|> add_some_computed_value_in_the_changeset()
|> add_some_other_computed_value_in_the_changeset()
|> Repo.insert()

However I don’t like this solution too much, because another developer might look at the changeset/2 function inside the Foo schema and think that’s all he needs to insert data:

%Foo{}
|> Foo.changeset(attrs)
|> Repo.insert()

The above will fail. So I prefer to have a clearer API:

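def changeset_for_creating(
      foo,
      attrs,
      fun_add_some_computed_value_in_the_changeset,
      fun_add_some_other_computed_value_in_the_changeset
    ) do
  foo
  |> cast(attrs, @required_fields ++ @optional_fields)
  |> validate_required(@required_fields)
  |> fun_add_some_computed_value_in_the_changeset.()
  |> fun_add_some_other_computed_value_in_the_changeset.()
end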

Here I require the developer to pass two functions, which makes it clear that foo needs these two computed values before a new foo can be inserted into the db. The argument names are very descriptive and act as documentation. It may seem useless, but the API is clearer now and the application becomes easier to understand than if you had to browse through the context to find out what’s happening. To stay consistent across my application, I then avoid the changeset/2 function and favor more specialized functions like the one above.

This is a perfectly fine solution, and quite common in larger applications. I tend to simplify the names and have SomeModule.new() which returns the changeset for a new value, and then I keep changeset for changes. Same principle though.
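
A rough sketch of that naming scheme, assuming Ecto via Mix.install (illustrative version requirement). Sketch.Foo and its title field are hypothetical; only the new/0 plus changeset/2 split comes from the reply above.

```elixir
# Assumption: Ecto is fetched on the fly; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Sketch.Foo do
  # Hypothetical schema; embedded so that no database is needed.
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :title, :string
  end

  # Changeset for a brand-new value, e.g. to render an empty form.
  # No casting and no validation, so no errors are computed.
  def new, do: change(%__MODULE__{})

  # Changeset for submitted (untrusted) params.
  def changeset(foo, attrs) do
    foo
    |> cast(attrs, [:title])
    |> validate_required([:title])
  end
end

empty_form = Sketch.Foo.new()
submitted = Sketch.Foo.changeset(%Sketch.Foo{}, %{})

IO.inspect(empty_form.errors, label: "new/0 errors")
IO.inspect(Keyword.keys(submitted.errors), label: "changeset/2 error keys")
```

Both functions return an Ecto.Changeset, so the form code does not need to care which one produced it.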

Why is the changeset/2 function of schema User called here to render a form the first time?

  • Ecto.Changeset implements the Phoenix.HTML.FormData protocol (this is pointed out on p.68 of the beta 9 release of Programming Phoenix ≥ 1.4, and also at the bottom of the page you reference (p.132) in Programming Ecto).
  • To be able to “serve” that protocol the changeset has to be tied to the schema structure (in your case #MyApp.User<>).

So I prefer to have a more clear API


refers to the code from that section (p.132) of Programming Ecto

# Accounts
def create_user(attrs \\ %{}) do
  %User{}
  |> User.changeset(attrs)
  |> Repo.insert()
end

is supposed to be that clear API.

Also have a look at what Repo.insert does:

  defp do_insert(repo, name, %Changeset{valid?: true} = changeset, opts) do
    # ...
  end

  defp do_insert(repo, _name, %Changeset{valid?: false} = changeset, opts) do
    {:error, put_repo_and_action(changeset, :insert, repo, opts)}
  end

By extension, functions that run after validation, like your fun_add_some_computed_value_in_the_changeset and fun_add_some_other_computed_value_in_the_changeset, should have two clauses: one for %Changeset{valid?: true} that does the actual processing, and another for %Changeset{valid?: false} that simply returns the original (invalid) changeset.

That is Railway Oriented Programming (ROP) as practiced with the Elixir pipe operator.
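
A minimal sketch of that two-clause shape, assuming Ecto via Mix.install (illustrative version requirement). Sketch.Computed, its title and slug fields, and put_slug/1 are hypothetical names; the slug stands in for whatever value really has to be computed.

```elixir
# Assumption: Ecto is fetched on the fly; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Sketch.Computed do
  # Hypothetical schema; embedded so that no database is needed.
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :title, :string
    field :slug, :string
  end

  def changeset(foo, attrs) do
    foo
    |> cast(attrs, [:title])
    |> validate_required([:title])
  end

  # Valid changeset: do the actual work and add the computed value.
  def put_slug(%Ecto.Changeset{valid?: true} = changeset) do
    slug =
      changeset
      |> get_field(:title)
      |> String.downcase()
      |> String.replace(" ", "-")

    put_change(changeset, :slug, slug)
  end

  # Invalid changeset: pass it along untouched.
  def put_slug(%Ecto.Changeset{valid?: false} = changeset), do: changeset
end

alias Sketch.Computed

ok =
  %Computed{}
  |> Computed.changeset(%{"title" => "Hello World"})
  |> Computed.put_slug()

failed =
  %Computed{}
  |> Computed.changeset(%{})
  |> Computed.put_slug()

IO.inspect(ok.changes, label: "valid changeset changes")
IO.inspect(failed.changes, label: "invalid changeset changes")
```

The pipeline never has to branch: the invalid changeset simply rides through to the end, where Repo.insert would return it as {:error, changeset}.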
