How to normalize function params

fceruti · November 25, 2021, 12:25pm

Hey everyone!

In his talk Clarity at ElixirConf EU 2021, Saša Jurić mentions a way of architecting your code around a function he calls normalize/2, but sadly, he says he doesn’t have time to share it.

I want to implement this programming style, but I’m missing this key ingredient. Do you know of any libraries or gists that accomplish this? Thanks

def register(conn, params) do
  schema = [
    email: {:string, required: true},
    password: {:string, required: true},
    date_of_birth: :datetime,
    # ...
  ]

  with {:ok, params} <- normalize(params, schema),
       {:ok, user} <- MySystem.register(params) do
    # respond success
  else
    {:error, reason} ->
      # respond error
  end
end

Update: in case it’s not clear, I’m looking for a way to check the existence and type of all the fields specified in schema.

vrcca · November 25, 2021, 12:31pm

I believe this is what you’re looking for: Towards Maintainable Elixir: The Core and the Interface | by Saša Jurić | Very Big Things | Medium

From my understanding, he just extracted the previous version of register into the normalize function.

stefanchrobot · November 25, 2021, 12:53pm

As mentioned in the article, the author suggests using Ecto’s schemaless changesets for this. I think the normalize function can be written in a pretty generic way.

fceruti · November 25, 2021, 3:59pm

Thanks, @vrcca and @stefanchrobot, for your pointers. I was able to recreate this function. I’ll leave it here for anyone who may find this post later.

If you see any way I can improve this code, please let me know.

defmodule Timetask.Helpers.Normalizer do
  import Ecto.Changeset

  @doc """
  Normalizes and validates that `params` is formed according to `schema`.

  ## Examples

      iex> Timetask.Helpers.Normalizer.normalize(%{
      ...>     name: {:string, required: true},
      ...>     description: :string,
      ...>     count: {:integer, default: 10}
      ...>  }, %{name: "only required field"})
      {:ok, %{count: 10, name: "only required field"}}

      iex> normalized_params = Timetask.Helpers.Normalizer.normalize(%{
      ...>     name: {:string, required: true},
      ...>     description: :string,
      ...>     count: {:integer, default: 10}
      ...>  }, %{description: "has no name"})
      ...> {:error, %{errors: errors}} = normalized_params
      ...> assert Keyword.has_key?(errors, :name)

      iex> Timetask.Helpers.Normalizer.normalize(%{
      ...>     name: "I'm a string"
      ...>  }, %{name: "Use atom or tuple"})
      ** (ArgumentError) Bad formed schema

  """
  @spec normalize(map, map) :: {:error, Ecto.Changeset.t()} | {:ok, map}
  def normalize(%{} = schema, %{} = params) do
    required_fields =
      Enum.reject(schema, fn {_, v} ->
        with true <- is_tuple(v),
             {_, opts} = v,
             true <- Keyword.has_key?(opts, :required) do
          false
        else
          _ -> true
        end
      end)
      |> Enum.map(fn {k, _} -> k end)

    defaults =
      schema
      |> Enum.filter(fn {_, v} ->
        with true <- is_tuple(v),
             {_, opts} = v,
             true <- Keyword.has_key?(opts, :default) do
          true
        else
          _ -> false
        end
      end)
      |> Enum.into(%{}, fn {k, {_, opts}} ->
        {k, Keyword.get(opts, :default)}
      end)

    normalized_schema =
      schema
      |> Enum.map(fn {k, v} ->
        cond do
          is_tuple(v) -> {k, elem(v, 0)}
          is_atom(v) -> {k, v}
          true -> raise(ArgumentError, "Bad formed schema")
        end
      end)
      |> Enum.into(%{})

    {defaults, normalized_schema}
    |> cast(params, Map.keys(normalized_schema))
    |> validate_required(required_fields)
    |> apply_action(:insert)
  end
end

sasajuric · November 25, 2021, 7:30pm

Yes, on both counts. I can’t share the code (it’s owned by the clients), but the basic take is something like:

def normalize(data, types) do
  {%{}, types}
  |> Ecto.Changeset.cast(data, Map.keys(types))
  |> Ecto.Changeset.apply_action(:insert)
end

This is probably not enough (e.g. you may want to handle nils and missing keys), but it’s a solid start.
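For anyone who wants to try it, here is a minimal standalone sketch of that idea with one possible way to handle nils and missing keys. The module name NormalizeSketch and the extra `required` argument are assumptions made for illustration, not part of the snippet above.

```elixir
# Standalone script; Mix.install/1 needs Elixir 1.12+ and network access.
Mix.install([{:ecto, "~> 3.0"}])

defmodule NormalizeSketch do
  import Ecto.Changeset

  # `types` maps field names to Ecto types.
  # `required` (hypothetical argument) lists fields that must be present and non-nil.
  def normalize(data, types, required \\ []) do
    {%{}, types}
    |> cast(data, Map.keys(types))
    # Missing keys, nils and empty strings all fail validate_required/2.
    |> validate_required(required)
    |> apply_action(:insert)
  end
end

types = %{email: :string, age: :integer}

# String keys (as in Phoenix params) are accepted and values are cast.
{:ok, %{email: "a@b.c", age: 42}} =
  NormalizeSketch.normalize(%{"email" => "a@b.c", "age" => "42"}, types, [:email])

# A nil value is treated like a missing key.
{:error, changeset} = NormalizeSketch.normalize(%{"email" => nil}, types, [:email])
true = Keyword.has_key?(changeset.errors, :email)
```

On failure the caller gets the changeset back, so the usual error helpers can be applied to it.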

stefanchrobot · November 25, 2021, 7:49pm

Just a few remarks if you don’t mind:

Enum.reject(schema, fn {_, v}

I’d go with something longer and less generic than k (key?) and v (value?). Maybe {field, schema}? Plus in this case I think it makes sense to use {_k, v} to explain the meaning of the first element.

cond do
  is_tuple(v) -> {k, elem(v, 0)}
  is_atom(v) -> {k, v}
  true -> raise(ArgumentError, "Bad formed schema")
end

Using more pattern matching would be more idiomatic:

case v do
  {type, _opts} when is_atom(type) -> {k, type}
  type when is_atom(type) -> {k, type}
  _ -> raise ArgumentError, "bad schema: #{inspect(v)}"
end

And one more thing:

schema |> Enum.map(fn {k, v} -> ... end) |> Enum.into(%{})

can be replaced with:

Map.new(schema, fn {k, v} -> ... end)
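Put together, those remarks could look roughly like the sketch below. The module name SchemaSketch and the helper names are made up for illustration; the error message follows the case example above.

```elixir
defmodule SchemaSketch do
  # Turns %{field => type | {type, opts}} into %{field => type}.
  def field_types(schema) do
    Map.new(schema, fn {field, type_spec} -> {field, type!(type_spec)} end)
  end

  # Function-head pattern matching replaces the cond/is_tuple/elem checks.
  defp type!({type, opts}) when is_atom(type) and is_list(opts), do: type
  defp type!(type) when is_atom(type), do: type
  defp type!(other), do: raise(ArgumentError, "bad schema: #{inspect(other)}")
end

SchemaSketch.field_types(%{name: {:string, required: true}, count: :integer})
# => %{count: :integer, name: :string}
```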

tomekowal · November 26, 2021, 8:31am

I wanted to reply on Twitter, but I saw there is already a solution here. I’d like to share mine anyway.
Since we need to traverse the schema many times, I normalize it first and then use list comprehensions like this:

  def parse(params, schema) do
    # First I want to have entire schema in one format [{key, {type, opts}}]
    normalized_schema = for {key, type_spec} <- schema, do: {key, apply_default_opts(type_spec)}
    keys = for {key, _} <- normalized_schema, do: key
    types = for {key, {type, _opts}} <- normalized_schema, into: %{}, do: {key, type}
    required_fields = for {key, {_type, opts}} <- normalized_schema, Keyword.get(opts, :required), do: key
    defaults = for {key, {_type, opts}} <- normalized_schema, default = Keyword.get(opts, :default), into: %{}, do: {key, default}

    {defaults, types}
    |> cast(params, keys)
    |> validate_required(required_fields)
    |> apply_action(:normalize)
  end

  @default_opts [required: false]
  defp apply_default_opts(type) when is_atom(type), do: {type, @default_opts}
  defp apply_default_opts({type, opts}), do: {type, Keyword.merge(@default_opts, opts)}

The line computing defaults might be tricky to understand because it is long and introduces a variable inside the comprehension filter.
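One caveat with that filter: `default = Keyword.get(opts, :default)` is falsy when the default is `false` or `nil`, so an entry like `default: false` is silently dropped. A possible alternative, sketched here with a hypothetical DefaultsSketch module, is to pattern match on Keyword.fetch/2 in a generator:

```elixir
defmodule DefaultsSketch do
  # `schema` is the normalized form: [{key, {type, opts}}].
  def defaults(schema) do
    for {key, {_type, opts}} <- schema,
        # Generator over a one-element list: a non-matching :error is skipped.
        {:ok, default} <- [Keyword.fetch(opts, :default)],
        into: %{},
        do: {key, default}
  end
end

schema = [
  name: {:string, required: true},
  active: {:boolean, default: false},
  count: {:integer, default: 10}
]

DefaultsSketch.defaults(schema)
# => %{active: false, count: 10}
```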
Also, I’d vote for naming that function “parse” instead of normalize. Parsing is the act of taking a bunch of data and trying to transform it into a structure that is understandable, in the spirit of Parse, don’t validate.

It would be possible to push the schema specification even more to introduce validations like:
email: {:string, format: ~r/@/}
and then call Changeset.validate_format(changeset, key, format) for each. But that would require another option for each Ecto.Changeset validator. It might be overkill, but the schema format is very readable, so it is tempting.
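A rough sketch of how the `:format` option could be wired in, handling only that one option. The module name FormatSketch is hypothetical, and the schema is assumed to already be in the normalized `{type, opts}` form.

```elixir
# Standalone script; Mix.install/1 needs Elixir 1.12+ and network access.
Mix.install([{:ecto, "~> 3.0"}])

defmodule FormatSketch do
  import Ecto.Changeset

  # `schema` is a keyword list of field => {type, opts}; only :format is handled.
  def parse(params, schema) do
    types = for {key, {type, _opts}} <- schema, into: %{}, do: {key, type}

    changeset = cast({%{}, types}, params, Map.keys(types))

    schema
    |> Enum.reduce(changeset, fn {key, {_type, opts}}, acc ->
      case Keyword.fetch(opts, :format) do
        {:ok, regex} -> validate_format(acc, key, regex)
        :error -> acc
      end
    end)
    |> apply_action(:parse)
  end
end

schema = [email: {:string, format: ~r/@/}, name: {:string, []}]

{:ok, %{email: "a@b.c"}} = FormatSketch.parse(%{"email" => "a@b.c"}, schema)
{:error, changeset} = FormatSketch.parse(%{"email" => "nope"}, schema)
true = Keyword.has_key?(changeset.errors, :email)
```

Each further validator would need its own branch, which is the cost mentioned above.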

fceruti · November 27, 2021, 12:15pm

Noted. Definitely it’s a little cryptic.

Oh yeah, the code looks much better.

I’m not convinced of this point. I’ve taken most of your ideas, but I kinda like normalize. To me parse means the data is going to change type.

Anyway, here is the revised version with my take on @stefanchrobot’s and @tomekowal’s suggestions:

defmodule Timetask.Helpers.Normalizer do
  import Ecto.Changeset

  @doc """
  Normalizes and validates that `params` is formed according to `schema`.

  ## Examples

      iex> Timetask.Helpers.Normalizer.normalize(%{
      ...>     name: {:string, required: true},
      ...>     description: :string,
      ...>     count: {:integer, default: 10}
      ...>  }, %{name: "only required field"})
      {:ok, %{count: 10, name: "only required field"}}

      iex> Timetask.Helpers.Normalizer.normalize(%{
      ...>     name: {:string, required: true},
      ...>     description: :string,
      ...>     count: {:integer, default: 10}
      ...>  }, %{"name" => "Also accepts strings as key"})
      {:ok, %{count: 10, name: "Also accepts strings as key"}}

      iex> normalized_params = Timetask.Helpers.Normalizer.normalize(%{
      ...>     name: {:string, required: true},
      ...>     description: :string,
      ...>     count: {:integer, default: 10}
      ...>  }, %{description: "has no name"})
      ...> {:error, %{errors: errors}} = normalized_params
      ...> assert Keyword.has_key?(errors, :name)

      iex> Timetask.Helpers.Normalizer.normalize(%{
      ...>     name: "I'm a string"
      ...>  }, %{name: "Use atom or tuple"})
      ** (ArgumentError) Bad formed schema

  """
  @spec normalize(map, map) :: {:error, Ecto.Changeset.t()} | {:ok, map}
  def normalize(%{} = schema, %{} = params) do
    normalized_schema =
      for {field_name, type_spec} <- schema,
          do: {field_name, apply_default_opts(type_spec)}

    defaults =
      for {field_name, {_type, opts}} <- normalized_schema,
          default = Keyword.get(opts, :default),
          into: %{},
          do: {field_name, default}

    types =
      for {field_name, {type, _opts}} <- normalized_schema, into: %{}, do: {field_name, type}

    normalized_params =
      for {param_name, param_value} <- params, into: %{}, do: {to_atom(param_name), param_value}

    fields = for {field_name, _} <- normalized_schema, do: field_name

    required_fields =
      for {field_name, {_type, opts}} <- normalized_schema,
          Keyword.get(opts, :required),
          do: field_name

    {defaults, types}
    |> cast(normalized_params, fields)
    |> validate_required(required_fields)
    |> apply_action(:normalize)
  end

  defp to_atom(name) when is_atom(name), do: name
  defp to_atom(name) when is_binary(name), do: String.to_atom(name)
  defp to_atom(_), do: raise(ArgumentError, "Bad formed schema")

  @default_opts [required: false]
  defp apply_default_opts(type) when is_atom(type), do: {type, @default_opts}
  defp apply_default_opts({type, opts}), do: {type, Keyword.merge(@default_opts, opts)}
  defp apply_default_opts(_), do: raise(ArgumentError, "Bad formed schema")
end

vrcca · November 28, 2021, 1:36pm

I’d replace String.to_atom with String.to_existing_atom, since this data comes from an untrusted source.
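Note that String.to_existing_atom/1 raises ArgumentError when the atom does not exist yet, so unknown keys would crash the call. Another option, sketched below with a hypothetical KeySketch module, is to look string keys up against the schema’s field names, which never creates atoms and simply drops unknown keys:

```elixir
defmodule KeySketch do
  # Converts string keys to atoms only when they name a known field.
  # Unknown string keys are dropped; atom keys pass through unchanged.
  def atomize_known(params, fields) do
    known = Map.new(fields, fn field -> {Atom.to_string(field), field} end)

    for {key, value} <- params,
        atom = to_known_atom(key, known),
        into: %{},
        do: {atom, value}
  end

  defp to_known_atom(key, _known) when is_atom(key), do: key
  defp to_known_atom(key, known) when is_binary(key), do: Map.get(known, key)
  defp to_known_atom(_key, _known), do: nil
end

KeySketch.atomize_known(%{"name" => "Ann", "unexpected" => 1}, [:name, :count])
# => %{name: "Ann"}
```

Since Ecto’s cast accepts maps whose keys are all strings, skipping the conversion entirely may also be enough.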

fceruti · November 28, 2021, 5:33pm

Good catch. I didn’t know about that function.

LostKobrakai · November 28, 2021, 5:35pm

I’d be interested in how you handle the return value of the core, especially for errors. It might be easy if the db schema and the normalized schema are similar, but it becomes tricky when that’s not the case.

riebeekn · November 28, 2021, 6:11pm

I’ve not used it myself, but the tarams library looks like it might handle your use case if you want to go with a library instead of a custom solution: GitHub - bluzky/tarams: Casting and validating external data and request parameters in Elixir and Phoenix

fceruti · November 28, 2021, 9:25pm

Thanks! This seems to be exactly what I was looking for. I’ll test it and possibly mark it as the solution. The quest of building and understanding this was an interesting one, though.

ityonemo · November 28, 2021, 11:34pm

If you want to use something more standards-compliant, I have this library, which does compile-time generation of strict validation functions from JSON Schemas:

https://hexdocs.pm/exonerate/Exonerate.html

It doesn’t normalize your parameters, though, so if you need a translation layer between things with different “names” (e.g. camelCase → snake_case) you might have to cook up something on your own. We do this at work with a “Codec” module which I may do a webcast on sometime.

With Ecto, though, you can just peddle in strings and it figures out the pesky strings/atoms stuff for you.

sasajuric · November 29, 2021, 12:59pm

IMO, the thing we’re validating is the input itself. In the simplest case (which is in my experience also the most frequent one), we can do most of the validations before even hitting the database, so that’s simple. Occasionally I may need to hit the database to validate some constraint (most often uniqueness). In most cases I’ve had this is also simple, since the field being validated usually directly corresponds to the input field. Combining these two, in most cases all I needed was a single validation+store changeset where all possible field errors corresponded to the input fields.

I can’t recall a single situation where this didn’t fit the bill, but vaguely speaking the options I’d consider in such cases would be:

Separate input validation changeset from db store, performing most of validations on the input.
On db store error, take the errors, change the keys if needed (e.g. replace db field :foo with input field :bar).
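The second option can be sketched in plain Elixir, assuming the errors have the keyword-list shape found in a changeset’s `errors` field. The module name ErrorKeySketch and the `:foo` to `:bar` mapping are illustrative only.

```elixir
defmodule ErrorKeySketch do
  # `errors` has the shape of Ecto.Changeset errors: [{field, {message, opts}}].
  # `mapping` maps db field names to input field names; unmapped fields are kept.
  def rename(errors, mapping) do
    Enum.map(errors, fn {field, error} -> {Map.get(mapping, field, field), error} end)
  end
end

db_errors = [
  foo: {"has already been taken", []},
  name: {"can't be blank", [validation: :required]}
]

ErrorKeySketch.rename(db_errors, %{foo: :bar})
# => [bar: {"has already been taken", []}, name: {"can't be blank", [validation: :required]}]
```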

It would help if you had some specific situation in mind, then we could discuss it.

IvanR · November 29, 2021, 3:01pm

If you are up for a declarative approach and dependency inversion towards your application core’s data types, you can use the Domo library. It generates validators and constructor functions from the t() type spec of the struct, and the type spec is checked for syntax correctness by Elixir during compilation.

So the request can be accepted into the struct that looks like the following:

defmodule Request do
  use Domo, ensure_struct_defaults: false

  defstruct [:email, :password, :date_of_birth]

  @type t :: %__MODULE__{
    email: email(),
    password: password(),
    date_of_birth: Date.t() | nil
  }

  @type email :: String.t()
  precond email: &String.match?(&1, ~r|.+\@.+\..+|)

  @type password :: String.t()
  precond password: &String.length(&1) > 7

  # Domo adds new/1, new!/1, ensure_type/1, ensure_type!/1 here automatically
end

And the validation can be done like this:

iex(1)> Request.new(%{email: "user@test.com", password: "some_password"})
{:ok, %Request{date_of_birth: nil, email: "user@test.com", password: "some_password"}}

iex(2)> Request.new(%{email: "usertestcom", date_of_birth: "none"})      
{:error,
 [
   password: "Invalid value nil for field :password of %Request{}. Expected the value matching the <<_::_*8>> type.",
   email: "Invalid value \"usertestcom\" for field :email of %Request{}. Expected the value matching the <<_::_*8>> type. And a true value from the precondition function \"&String.match?(&1, ~r|.+\\\\@.+\\\\..+|)\" defined for Request.email() type.",
   date_of_birth: "Invalid value \"none\" for field :date_of_birth of %Request{}. Expected the value matching the %Date{} | nil type."
 ]}

The dependency inversion can be done by defining email() and password() in a shared module, so that the same rules for emails and passwords apply across the whole app.

Domo plays nicely with Ecto schemas because they are structs too. See Domo.Changeset for this kind of integration.