Mold - a tiny, zero-dependency parsing library for external payloads

Mold parses JSON APIs, webhooks, HTTP params and other external input into clean Elixir terms - coerces types, renames keys, checks structure, and returns {:ok, result} or {:error, errors} with traces.

Cheatsheet | Documentation | Hex

Mold.parse(:integer, "42")              #=> {:ok, 42}
Mold.parse(:date, "2024-01-02")         #=> {:ok, ~D[2024-01-02]}

Mold.parse(%{name: :string, age: :integer}, %{"name" => "Alice", "age" => "25"})
#=> {:ok, %{name: "Alice", age: 25}}

# Options refine the type
Mold.parse({:integer, min: 0, max: 100}, "50")              #=> {:ok, 50}
Mold.parse({:atom, in: [:draft, :published]}, "draft")       #=> {:ok, :draft}

# Any function works as a type
Mold.parse(&Version.parse/1, "1.0.0")
#=> {:ok, %Version{major: 1, minor: 0, patch: 0}}

# Errors include the path to the failing value
Mold.parse(%{items: [%{name: :string}]}, %{"items" => [%{"name" => "A"}, %{}]})
#=> {:error, [%Mold.Error{reason: {:missing_field, "name"}, trace: [:items, 1, :name], ...}]}

Types are plain data - atoms, tuples, maps, functions. No macros, no structs. A type is just a value you can build at runtime, store in a variable, or compose dynamically.

Mold follows the Parse, don’t validate approach. You parse at the boundary, and from that point on you work with clean Elixir terms. Mold handles structural correctness - business logic is a separate layer.

Also supports: source key mapping with propagation, union types, recursive types, and shared options (nilable, default, in, transform, validate) on all types.
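A quick sketch of stacking those shared options - this exact combination is illustrative, not taken from the docs:

```elixir
# Hypothetical composition of the shared options listed above
# (nilable, default, in, validate) - check the docs for exact semantics.
score  = {:integer, nilable: true, default: 0, validate: &(&1 <= 100)}
status = {:atom, in: [:draft, :published]}

Mold.parse(%{score: score, status: status}, %{"score" => "42", "status" => "draft"})
# expected to return {:ok, %{score: 42, status: :draft}}
```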

Happy to answer any questions :slightly_smiling_face:

17 Likes

I’ve been having a few of these moments lately… I was literally just thinking about needing something like this and came to Elixir Forum to procrastinate thinking about it.

1 Like

What’s the pitch for using this library over Ecto? Just more minimal for cases where you otherwise wouldn’t need Ecto? How about other dedicated parsing libs?

2 Likes

Instead of hardcoded, opinionated coercers, you might want to allow coercer: and validator: options (with proper defaults).

You might find some inspiration in estructura on that matter.

What’s the pitch for using this library over Ecto?

Good and hard question :slight_smile:

Just more minimal for cases where you otherwise wouldn’t need Ecto?

Being minimal wasn’t a goal; it’s a side effect. The main issue is that Ecto simply can’t cover everything I need.

From my experience Ecto struggles when it meets the reality of marginal APIs. Ecto works well when you control the input data format - when you write your own API, you define the contract and can build around Ecto’s features and limitations.

But consider parsing responses from a third-party server. With Ecto you have two options: schemas or schemaless changesets. With schemas, every nested structure requires a new module with a struct definition plus changeset functions. That’s a lot of ceremony. And when it comes to actually saving that data, you end up converting those structs back to maps anyway, because Ecto’s cast won’t accept a foreign struct. So what was the point? Structs aren’t that convenient in this context, especially when fields are selected dynamically and you need to distinguish between fields that weren’t requested and nil values. The schemaless approach is limited too - no nested maps or lists of maps. José has a prototype, josevalim/schemecto on GitHub (schemaless Ecto changesets with support for nesting and JSON Schemas), that covers nested structures, but I need more.

External APIs are not as stable as we’d like them to be, and even (or should I say especially? :laughing:) from huge corporations like Microsoft, I regularly get broken data (even invalid JSON!! but that’s another topic). For example, their recipient type looks like this:

{"emailAddress": {"address": "user@example.com", "name": "Alice"}}

camelCase keys, unnecessary nesting - I don’t need that emailAddress wrapper. The address field might contain something that’s not an email at all, sometimes it’s empty! In Mold, I define a custom email type, flatten the nesting via source paths:

email = {:string, format: ~r/^[^\s@]+@[^\s@]+$/}

recipient = {:map, fields: [
  email: [type: email, source: ["emailAddress", "address"]],
  name: [type: :string, source: ["emailAddress", "name"]]
]}

Mold.parse(recipient, %{"emailAddress" => %{"address" => "user@example.com", "name" => "Alice"}})
# {:ok, %{email: "user@example.com", name: "Alice"}}

{:map, fields: [...]} here is the expanded form because field options like :source require it. For simple cases there’s a shortcut: %{name: :string, age: :integer} which is much more compact.

As you might have guessed, email above is a custom type :slight_smile: In Ecto, you’d create a module implementing the Ecto.Type behaviour with 4 required callbacks. When I think about parsing external data, I only care about cast. Why do I need dump/load that will never be used?

If I want to drop invalid records and to continue working with what is left, I can use reject_invalid:

recipients = {[recipient], reject_invalid: true}
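As a sketch of how that plays out (the exact output shape here is my assumption, based on the recipient type defined above):

```elixir
# Assumed behaviour of reject_invalid: invalid entries are dropped
# instead of failing the whole parse.
Mold.parse(recipients, [
  %{"emailAddress" => %{"address" => "alice@example.com", "name" => "Alice"}},
  %{"emailAddress" => %{"address" => "not-an-email", "name" => "Bob"}}
])
# expected: {:ok, [%{email: "alice@example.com", name: "Alice"}]}
```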

Custom validations in Ecto are verbose. Here is an example from the docs, :title appears 3 times:

changeset = validate_change changeset, :title, fn :title, title ->
  if title == "foo" do
    [title: {"cannot be foo", additional: "info"}]
  else
    []
  end
end

In Mold, options live with the type. :integer is one type, {:integer, min: 0} is a more precise type, just like saying positive_integer. You don’t bolt validation rules onto a type after the fact, you refine the type itself. The validation defined above would be just {:string, validate: &(&1 != "foo")}.
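To make that concrete, a quick sketch (the exact error contents are assumed, not quoted from the docs):

```elixir
# The refined type carries its own validation rule.
no_foo = {:string, validate: &(&1 != "foo")}

Mold.parse(no_foo, "bar")  # {:ok, "bar"}
Mold.parse(no_foo, "foo")  # {:error, [%Mold.Error{...}]} with a generic reason (assumed)
```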

Ecto also forces you to write an error reason every time, even when a generic “invalid value” would be enough, especially when the end user won’t see it anyway. Mold’s validate keeps simple things simple. When you do need a rich error reason, you’d have to write a function:

title = fn title ->
  with {:ok, title} <- Mold.parse(:string, title) do
    if title == "foo" do
      {:error, :banned_title}
    else
      {:ok, title}
    end
  end
end
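Used as a type, the function above should surface the custom reason in the error - the shape here is assumed from the Mold.Error examples elsewhere in this thread:

```elixir
Mold.parse(title, "hello")  # {:ok, "hello"}
Mold.parse(title, "foo")    # {:error, [%Mold.Error{reason: :banned_title, ...}]} (assumed)
```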

I’m thinking about extending validate in the future to accept :ok | {:error, reason} for when you need a custom reason, but for now a function-as-type covers everything.

Mold doesn’t build error messages for the end user, so you’d write your own formatter. Mold.Error is well-documented and that shouldn’t be hard.

How about other dedicated parsing libs?

Tarams — Tarams v1.8.0 - might look similar at first glance, but the API is different. It’s very map-centric - everything starts with a map, so if a response is just a list of items, you’d need to wrap it. Mold’s API is more compact (I’m not sure you can make it shorter), and the syntax is close to Elixir typespecs:

# tarams
%{users: [type: {:array, %{name: [type: :string]}}]}

# mold
%{users: [%{name: :string}]}

# typespecs
%{users: [%{name: String.t()}]}

Also, tarams doesn’t support an :atom type, so if you want to validate an enum, you’re limited to strings. In Mold {:atom, in: [:draft, :published]} returns an atom.

Zoi — Zoi v0.17.4 - a relatively new library; Mold’s core was built before Zoi appeared. Zoi has a larger API surface and is built with Zod (TypeScript) in mind - it brings that mental model to Elixir, inheriting a lot from TypeScript’s type system: optional vs nullable vs nullish, union vs discriminated_union vs intersection. Mold was built with Elixir in mind.

Zoi expects atom keys by default - coerce: true on a map enables string keys. But string keys are what you get from JSON, HTTP params, and even Phoenix controllers. So in practice you need coerce: true almost everywhere. Mold defaults to string keys because that’s what external data looks like.

Compare:

# zoi
Zoi.map(%{
  name: Zoi.string(),
  age: Zoi.integer(coerce: true),
  address: Zoi.map(%{city: Zoi.string(), zip: Zoi.string()}, coerce: true),
  tags: Zoi.array(Zoi.string())
}, coerce: true)

# mold
%{
  name: :string,
  age: :integer,
  address: %{city: :string, zip: :string},
  tags: [:string]
}

Zoi seems to aim at being a central place for all schema definitions in a project (see the author’s post in the “Zoi - schema validation library inspired by Zod” thread). Mold is specifically for the periphery of your application - parsing external data at the boundary. That’s why Mold defaults to string keys and doesn’t accept both string and atom keys at the same time - no temptation to use it in domain logic where it doesn’t belong.


I wanted to write more, but I’m tired :sleeping_face: I hope it helps

12 Likes

Yes, you’re right, that’s opinionated. But the workaround is simple - just define a function that parses however you’d like :slight_smile:

yes_no = fn
  "yes" -> {:ok, true}
  "no" -> {:ok, false}
  _ -> {:error, :invalid}
end

iex> Mold.parse(yes_no, "yes")
{:ok, true}
iex> Mold.parse(yes_no, "true")
{:error, [%Mold.Error{reason: :invalid, value: "true", trace: nil}]}

3 Likes

hey @fuelen awesome library!

Thanks for mentioning Zoi, your assessment is correct. The reason I did not add coercion by default is that I prefer being explicit about what gets converted, so there are no hidden conversions. You can still coerce a whole structure by traversing the schema.
The main reason I built Zoi was to have good building blocks so others could build on top of it or just use it for schema definition and parsing. It contains the raw building blocks but tries not to be opinionated.

Nevertheless, I can see value in a library like Mold which offers good defaults around this, so the user does not need to worry about how the data needs to be parsed; it will try to parse the data even if it doesn’t come in the exact shape.

2 Likes

That’s a misunderstanding of what Ecto.Changeset.cast/4 is about though. cast is about taking in external data, which has not been typecast to internal datatypes. It’s the whole-schema version of Ecto.Type.cast. When you’ve got structs of your data you’re past that step. You’re no longer dealing with unknown input. You can just use Ecto.Changeset.change/2 or Ecto.Changeset.put_change/3 instead to skip any requirement of having a map or any typecasting - which is no longer necessary anyway.

Yes, I understand what cast is :slight_smile:
I mentioned this approach because this is the pattern I’ve seen in real codebases, so I wanted to highlight that this is not an optimal way to solve the problem.

You can just use Ecto.Changeset.change/2

it will fail

defmodule Input do
  defstruct [:first_name]
end

{%{}, %{first_name: :string}}
|> Ecto.Changeset.change(%Input{first_name: "Artur"})
# ** (Protocol.UndefinedError) protocol Enumerable not implemented for Input (a struct)

# but this will work
{%{}, %{first_name: :string}}
|> Ecto.Changeset.change(%Input{first_name: "Artur"} |> Map.from_struct())

Hey @fuelen, I’ve just started using this a few days ago. It’s been a joy to use, thank you for your work on this lib!

1 Like