Flint - Drop-In Utilities / Enhancement for Ecto Schemas

Hey all,

I originally made this for my project Merquery and decided to extract it to its own library. The README is below if you’re interested. It’s still very early in development, but I felt that it was far enough along to be worth sharing.

There are several reasons I made this instead of using one of the many existing libraries out there, but the main ones were:

  • I wanted to take advantage of Ecto.Type, which lets you define how types are both cast and dumped, meaning you can have separate representations of the data between client and server, with the type serving as the adapter.
  • I wanted the API to look like Ecto so that it coexists cleanly with Ecto code, can be converted to/from Flint ↔ Ecto fairly easily, and can take advantage of all of Ecto’s APIs.

All schemas defined using Flint are still Ecto schemas under the hood, and I’ve attempted to make the API as idiomatic to Ecto as I could (e.g. even though validations are colocated inline with the schema, they’re still enforced on the call to changeset; new reflection functions are added using the __schema__(:atom) API; etc.)


README

Flint

Practical Ecto embedded_schemas for data validation, coercion, and manipulation.

Features

  • ! variants of the Ecto field, embeds_one, and embeds_many macros to mark a field as required (see Required Fields)
  • Colocated validations, so you can define common validations alongside field declarations (see Validations)
  • Adds Access implementation to all schemas
  • Adds Jason.Encoder implementation to all schemas
  • New Ecto.Schema Reflection Functions
    • __schema__(:required) - Returns list of fields marked as required (from ! macros)
    • __schema__(:validations) - Keyword mapping of fields to validations
  • Convenient generated functions (changeset, new, new!, …) (see Generated Functions)
  • Configurable application-wide defaults for the Ecto.Schema API (see Config)

Installation

def deps do
  [
    {:flint, github: "acalejos/flint"}
  ]
end

Motivation

Flint is built on top of Ecto and is meant to provide good defaults for using embedded_schemas outside of a database.
It also adds a bevy of convenient features to the existing Ecto API to make writing schemas and validations much quicker.

Of course, since you’re using Ecto, you can still use this as an ORM, but Flint emphasizes the use of embedded_schemas as just more expressive and powerful maps while keeping compatibility with Ecto.Changeset, Ecto.Type, and all of the other benefits Ecto has to offer.

In particular, Flint focuses on making it more ergonomic to use embedded_schemas as a superset of maps, so a Flint.Schema by default implements the Access behaviour and the Jason.Encoder protocol.
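
For example, here is a quick sketch (the Character module is only an illustration, not from the README; new/1 is one of the generated functions described later) of what those two implementations give you:

defmodule Character do
  use Flint, schema: [
    field!(:name, :string),
    field(:class, :string)
  ]
end

character = Character.new(%{name: "Frodo", class: "hobbit"})

# Access lets you read fields like a map
character[:name]
# "Frodo"

# Jason.Encoder lets you serialize the struct directly
Jason.encode!(character)
# "{\"name\":\"Frodo\",\"class\":\"hobbit\"}" (field order may differ)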

Flint was also made to leverage the distinction Ecto makes between the embedded representation of a schema and its dumped representation. This means you can dictate how the Elixir-side representation should look, and then provide transformations
for how it should be dumped, which helps when you want the serialized representation to look different.

This is useful if you want to make changes in the server-side code without needing to change the client-side (or vice-versa). Or perhaps you want a mapped representation, where instead of an Ecto.Enum just converting its atom key to a string when dumped, it gets mapped to an integer, etc.

Usage

If you want to declare a schema with Flint, just use Flint within your module, and you will have access to Flint’s implementation of the
embedded_schema/1 macro. You can declare an embedded_schema within your module as you otherwise would with Ecto. Within the embedded_schema/1 block, you also have access to Flint’s implementations of embeds_one, embeds_one!, embeds_many, embeds_many!, field, and field!.

You can also use the shorthand notation, where you pass your schema definition as an argument to the use/2 macro. Flint.__using__/1 also
accepts the following options, which are passed as module attributes to the Ecto embedded_schema. Refer to the Ecto.Schema docs for more about these options.

  • primary_key (default false)
  • schema_prefix (default nil)
  • schema_context (default nil)
  • timestamp_opts (default [type: :naive_datetime])

So this:

defmodule User do
  use Flint

  embedded_schema do
    field! :username, :string
    field! :password, :string, redacted: true
    field :nickname, :string
  end
end

is equivalent to:

defmodule User do
  use Flint, schema: [
    field!(:username, :string),
    field!(:password, :string, redacted: true),
    field(:nickname, :string)
  ]
end

If you’re starting with Flint and you know you will stick with it, the shorthand might make more sense. But if you want to be able to quickly
switch between use Ecto.Schema and use Flint, or you’re converting some existing Ecto embedded_schemas to Flint, the block form (the first example) might be
preferable.

Since a call to Flint’s embedded_schema or use Flint, schema: [] just creates an Ecto embedded_schema, you can use the resulting schemas just as you would any other Ecto schemas. You can compose them, apply changesets to them, etc.
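
For instance, here is a sketch (not from the README) of embedding the User schema from above inside a plain Ecto embedded_schema; it assumes cast_embed/2 picks up the generated User.changeset/2 as usual:

defmodule Account do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    # User is the Flint-defined schema from the previous example
    embeds_one :user, User
    field :active, :boolean, default: true
  end

  def changeset(account, params) do
    account
    |> cast(params, [:active])
    |> cast_embed(:user)
  end
end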

Required Fields

Flint adds the convenience bang (!) macros (embeds_one!, embeds_many!, field!) for field declarations within your schema to declare a field as required within its changeset function.

Flint schemas also have a new reflection function in addition to the normal Ecto reflection functions.

  • __schema__(:required) – Returns a list of all fields that were marked as required.
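
For example, a quick sketch with the User schema from the Usage section above:

User.__schema__(:required)
# [:username, :password] (a sketch; exact order may vary)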

Field Validations

Basic Validations

Flint allows you to colocate schema definitions and validations.

defmodule Person do
  use Flint

  embedded_schema do
    field! :first_name, :string, max: 10, min: 5
    field! :last_name, :string, min: 5, max: 10
    field :favorite_colors, {:array, :string}, subset_of: ["red", "blue", "green"]
    field! :age, :integer, greater_than: 0, less_than: 100
  end
end
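
For example (a sketch; output abridged), these options are applied when the generated changeset is called:

Person.changeset(%Person{}, %{first_name: "Bobby", last_name: "Smith", age: 30})
# #Ecto.Changeset<changes: %{...}, errors: [], valid?: true, ...>

Person.changeset(%Person{}, %{first_name: "Bob", last_name: "Smith", age: 30})
# first_name fails the min: 5 length validation, so the changeset has errors and valid?: false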

Parameterized Validations

You can even parameterize the options passed to the validations:

defmodule Person do
  use Flint

  embedded_schema do
    field! :first_name, :string, max: 10, min: 5
    field! :last_name, :string, min: 5, max: 10
    field :favorite_colors, {:array, :string}, subset_of: ["red", "blue", "green"]
    field! :age, :integer, greater_than: 0, less_than: max_age
  end
end

If you do this, make sure to pass the options as a keyword list into the call to changeset:

Person.changeset(
  %Person{},
  %{first_name: "Bob", last_name: "Smith", favorite_colors: ["red", "blue", "pink"], age: 101},
  [max_age: 100]
)
#Ecto.Changeset<
  action: nil,
  changes: %{
    age: 101,
    first_name: "Bob",
    last_name: "Smith",
    favorite_colors: ["red", "blue", "pink"]
  },
  errors: [
    first_name: {"should be at least %{count} character(s)",
     [count: 5, validation: :length, kind: :min, type: :string]},
    favorite_colors: {"has an invalid entry", [validation: :subset, enum: ["red", "blue", "green"]]},
    age: {"must be less than %{number}", [validation: :number, kind: :less_than, number: 100]}
  ],
  data: #Person<>,
  valid?: false,
  ...
>

This lets you change the parameters of the validations for each call to changeset, giving you more flexibility.

Options

Currently, the options / validations supported out of the box with Flint are all based on the validation functions
defined in Ecto.Changeset (such as validate_length/3, validate_number/3, and validate_subset/3, as seen in the examples above).

Aliases

If you don’t like the name of an option, you can provide a compile-time list of aliases to map new option names to existing options.

In your config, add an :aliases key with a Keyword value, where each key is the new alias, and the value is an existing option name.

For example, these are default aliases implemented in Flint:

config Flint, aliases: [
    lt: :less_than,
    gt: :greater_than,
    le: :less_than_or_equal_to,
    ge: :greater_than_or_equal_to,
    eq: :equal_to,
    neq: :not_equal_to
  ]

NOTE If you add your own aliases and want to keep the defaults above, you will have to add them manually to your aliases.
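
For example (a sketch), with the default aliases above the age validation from the earlier Person schema could be written as:

defmodule Person do
  use Flint

  embedded_schema do
    # :gt and :lt resolve to :greater_than and :less_than via the default aliases
    field! :age, :integer, gt: 0, lt: 100
  end
end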

__schema__(:validations)

Since validations are enforced through the generated changeset function, if you override this function you will not get the benefit
of the validations.

If you want to implement your own, you can use __schema__(:validations) which is an added reflection function that stores validations.

NOTE These are stored as their quoted representation to support passing bindings, so make sure to account for this if implementing yourself.

If you want to override changeset but keep the default validation behavior, there is also the Flint.Schema.validate_fields function,
which accepts an %Ecto.Changeset{} and, optionally, bindings, and performs validations using the information stored in __schema__(:validations).
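
A rough sketch of what that could look like (the argument order of Flint.Schema.validate_fields and the changeset/3 shape are assumptions based on the description above):

defmodule Person do
  use Flint

  embedded_schema do
    field! :age, :integer, greater_than: 0, less_than: max_age
  end

  # Override the generated changeset but keep the colocated validations
  # by delegating to Flint.Schema.validate_fields.
  def changeset(schema, params, bindings \\ []) do
    schema
    |> Ecto.Changeset.cast(params, __schema__(:fields))
    |> Ecto.Changeset.validate_required(__schema__(:required))
    |> Flint.Schema.validate_fields(bindings)
  end
end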

Generated Functions

Flint provides default implementations for the following functions for any schema declaration. Each of these is overridable.

  • changeset - Creates a changeset by casting all fields and validating all that were marked as required. If a :default key is provided for a field, then any use of a bang (!) declaration will essentially be ignored, since the cast will fall back to the default before any validations are performed.
  • new - Creates a new changeset from the empty module struct and applies the changes (regardless of whether the changeset was valid).
  • new! - Same as new, except raises if the changeset is not valid.
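
A quick sketch of these with the User schema from earlier (output abridged):

User.changeset(%User{}, %{username: "gandalf"})
# #Ecto.Changeset<errors: [password: {"can't be blank", ...}], valid?: false, ...>

User.new(%{username: "gandalf"})
# %User{username: "gandalf", password: nil, nickname: nil}

User.new!(%{username: "gandalf"})
# raises, since the changeset is invalid (:password was declared with field!)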

Config

You can configure the default options set by Flint.

  • embeds_one: The default arguments when using embeds_one. Defaults to [defaults_to_struct: true, on_replace: :delete]
  • embeds_one!: The default arguments when using embeds_one!. Defaults to [on_replace: :delete]
  • embeds_many: The default arguments when using embeds_many. Defaults to [on_replace: :delete]
  • embeds_many!: The default arguments when using embeds_many!. Defaults to [on_replace: :delete]
  • enum: The default arguments for an Ecto.Enum field. Defaults to [embed_as: :dumped]
  • aliases: See Aliases

You can also configure any aliases you want to use for schema validations.
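
For example, a sketch of a config.exs that sets these (the values shown are just the documented defaults plus one hypothetical alias):

import Config

config Flint,
  embeds_one: [defaults_to_struct: true, on_replace: :delete],
  embeds_one!: [on_replace: :delete],
  embeds_many: [on_replace: :delete],
  embeds_many!: [on_replace: :delete],
  enum: [embed_as: :dumped],
  aliases: [within: :subset_of]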

Embedded vs Dumped Representations

Flint takes advantage of the distinction Ecto makes between an embedded_schema’s embedded and dumped representations.

For example, by default in Flint, Ecto.Enums defined with a keyword list (rather than just a list of atoms) will have their keys
be the embedded representation and their values be the dumped representation.

defmodule Book do
  use Flint, schema: [
    field(:genre, Ecto.Enum, values: [biography: 0, science_fiction: 1, fantasy: 2, mystery: 3])
  ]
end

book = Book.new(%{genre: "biography"})
# %Book{genre: :biography}

Flint.Schema.dump(book)
# %{genre: 0}

In this example, you can see how you can share multiple representations of the same data using this distinction.

You can also implement your own Ecto.Type and further customize this:

defmodule ContentType do
  use Ecto.Type
  def type, do: :atom

  def cast("application/json"), do: {:ok, :json}

  def cast(_), do: :error
  def load(_), do: :error

  def dump(:json), do: {:ok, "application/json"}
  def dump(_), do: :error

  def embed_as(_) do
    :dump
  end
end

Here, cast will be called when creating a new Flint schema from a map, and dump will be used
when calling Flint.Schema.dump/1.

defmodule URL do
  use Flint, schema: [
    field(:content_type, ContentType)
  ]
end

url = URL.new(%{content_type: "application/json"})
# %URL{content_type: :json}

Flint.Schema.dump(url)
# %{content_type: "application/json"}

Examples

You can view the Notebooks folder for some examples in Livebook.

You can also look at Merquery for a real, comprehensive
example of how to use Flint.

6 Likes

I wonder if this is supported as well:

field :favorite_colors, {:array, :string}, subset_of: ~w[red blue green]

I think that :ne is more common than :neq.

There is also another interesting way, i.e. changing a keyword list into an expression, for example:

field! :age, :integer, 0 < age < max_age
field! :age, :integer, age in 0..max_age

# order of fields could be important in such case
field! :type, :string, type in ~w[elf human]
field! :age, :integer, age in ((type == "elf" && 0..max_elf_age) || 0..max_human_age)

Would love to see some of this end up upstream in Ecto; I’ve always felt it was missing.

2 Likes

Thanks for checking it out!

Yes, ~w is supported for the validations.

You’re probably right about :neq; I couldn’t remember which was more common at the time. You can at least pass aliases in your config to rebind the options to whatever you want, so you could do

config Flint, aliases: [ne: :not_equal_to]

Regarding your expressions at the end, the next task I want to do is implement validations with respect to other fields like you wrote.

1 Like

Thanks for the support! I’m not sure if something like this would ever make its way upstream into Ecto, since it might go against a fundamental philosophy or something like that.

But that’s why I wanted to make it as drop-in as I could, so people can easily toggle it.

I would say there is precedent (Ecto.Schema — Ecto v3.11.2), but the only way to know for sure is to propose it. I could imagine there would be some benefits to having a bit more information about a field’s constraints at compile time.

1 Like

I’ve added some new features, including some of the requests from @Eiji.

Here’s a brief summary:

Union

Union type for Ecto. Allows the field to be any of the specified types.

Validate With Respect to Other Fields

You might find yourself wanting to validate a field conditionally based on the values of other fields. In Flint, you
can do this with any validation! Since all validations already accept parameterized conditions, they also let you refer
to previously defined fields declared with the field or field! macros. Just use a variable with the same name as the field(s) you want to refer to, and it will be bound to that field’s value.

Additionally, :when lets you define an arbitrary boolean expression that will be evaluated; the validation passes if it
evaluates to a truthy value. You may pass bindings to this condition just as explained above and
refer to previously defined fields as just discussed, but uniquely, :when also lets you refer to the current field in which
the :when condition is defined. Theoretically, you could write many of the other validations using :when, but you will
receive worse error messages with :when than with the dedicated validations.

defmodule Test do
  use Flint

  embedded_schema do
    field! :category, Union, oneof: [Ecto.Enum, :decimal, :integer], values: [a: 1, b: 2, c: 3]
    field! :rating, :integer, when: category == target_category
    field! :score, :integer, gt: 1, lt: 100, when: score > rating
  end
end
> Test.new!(%{category: :a, rating: 80, score: 10}, target_category: :a)

** (ArgumentError) %Test{category: :a, rating: 80, score: ["Failed `:when` validation"]}
    (flint 0.0.1) lib/schema.ex:406: Flint.Schema.new!/3
    (elixir 1.15.7) src/elixir.erl:396: :elixir.eval_external_handler/3
    (stdlib 5.1.1) erl_eval.erl:750: :erl_eval.do_apply/7
    (elixir 1.15.7) src/elixir.erl:375: :elixir.eval_forms/4
    (elixir 1.15.7) lib/module/parallel_checker.ex:112: Module.ParallelChecker.verify/1
    lib/livebook/runtime/evaluator.ex:622: anonymous fn/3 in Livebook.Runtime.Evaluator.eval/4
    (elixir 1.15.7) lib/code.ex:574: Code.with_diagnostics/2

Derived Fields

Much like the previous section, derived values let you define
expressions with support for custom bindings, referring to any field declarations that occur before the current field.

Derived fields automatically put the result of the :derived expression into the field value. This occurs before
any other validation, so you still have access to field bindings, and even the current derived field value,
within a :when validation.

defmodule Test do
  use Flint

  embedded_schema do
    field! :category, Union, oneof: [Ecto.Enum, :decimal, :integer], values: [a: 1, b: 2, c: 3]
    field! :rating, :integer, when: category == target_category
    field :score, :integer, gt: 1, lt: 100, when: score > rating, derived: rating + category
  end
end
Test.new!(%{category: 1, rating: 80}, target_category: 1)

# %Test{category: 1, rating: 80, score: 81}

Bear in mind that the raw quoted expressions are stored in the __schema__(:validations) function, so you can use that info however you wish.

The default generated changeset and new functions will use Flint.Schema.validate_fields to perform these validations, so you can leverage that function as well if you override the changeset or new functions.

Let me know what you think!

1 Like

Oh, I see … In that case I have a better solution:

field! :type, :string do
  type not in ~w[elf human] -> "Expected elf or human, got: #{type}"
end

field! :age, :integer do
  age < 0 -> "Nobody can have a negative age"
  type == "elf" and age > max_elf_age -> "Attention! The elf has become a bug! Should be dead already!"
  type == "human" and age > max_human_age -> "Expected human to have up to #{max_human_age}, got: #{age}"
end

If I remember correctly (writing from memory), the AST should look like:

[do: [
  {:->, ast_meta, [validation_expression, error_message_or_data]},
  # …
]]

Pros of this solution:

  1. An AST like expr1 -> expr2 is simple to parse.
  2. You can return more than one error easily (simply filter the matching validations on the left side and return the error messages on the right side)
  3. Allows returning something like:
    {:error, field_name: ["first error message", "second error message", …], …}
  4. Custom error messages or even custom data (as there is no need to validate the collected data on the right side), for example:
    {:error, field_name: [{500, "Internal human error"}]}
2 Likes

I actually quite like the expressiveness of what you suggested. My main concern is deviating from Ecto by adding a new arity for the field macros, but I think that’s fine. I think I’ll keep the current validations and :when as shorthand options, but allow a :do expression with the syntax you suggested.

1 Like

@Eiji

Got something working. What’re your thoughts on something like a finally clause that lets you define transformations for fields that happen after all validations (if successful)?

1 Like

Yes and no. Yes, since value generation and transformation are often desired. No to making it part of validations, which could be confusing for developers and harder to implement.

For value generation you could write another macro which takes one more argument. Let’s call it virtual_field (following Ecto’s naming). Having a separate macro would cause less trouble (default values in the macro definition) and would be documented separately, with a note like:

Works like a field, but generates a value based on other fields

For a transformation we could simply use a map option. This could be supported in both field and virtual_field, as it could simplify the implementation of generic things like a prettify step.

field :name, :string, map: &do_something/1

def do_something(name) do
  # no need to validate and return ok/error tuples
  name
end

Here is a list of expressions to support:

  1. a simple expression, e.g. a + b
  2. function calls like func_name(first_field, second_field)
  3. 1-arity anonymous function like &(&1 + 1) or &do_something/1
  4. anonymous function like &do_something(second_field, &1)

In all cases of anonymous functions you have to ensure that &1 is the current field value (only for the field macro). The rest of the cases are expressions, with or without function calls, where you simply have to support field names just like in validations.

If I did not make a mistake, here is a list of macro definitions you should support:

  1. defmacro field(name, type, do: block)
  2. defmacro field(name, type, opts \\ []) (with :map support in opts)
  3. defmacro field(name, type, opts, do: block) (with :map support in opts)
  4. defmacro field!(name, type, do: block)
  5. defmacro field!(name, type, opts \\ []) (with :map support in opts)
  6. defmacro field!(name, type, opts, do: block) (with :map support in opts)
  7. defmacro virtual_field(name, type, expr, do: block)
  8. defmacro virtual_field(name, type, expr, opts \\ []) (with :map support in opts)
  9. defmacro virtual_field(name, type, expr, opts, do: block) (with :map support in opts)
2 Likes

Is the virtual_field as you describe it the same as the current derived option that’s supported? You’re just suggesting making it its own macro?

Oh, right - it is :sweat_smile:

Because of the above I’m not sure … First of all, derived could be confusing because of the @derive struct attribute (see the Module docs on hexdocs). It’s more about what somebody likes, but if you ask me I would put it in a separate macro, as that immediately gives a hint that it’s not the same as other fields. That said, I’m sure many people may disagree with that opinion and suggest adding a separate section about this feature to the macro documentation instead.

1 Like

I actually think it might be worth extracting to a separate macro, since the :default option and the required (!) variant of field are both essentially disregarded.

My initial inclination towards making it an option was due to not wanting to add a new macro to the API, which would deviate even more from standard Ecto, but I think the expressiveness it would offer (especially at a glance) is worth it.

I think, as you said, there might be a better name than ‘derived’, but I think ‘virtual_field’ might also be confusing, since the term ‘virtual’ in Ecto has implications for data persistence.

Maybe a macro called ‘computed’ (inspired by Vue)?

I’m open to suggestions though.

Fair point

That way you would end up copy-pasting everything, as Ecto’s code is very mature and battle-tested :joy:

The point is to take inspiration, understand good standards, and use them in your own code. If you have a reason for something, or if you think other naming would be better in your case, then change the code. I would rather avoid using the same patterns if the only reason is to avoid deviating from them. We prefer to use the same naming only when the naming itself is good and matches our case. If you have a better synonym then just use it.

Agreed, that was just the closest to Ecto’s naming. As I wrote above, if you have a better one, go for it.

I always had a problem with such naming in English. Yeah, it looks good if we look at general dev naming, but doesn’t it sound like it would only do some math operation? You could even remove it and let people simply write functions for that …

def full_name(%__MODULE__{} = user), do: user.first_name <> " " <> user.last_name

Another way is to use different naming, like key instead of field, so it would be for example gen_value or meta_key … Yeah, I’m definitely not good at it …

1 Like

I’ll probably stick with computed for now.

Thanks for the other thoughts as well!

What did you mean by this? If you use &1 to refer to the current field for a computed value, would that mean it’s essentially an inbound transform (pre-validation, using the input params), whereas :map is an outbound transform (using the cast, post-validation value)?

If they’re trying to refer to another existing field, they could always just use a simple expression (#1 in your list).

Actually, never mind, I was thinking about support for fn … end and &(…), but they are not really needed in a DSL. That said, it could be confusing not to support them the same as standard function calls, which are needed in my opinion.

field :data, :integer, map: map_data(data)

def map_data(data) do
  # …
end

That would be helpful if the mapper code is bigger than standard “1 or 2” arithmetic calls …

What I was thinking about was to support equivalent code like:

field :data, :integer, map: &map_data/1

def map_data(data) do
  # …
end

Of course, if you want, you can send all fields, like:

field :data, :integer, map: &map_data/1

def map_data(%{data: data}) do
  # …
end

but I’m not sure if it’s too much. An additional extra argument could be passed anyway, so there is no need for that …

field :first, :integer
field :second, :integer
field :third, :integer, map: &map_third(first, &1)

def map_third(first, third) do
  # …
end

If you want to support it, then all you have to do is make &map_third(first, &1) work the same as map_third(first, third).

1 Like

I got it working to support function calls. It essentially mimics the environment (Macro.Env) of the parent schema (even though embedded_schemas are their own modules), so you can call functions “local” to the parent without qualified names, and the same goes for anything imported into the parent.

Nice! Regarding the warning … I guess it’s because you generate a sub-module in Test, but why?

In general I don’t like generating modules and I try to avoid it when possible. However, if for some reason you need it, then I would allow naming such a module, so that when reading the code it’s more clear (especially for those who have already tried Ecto).

Finally, you should be able to move all imports not used in the parent into the do … end block of the embedded_schema macro call. This way you could generate the module and avoid this warning, but I would rather think twice about whether module generation is really needed.

Right now it is generating a module to do this. I tried to think of a better way, and the main other option I see (haven’t tested it yet) is using Macro.Env.define_import to dynamically import the module at runtime, but 1. that API is new in Elixir 1.17 (which I don’t mainly use yet) and 2. dynamic imports seem worse to me than compile-time definitions of small modules.

The main reason I’m doing the module generation is that embedded schemas are separate modules but are being defined during the parent’s definition, so I can’t simply add an import of the parent in the embedded schema module.

A solution that wouldn’t use module generation would be to just make a function in the parent like

def env, do: __ENV__

and call this at runtime to fetch the environment, but then you wouldn’t get any of the functions defined in the parent and would therefore have to use fully qualified module names for those calls.

I’m open to suggestions though. The module generation is only really being done to get a Macro.Env that gives the natural behavior you would get writing an expression in the parent module, essentially to make writing the expressions more intuitive.

Again, I’m more okay with compile-time generation than doing something that dynamically imports code at runtime, but I suppose it could be avoided altogether by enforcing the qualified-names caveat.