Does it make sense for Ecto to handle raw data structures instead of only maps?

I’m the author of the peri library, which aims to replicate the idea of the plumatic/schema library from the Clojure ecosystem.

The idea is to define schemas and validations with raw Elixir data structures and also accept raw data as input, which enables extensibility and schema composition.

A simple raw data example:

schema = {:integer, {:eq, 12}}
Peri.validate(schema, 12)
# {:ok, 12}
Peri.validate(schema, "12")
# {:error, %Peri.Error{}}

A more complex raw data structure example:

schema = {:list, {:tuple, [:atom, :any]}}
Peri.validate(schema, ok: 1)
# {:ok, ...}
Peri.validate(schema, foo: :bar, bar: [:hello])
# {:ok, ...}

Peri also supports schema definitions as maps (with atom or string keys) and keyword lists, so it’s kind of a merge of plumatic/schema, nimble_options, and ecto.

This introduction aims to set up the following discussion: Peri users often ask for type casting like Ecto does, and I’ll probably implement it, but I’d also like to somehow integrate Peri with Ecto, something like a Peri.to_changeset/1 function that receives a Peri schema definition and returns a schemaless changeset.
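For context, such a function could lean on Ecto’s existing schemaless changeset support, which already works from a plain map of field types instead of a schema module (the sketch below uses only real, documented Ecto.Changeset functions; the idea of deriving the `types` map from a Peri schema is the proposal, not an existing API):

```elixir
import Ecto.Changeset

# Ecto's existing schemaless changeset API: a {data, types} tuple
# stands in for a schema module.
types = %{name: :string, age: :integer}
params = %{"name" => "zoey", "age" => "25"}

{%{}, types}
|> cast(params, Map.keys(types))
|> validate_required([:name])
|> apply_action(:parse)
# {:ok, %{name: "zoey", age: 25}}
```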

So, given that, does it make sense for Ecto to handle raw data structures, at least for schemaless changesets? Something like:

type = :string
param = "some string"

type
|> Ecto.Changeset.cast(param)
|> Ecto.Changeset.validate_length(min: 5)
|> Ecto.Changeset.apply_action(:parse)
# {:ok, "some string"}

I do understand this would probably be a breaking change, since all Ecto.Changeset.validate_* functions expect a field argument, and Ecto.Changeset.cast/3 expects only maps. But the idea of this post is to start a discussion around the topic, not to discuss how the implementation would look. Wdyt?

I didn’t find a structured/centralized discussion like this, but I do remember seeing some questions or issues that mentioned the topic.

Also, I’m not trying to “change” Ecto to fit my library’s needs, but my library’s needs kind of inspired this discussion.


That’s an interesting question, but before we discuss: do you really want the tuple type inside the database? Imagine somebody wants to load data from the same database using a programming language that does not have tuples. What will they do?

That’s a good question. However, this discussion tries to decouple Ecto from being database-centric. Schemaless changesets were the first step toward this idea and allow mapping and validating raw maps that wouldn’t be inserted into a database. Database type handling and so on should be handled by the Ecto adapters and the library.

You may find Ecto.Type.cast/2 more useful for this - if you skip all the map-based parts of Ecto.Changeset.cast, it is what’s left:
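As a quick illustration (with ecto as a dependency), Ecto.Type.cast/2 casts a bare value against a primitive type, no map or field involved:

```elixir
# Cast a single value against a primitive Ecto type.
Ecto.Type.cast(:integer, "12")
# {:ok, 12}

Ecto.Type.cast(:integer, "twelve")
# :error

# Composite types work too, e.g. arrays of a primitive type.
Ecto.Type.cast({:array, :integer}, ["1", "2"])
# {:ok, [1, 2]}
```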

My most recent project was a data platform, where part of the API contract had a field to specify the expected types of attached data, since we needed to store them as strings in the lake and type cast and/or type transform in the warehouse in an ELT process.

The principles behind it sound similar to what you are trying to achieve.
The suggestion for Ecto.Type is a valid one, but since you are writing a library, it might be worth considering whether you’re reaching for a package where one isn’t necessary.

The set of supported types is relatively finite, so building this without a package at all is quite feasible.

We wrote the casts ourselves, and while it has some domain logic and type signatures specific to our API, this was all we needed:


defmodule TypeCasts do
  @moduledoc false

  # An unknown/missing type is an error; a nil value casts to nil.
  @spec cast(String.t() | nil, String.t() | nil) :: {:ok, any()} | {:error, map() | nil}
  def cast(nil, _value), do: {:error, nil}
  def cast(_type, nil = _value), do: {:ok, nil}
  def cast("int", ""), do: {:ok, nil}
  def cast("int", value), do: parse_value(value, &Integer.parse/1, "int")
  def cast("float", value), do: parse_value(value, &Float.parse/1, "float")
  def cast("string", value), do: {:ok, value}
  def cast("text", value), do: {:ok, value}

  def cast("bool", value) do
    parse_value(
      value,
      fn
        "true" -> {true, ""}
        "false" -> {false, ""}
        _anything_else -> {nil, :error}
      end,
      "bool"
    )
  end

  defp parse_value("", _parse_fun, _type), do: {:ok, ""}
  defp parse_value("NaN", _parse_fun, "float"), do: {:ok, nil}

  defp parse_value(value, parse_fun, type) do
    case parse_fun.(value) do
      {parsed_value, ""} -> {:ok, parsed_value}
      # Accept a trailing ".0" so values like "12.0" still cast cleanly to an int
      {parsed_value, ".0"} -> {:ok, parsed_value}
      _error -> {:error, %{value: value, data_type: type}}
    end
  end
end
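Assuming the TypeCasts module above is compiled, usage looks like this (outputs derived from the clauses shown above):

```elixir
TypeCasts.cast("int", "42")
# {:ok, 42}

TypeCasts.cast("float", "NaN")
# {:ok, nil}

TypeCasts.cast("bool", "yes")
# {:error, %{value: "yes", data_type: "bool"}}
```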


There are a lot of libraries in the ecosystem that seek to do this kind of thing. I’ve even written my own which works with XML and JSON, and in theory any kind of input data. I originally tried to get XML parsing/validating working with Ecto custom types, but it was a non-starter.

I don’t think retrofitting more general data casting capabilities into ecto really makes sense because it’s a stable lib used by a lot of people and it was really designed with db tables and webforms in mind. It would likely be a big change and it’s not clear what the benefit would be.


We are in agreement. I was suggesting not using Ecto for this: typing is a small part of its functionality, and it doesn’t make sense to expand that functionality for a specific use case when hand-rolling the use case is trivial.

My example does not use Ecto types, and I definitely don’t think this is the right case for them.
I’ve had two instances where I needed Ecto custom types.

  • I needed to roll compression + decompression for massive blobs
  • I stored all options in the db as well for history, but the code worked best with keyword lists, so I used an Ecto Type to handle that. I tried plain casting and it did not work as well.

My general point was this problem feels like it’s defined enough and contained enough to justify rolling your own solution.

I think I replied to the wrong message :see_no_evil: but yes, I agree with what you were saying.
