Announcing Data Schemas! Flexible, declarative descriptions of how to create structs from input data

Data schemas are declarative descriptions of how to create a struct from some input data. You can set up different schemas to handle different kinds of input data. By default the incoming data is assumed to be a map, but you can configure schemas to work with any kind of input, including XML and JSON.

Data is selected from the input data and passed to a casting function before being set as a value under a key on the struct you want to build.
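As a rough mental model only, the select, cast, set flow for a map input could be sketched as below. This is a hand-rolled illustration, not how DataSchema is implemented; `MiniSchema` and its `{key, path, cast_fun}` tuples are hypothetical names, and it builds a plain map rather than a struct.

```elixir
defmodule MiniSchema do
  # Hypothetical helper, not part of DataSchema: builds a map from
  # {key, path, cast_fun} tuples to illustrate select -> cast -> set.
  def build(input, fields) do
    Enum.reduce_while(fields, {:ok, %{}}, fn {key, path, cast}, {:ok, acc} ->
      # 1. Select the raw value from the input.
      raw = Map.get(input, path)

      # 2. Cast it, then 3. set it under the key (or stop at the first failure).
      case cast.(raw) do
        {:ok, value} -> {:cont, {:ok, Map.put(acc, key, value)}}
        error -> {:halt, {:error, {key, error}}}
      end
    end)
  end
end

fields = [
  {:content, "content", &{:ok, to_string(&1)}},
  {:date, "date", &Date.from_iso8601/1}
]

result = MiniSchema.build(%{"content" => "Hello", "date" => "2021-11-11"}, fields)
IO.inspect(result)
```

Stopping at the first failed cast is a design choice of this sketch; it keeps the return value to a single ok or error tuple.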

Check out the docs, guides and README for more detailed information on how it works. Below is a flavour of what you can do.

A simple struct

First, let’s assume that your input data is a map with string keys. DataSchema really shines when working with APIs because we can quickly convert an API response into trusted Elixir data:

input = %{
  "content" => "This is a blog post",
  "comments" => [%{"text" => "This is a comment"}, %{"text" => "This is another comment"}],
  "draft" => %{"content" => "This is a draft blog post"},
  "date" => "2021-11-11",
  "time" => "14:00:00",
  "metadata" => %{"rating" => 0}
}

Now let’s define a schema to create a BlogPost struct from the above input data:

defmodule BlogPost do
  import DataSchema, only: [data_schema: 1]

  data_schema([
    field: {:content, "content", &BlogPost.to_okay_string/1},
  ])
  
  def to_okay_string(value) do
    {:ok, to_string(value)}
  end
end

The above is equivalent to:

defmodule StringType do
  @behaviour DataSchema.CastBehaviour

  @impl true
  def cast(value) do
    {:ok, to_string(value)}
  end
end

defmodule BlogPost do
  import DataSchema, only: [data_schema: 1]

  data_schema([
    field: {:content, "content", StringType},
  ])
end
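The casts shown so far always succeed. A cast that can reject bad input might be sketched as follows. Treat the failure value as an assumption: this post only shows the `{:ok, value}` shape, so check the docs for the exact error contract. `IntegerType` is a hypothetical name, and the `@behaviour` line is commented out so the snippet runs without the dependency installed.

```elixir
defmodule IntegerType do
  # @behaviour DataSchema.CastBehaviour  (uncomment when data_schema is a dependency)

  # Integers pass through unchanged.
  def cast(value) when is_integer(value), do: {:ok, value}

  # Strings are accepted only if the whole string parses as an integer.
  def cast(value) when is_binary(value) do
    case Integer.parse(value) do
      {int, ""} -> {:ok, int}
      _ -> :error
    end
  end

  # Anything else is rejected. Returning :error here is an assumption.
  def cast(_other), do: :error
end

IO.inspect(IntegerType.cast("42"))
IO.inspect(IntegerType.cast("4 stars"))
```

Such a module could then be used for something like the "rating" value in the input above, in the same position as StringType.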

Now that you have defined your schema, you can simply call DataSchema.to_struct/2:

DataSchema.to_struct(input, BlogPost)
# => {:ok, %BlogPost{content: "This is a blog post"}}

A more complex example

You can define a few kinds of fields (see the docs for more info). Here is a more complex example that introduces more field types:

  defmodule DraftPost do
    import DataSchema, only: [data_schema: 1]
    data_schema(field: {:content, "content", StringType})
  end

  defmodule Comment do
    import DataSchema, only: [data_schema: 1]
    data_schema(field: {:text, "text", StringType})
  end

  defmodule BlogPost do
    import DataSchema, only: [data_schema: 1]

    @mapping [
      field: {:date, "date", &Date.from_iso8601/1},
      field: {:time, "time", &Time.from_iso8601/1}
    ]
    data_schema(
      field: {:content, "content", StringType},
      has_many: {:comments, "comments", Comment},
      has_one: {:draft, "draft", DraftPost},
      list_of: {:list_of, "comments", &{:ok, &1["text"]}},
      aggregate: {:post_datetime, @mapping, &BlogPost.to_datetime/1}
    )

    def to_datetime(%{date: date, time: time}) do
      NaiveDateTime.new(date, time)
    end
  end

DataSchema.to_struct(input, BlogPost)
# The above returns:
{:ok, %BlogPost{
  list_of: ["This is a comment", "This is another comment"],
  comments: [
    %Comment{text: "This is a comment"},
    %Comment{text: "This is another comment"}
  ],
  content: "This is a blog post",
  draft: %DraftPost{content: "This is a draft blog post"},
  post_datetime: ~N[2021-11-11 14:00:00]
}}
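The aggregate field relies on the fact that the casts in @mapping and the combining function all return ok tuples. The standard-library calls involved can be tried on their own, without DataSchema, to see the shapes being passed around:

```elixir
# Cast the two raw strings the same way the fields in @mapping do.
{:ok, date} = Date.from_iso8601("2021-11-11")
{:ok, time} = Time.from_iso8601("14:00:00")

# Combine them as to_datetime/1 does; the result is itself an ok tuple.
result = NaiveDateTime.new(date, time)
IO.inspect(result)

# Malformed input comes back as an error tuple instead of raising.
bad = Date.from_iso8601("11/11/2021")
IO.inspect(bad)
```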

Different Input Data - aka Are these not just embedded schemas from Ecto?

The examples so far have shown functionality that is very similar to what you can get from Ecto’s embedded schemas and data casting capabilities. However, in DataSchema we can also provide different data accessors. This allows us to define schemas that can be cast from different kinds of input data, for example…

XML Schemas

Let’s imagine that we have some XML that we wish to turn into a struct. What would it take to enable that? First, a new XPath data accessor:

defmodule XpathAccessor do
  @behaviour DataSchema.DataAccessBehaviour
  import SweetXml, only: [sigil_x: 2]

  @impl true
  def field(data, path) do
    SweetXml.xpath(data, ~x"#{path}"s)
  end

  @impl true
  def list_of(data, path) do
    SweetXml.xpath(data, ~x"#{path}"l)
  end

  @impl true
  def has_one(data, path) do
    SweetXml.xpath(data, ~x"#{path}")
  end

  @impl true
  def has_many(data, path) do
    SweetXml.xpath(data, ~x"#{path}"l)
  end
end

Let’s define our schemas like so:

defmodule DraftPost do
  import DataSchema, only: [data_schema: 1]

  @data_accessor XpathAccessor
  data_schema([
    field: {:content, "./Content/text()", StringType}
  ])
end

defmodule Comment do
  import DataSchema, only: [data_schema: 1]

  @data_accessor XpathAccessor
  data_schema([
    field: {:text, "./text()", StringType}
  ])
end

defmodule BlogPost do
  import DataSchema, only: [data_schema: 1]

  @data_accessor XpathAccessor
  @datetime_fields [
    field: {:date, "/Blog/@date", &Date.from_iso8601/1},
    field: {:time, "/Blog/@time", &Time.from_iso8601/1},
  ]
  data_schema([
    field: {:content, "/Blog/Content/text()", StringType},
    has_many: {:comments, "//Comment", Comment},
    has_one: {:draft, "/Blog/Draft", DraftPost},
    aggregate: {:post_datetime, @datetime_fields, &NaiveDateTime.new(&1.date, &1.time)},
  ])
end

And now we can transform as above:

source_data = """
<Blog date="2021-11-11" time="14:00:00">
  <Content>This is a blog post</Content>
  <Comments>
    <Comment>This is a comment</Comment>
    <Comment>This is another comment</Comment>
  </Comments>
  <Draft>
    <Content>This is a draft blog post</Content>
  </Draft>
</Blog>
"""

DataSchema.to_struct(source_data, BlogPost)

# This will output:

{:ok, %BlogPost{
   comments: [
     %Comment{text: "This is a comment"},
     %Comment{text: "This is another comment"}
   ],
   content: "This is a blog post",
   draft: %DraftPost{content: "This is a draft blog post"},
   post_datetime: ~N[2021-11-11 14:00:00]
 }}

Data Accessor - An Access example.

Let’s look back at our map version.

input = %{
  "content" => "This is a blog post",
  "comments" => [%{"text" => "This is a comment"}, %{"text" => "This is another comment"}],
  "draft" => %{"content" => "This is a draft blog post"},
  "date" => "2021-11-11",
  "time" => "14:00:00",
  "metadata" => %{"rating" => 0}
}

We could define a data accessor that looks like this:

defmodule AccessDataAccessor do
  @behaviour DataSchema.DataAccessBehaviour

  @impl true
  def field(data, path) do
    get_in(data, path)
  end

  @impl true
  def list_of(data, path) do
    get_in(data, path)
  end

  @impl true
  def has_one(data, path) do
    get_in(data, path)
  end

  @impl true
  def has_many(data, path) do
    get_in(data, path)
  end
end

Now we can define our schema:

defmodule Blog do
  import DataSchema, only: [data_schema: 1]

  @data_accessor AccessDataAccessor
  data_schema([
    list_of: {:comments, ["comments", Access.all(), "text"], &{:ok, to_string(&1)}},
  ])
end

And create a struct from this:

input = %{
  "content" => "This is a blog post",
  "comments" => [%{"text" => "This is a comment"}, %{"text" => "This is another comment"}],
  "draft" => %{"content" => "This is a draft blog post"},
  "date" => "2021-11-11",
  "time" => "14:00:00",
  "metadata" => %{"rating" => 0}
}
DataSchema.to_struct(input, Blog)
# Returns:
{:ok, %Blog{comments: ["This is a comment", "This is another comment"]}}
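Since AccessDataAccessor simply delegates to get_in/2, the path given in the schema can be explored directly in IEx. A small sketch using only the standard library and a trimmed copy of the input:

```elixir
input = %{
  "comments" => [%{"text" => "This is a comment"}, %{"text" => "This is another comment"}],
  "metadata" => %{"rating" => 0}
}

# Access.all/0 applies the rest of the path to every element of the list.
texts = get_in(input, ["comments", Access.all(), "text"])
IO.inspect(texts)

# Plain keys walk nested maps.
rating = get_in(input, ["metadata", "rating"])
IO.inspect(rating)

# A missing key yields nil rather than raising.
missing = get_in(input, ["draft", "content"])
IO.inspect(missing)
```

Note that a missing key produces nil, so the casting function for such a field has to be prepared to receive nil.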

This is still an early version. There are some features planned before a v1, but it is certainly usable as is.


Update:

Livebooks added to the repo.