Ecto - embeds_many as single map with unique keys

I have a Parent model, which embeds_many() children models. Children are then stored in a jsonb column as a List (Array in JSON) of simple, two-key maps (Objects), where values of one of the keys of those maps need to be unique. Something like:

[
	{"name":"name0", "amount":"10.00"},
	{"name":"name1", "amount":"12.20"},
	{"name":"name2", "amount":"10.00"},
	[…]
	{"name":"nameN", "amount":"56.00"}
]

The value of name has to be unique. I handle the uniqueness on create/update, and that works. But then I keep spending cycles all around the application converting the retrieved list into a single map like:

%{
	"name0" => "10.00",
	"name1" => "12.20",
	"name2" => "10.00",
	[…]
	"nameN" => "56.00"
}
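
For reference, the conversion in question is roughly the following. This is only a sketch in plain Elixir; the variable names are made up for illustration, and the string keys assume the list comes back from jsonb unchanged.

```elixir
# The list as it is loaded from the jsonb column (string keys).
children = [
  %{"name" => "name0", "amount" => "10.00"},
  %{"name" => "name1", "amount" => "12.20"},
  %{"name" => "name2", "amount" => "10.00"}
]

# Build a single map keyed by name. If a name ever appeared twice,
# the later entry would silently overwrite the earlier one.
by_name =
  Map.new(children, fn %{"name" => name, "amount" => amount} ->
    {name, amount}
  end)
```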

Any suggestions for a “proper” way to get the data stored as a single map (JSON object), while retaining the possibility to use Phoenix form helpers / inputs_for, validations and other goodies?

Custom Ecto type? Or? Some examples, maybe?

This is full of caveats:

  • inputs_for only works for related assocs or embeds, which are one or a list of related records. There’s nothing like a map for related records.
  • The :map and {:map, value_type} native Ecto types are not supported natively by Phoenix form handling. You can make things work by manually creating input names and assigning values to inputs, but that is not covered out of the box. Using an Ecto type instead of a relationship also means you forgo a number of relationship features, e.g. per-item errors and sort/drop support, and likely a few more.
  • How do you even model a “map” using form inputs in an abstract manner in the first place? There are surely many specific examples of how to do that, but I’m not sure there’s a generic version Phoenix could implement.

So really I’d treat the data like you do and let the goal of it becoming a map be an implementation detail. You could consider separating the “write model” from the “read model” here and let the read model read the data from the db in the correct format / into a :map field.

Thank you, Benjamin. I spent a good part of the day trying to figure out something reasonable and ran into basically all the caveats you listed, with no apparent solution. At least none that would not make things even more smelly than they are now.

One thing made me curious though. Since I have the write part working, and the main problem is that I read the data in the form it is stored (i.e. as a list of maps rather than a map), maybe the separation of models you mentioned is worth checking.

How does one do this? Any examples or links to documentation?

Something like this:

defmodule MapFromArray do
  use Ecto.Type

  def type, do: {:array, :map}

  def load(data) do
    with {:ok, list} <- Ecto.Type.load(type(), data) do
      {:ok, Map.new(list, &{&1["name"], &1["amount"]})}
    end
  end

  def embed_as(_), do: :dump

  […]
end

defmodule ReadParent do
  use Ecto.Schema

  schema "parents" do
    field :children, MapFromArray
  end
end
  
  
Repo.insert_all("parents", [
  %{children: [%{name: "abc", amount: 2}, %{name: "def", amount: 5}]}
])

[%{children: %{"abc" => 2, "def" => 5}}] = Repo.all(ReadParent)

Could even consider Ecto.ParameterizedType if you don’t like hardcoding key/value keys.
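
If the same type were also used for writes, dump would need the inverse conversion. A minimal sketch of that direction in plain Elixir follows; the MapToArraySketch module and its to_list/1 function are hypothetical names, not part of Ecto, and the "name"/"amount" keys are hardcoded as in the example above.

```elixir
defmodule MapToArraySketch do
  # Hypothetical helper: turns %{"abc" => 2} back into
  # [%{"name" => "abc", "amount" => 2}].
  # Maps do not preserve the original list order, so entries are
  # sorted by name here to get a stable result.
  def to_list(map) when is_map(map) do
    map
    |> Enum.sort_by(fn {name, _amount} -> name end)
    |> Enum.map(fn {name, amount} -> %{"name" => name, "amount" => amount} end)
  end
end

list = MapToArraySketch.to_list(%{"def" => 5, "abc" => 2})
```

Keep in mind that the original ordering of the stored list is lost once it has been turned into a map.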

I’m in a similar situation right now and did something along the lines suggested here. I expect every module that is used as the value type to define a new/1 function, which the type calls to load the data; at the moment new/1 just calls MyModule.changeset and Ecto.Changeset.apply_changes. The data is always transformed from a JSON list representation:

field(:entities, EctoSupport.EmbeddedMap,
  value: Entity,
  key_name: :name,
  default: %{}
)

In this usage example, Entity needs to define a new/1 function that converts the given map to a struct. The returned struct must have a field called name, which serves as the key when transforming from list to map.
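
To illustrate that contract without pulling in Ecto, here is a sketch with a plain struct. EntitySketch is a hypothetical stand-in for Entity: its new/1 simply pattern-matches on the map instead of going through a changeset, and the :amount field is made up for the example.

```elixir
defmodule EntitySketch do
  # Hypothetical stand-in for a value module. A real one would
  # build the struct via changeset/2 and Ecto.Changeset.apply_changes/1.
  defstruct [:name, :amount]

  def new(%{"name" => name} = attrs) do
    %__MODULE__{name: name, amount: attrs["amount"]}
  end
end

json_list = [
  %{"name" => "abc", "amount" => 2},
  %{"name" => "def", "amount" => 5}
]

# List -> map, keyed by the field given as key_name (:name here).
entities =
  json_list
  |> Enum.map(&EntitySketch.new/1)
  |> Map.new(fn entity -> {Map.fetch!(entity, :name), entity} end)
```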

The actual implementation of EctoSupport.EmbeddedMap looks like this:

defmodule EctoSupport.EmbeddedMap do
  use Ecto.ParameterizedType

  def type(_params), do: :array

  def init(opts) do
    Code.ensure_compiled!(opts[:value])

    if !function_exported?(opts[:value], :new, 1) do
      raise "Module #{opts[:value]} given for value must implement new/1, got #{inspect(opts[:value].__info__(:functions))}"
    end

    %{
      value: opts[:value],
      key_name: opts[:key_name] || :id
    }
  end

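  # Casting external input: a list of maps becomes a map keyed by key_name.
  def cast(v, params) when is_list(v) do
    {:ok, from_json_list(v, params)}
  end

  # Loading from the database: same list-to-map conversion as cast/2.
  def load(db_value, _loader, params) when is_list(db_value) do
    {:ok, from_json_list(db_value, params)}
  end

  # Dumping to the database: the keys are dropped and only the values
  # are stored, as a list.
  def dump(data, _dumper, _params) when is_map(data) do
    {:ok, Map.values(data)}
  end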

  defp from_json_list(value, params) do
    value
    |> Enum.map(&params[:value].new(&1))
    |> Map.new(&{get_in(&1, [Access.key(params[:key_name])]), &1})
  end
end

This, however, turns out not to work recursively: if Entity defines another field using EmbeddedMap, that nested field is stored as a map, which breaks on deserialization. I suppose I need to use the provided dumper to dig into the nested data properly, but so far I haven’t figured out how it needs to be called or passed. Could anyone here give me a pointer? The Ecto.ParameterizedType documentation (Ecto v3.13.3) did not make this clear to me.

I got this to work recursively, but I think I have a more precise question now: Is there some public Ecto API to dump a whole struct?

When writing test cases for nested datasets I figured out that Ecto probably uses something along the lines of Ecto.Type.dump(MyModule.__schema__(:type, :field)) to dump individual fields. So I’m now doing the same when dumping map values. The new implementation looks like this:

  def dump(data, dumper, params) when is_map(data) do
    {:ok, data |> Map.values() |> Enum.map(&dump_struct(&1, dumper, params))}
  end

  # Checks whether any of the fields on the given struct are an embedded
  # map themselves. If this is the case the field is dumped with the type
  # information attached to the schema.
  defp dump_struct(data, dumper, params) when is_struct(data, params.value) do
    keys = data |> Map.from_struct() |> Map.keys()
    Enum.reduce(keys, data, fn key, data ->
      case params.value.__schema__(:type, key) do
        {:parameterized, {EctoSupport.EmbeddedMap, _params}} = t ->
          {:ok, transformed} = dumper.(t, get_in(data, [Access.key!(key)]))
          put_in(data, [Access.key!(key)], transformed)
        _ ->
          data
      end
    end)
  end

This gives me recursion on my own types and has so far passed my synthetic and practical tests. But it feels weird, as I …

  • enumerate over all fields instead of somehow specifically grabbing those with matching types …
  • needed to jump through the Access.key! hoops to do the manipulation on the struct on the fly …
  • and disregard every field that is not an EmbeddedMap.

So there is probably a nicer way to do this.
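
One possible tidier variant, as a sketch only: walk the schema's field list via __schema__(:fields), let a comprehension pattern skip non-matching types, and use Map.fetch!/Map.replace! instead of the Access.key! calls. To keep it runnable without Ecto, FakeEntity below is a hand-written stand-in that mimics the __schema__ reflection functions, and the dumper is a fake with the same dumper.(type, value) shape as in the code above. The {:parameterized, {module, params}} tuple shape is taken from that code as well; it may look different in other Ecto versions.

```elixir
defmodule DumpSketch do
  # Dumps only those fields whose schema type is an EmbeddedMap;
  # every other field is left untouched.
  def dump_struct(%schema{} = data, dumper) do
    for field <- schema.__schema__(:fields),
        # Non-matching types are skipped by the generator pattern.
        {:parameterized, {EctoSupport.EmbeddedMap, _}} = type <-
          [schema.__schema__(:type, field)],
        reduce: data do
      acc ->
        {:ok, dumped} = dumper.(type, Map.fetch!(acc, field))
        Map.replace!(acc, field, dumped)
    end
  end
end

# Hypothetical stand-in for an Ecto schema (assumption: only the
# two reflection functions used above are needed).
defmodule FakeEntity do
  defstruct [:name, :children]

  def __schema__(:fields), do: [:name, :children]

  def __schema__(:type, :name), do: :string
  def __schema__(:type, :children), do: {:parameterized, {EctoSupport.EmbeddedMap, %{}}}
end

# Fake dumper: behaves like the non-recursive dump/3, map -> list of values.
fake_dumper = fn _type, value -> {:ok, Map.values(value)} end

entity = struct!(FakeEntity, name: "a", children: %{"x" => 1})
dumped = DumpSketch.dump_struct(entity, fake_dumper)
```

This still ignores every field that is not an EmbeddedMap, so it only addresses the first two points.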

Additional warning for anyone following this route: if you miss the def embed_as(_), do: :dump that was given in the example, you are going to have a painful time debugging why your maps are “sometimes” stored directly as a map instead of being converted to a list … In my case this only surfaced for types that are children of polymorphic_embeds_many or polymorphic_embeds_one.
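
One more detail, as far as I can tell from the Ecto docs: for an Ecto.ParameterizedType the callback is embed_as/2 (format and params), not the embed_as/1 of a plain Ecto.Type, and use Ecto.ParameterizedType provides a default that returns :self. A minimal sketch follows; EmbedAsSketch and its trivial callbacks are placeholders, and the Ecto version in Mix.install is an assumption.

```elixir
Mix.install([{:ecto, "~> 3.12"}])

defmodule EmbedAsSketch do
  use Ecto.ParameterizedType

  # Placeholder callbacks, only here so the module is complete.
  def type(_params), do: :any
  def init(opts), do: Map.new(opts)
  def cast(value, _params), do: {:ok, value}
  def load(value, _loader, _params), do: {:ok, value}
  def dump(value, _dumper, _params), do: {:ok, value}

  # Arity 2 for parameterized types. :dump tells Ecto to run dump/load
  # for this type inside embeds as well; the default is :self.
  def embed_as(_format, _params), do: :dump
end

result = EmbedAsSketch.embed_as(:json, %{})
```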