Polymorphic embed in Ecto

alexcastano · January 10, 2020, 9:34am

Hello,

I want to store structured raw input in a JSONB column. This raw input is already in an embedded schema because it has been validated beforehand. However, the raw input can be of different types, and some of them are quite complex, with deeply nested structures.

When I had only 4 different types, I had a column for each one and I handled the polymorphism by myself:

schema "table" do
  ...
  embeds_one :type_one, TypeOne
  embeds_one :type_two, TypeTwo
  ...
end

But now I want to add many more types, and I don’t think it is efficient to have 30 different columns. So I would like to store the data in a single column and, when loading the record, use a type field to decide which structure to build.

I’m trying to use Ecto.Type to do this job, but I find myself doing a lot of work that was previously handled by Ecto:

First, I have to @derive {Jason.Encoder, only: [...]} in a lot of schemas. It feels wrong because I’m forced to dedicate this protocol to one particular storage format. If I later want to use the same protocol for another purpose, for example to encode responses in a JSON API, I won’t be able to.

Also, I’m writing a lot of functions to load the raw JSON into the structures, basically using cast and cast_embed with all the stored fields.

So, I don’t know if I’m missing something, but inside my custom Ecto.Type I’d like to reuse the functionality that Ecto already provides for embeds_one (load, dump, embed_as, etc.). Do you think that would be possible?

Do you think this is a good solution in general terms? Any advice?

Eiji · January 10, 2020, 9:59am

@alexcastano Of course it’s possible, but Ecto itself does not help with complex polymorphic use cases, so you would need to write it yourself.

First of all, your JSON would need to look like this:

{"data": …, "type": "your_type_name"}

For this you would need to create a custom Ecto.Type. In each callback you check the type field and, based on it, delegate to one of n modules that implement the same callbacks, one module per type.

Let’s say:

defmodule MyApp.EctoTypes.MainJSON do
  use Ecto.Type

  # `ecto_callback` is a placeholder for each Ecto.Type callback (cast, dump, load)
  def ecto_callback(%{data: data, type: type}) do
    with {:ok, result} <- type |> find_module() |> apply(:ecto_callback, [data]) do
      {:ok, %{data: result, type: type}}
    end
  end

  defp find_module("first_type"), do: MyApp.EctoTypes.FirstType
  defp find_module("second_type"), do: MyApp.EctoTypes.SecondType
  defp find_module("third_type"), do: MyApp.EctoTypes.ThirdType
end

defmodule MyApp.EctoTypes.FirstType do
  # no need to use Ecto.Type here,
  # as these "sub types" are only called through Kernel.apply/3
  # in order to keep the code for each type separate

  def ecto_callback(data) do
    # …
  end
end

This way you can write the code for each type separately. You only need to implement the callbacks that Ecto.Type requires (type, cast, load and dump, if I remember correctly).

Once you have this, everything else you need to do (like migrations) is exactly the same as with embeds_many or embeds_one. I wrote all of this from memory, but it should work even with arrays.

defmodule MyApp.MyContext.MySchema do
  use Ecto.Schema

  alias MyApp.EctoTypes.MainJSON

  schema "table_name" do
    …
    field(:field_name, {:array, MainJSON})
    …
  end
end

The only difference is that you no longer need to call cast_embed and related functions as everything would be handled in cast.

alexcastano · January 10, 2020, 10:58am

Thank you for your response.

What I was wondering is whether I can save myself writing all the code of MyApp.EctoTypes.FirstType, MyApp.EctoTypes.SecondType, etc., because Ecto already knows how to dump and load an embedded_schema by itself. I mean, if I have:

defmodule MyApp.OneType do
  use Ecto.Schema

  embedded_schema do
    ...
  end
end

defmodule MyApp.Foo do
  use Ecto.Schema

  schema "foos" do
    embeds_one :one, MyApp.OneType
  end
end

# Here Ecto casts and dumps OneType
foo =
  %Foo{}
  |> Ecto.Changeset.cast(data, [])
  |> Ecto.Changeset.cast_embed(:one)
  |> Repo.insert!()

# Here Ecto loads OneType
Repo.get(Foo, foo.id)

So what I’d like to do would be:

defmodule MyApp.EctoTypes.MainJSON do
  use Ecto.Type

  def dump(%{data: data, type: type}) do
    # This returns MyApp.OneType
    module = find_module(type)

    # This is the line I hope Ecto writes for me
    with {:ok, dumped} <- Ecto.Type.dump(module, data) do
      {:ok, %{data: dumped, type: type}}
    end
  end

  # Something similar for cast, load, embed_as, equal?, etc.
  ...
end

al2o3cr · January 10, 2020, 12:23pm

Have you looked at https://github.com/greenboxal/ecto_poly ? I haven’t tried it yet, but it seems pretty close to what you’re looking for.

EDIT: looks like Ecto 3 compatibility is still in-flight https://github.com/greenboxal/ecto_poly/pull/1

Eiji · January 10, 2020, 4:41pm

Of course, you can just use a map and not bother with custom types and validations. But then you lose the validations, and if something goes wrong you would need to migrate all the inserted rows, which I don’t think would be a nice way to spend your “free weekend”. In that case just use the :map type in the field/2 macro.

All those custom types exist to validate each type separately. Nobody forces you to validate all the data you have.

tme_317 · January 10, 2020, 7:31pm

I had a similar issue and couldn’t figure it out, so I am just storing compressed ETF in a bytea database field. That way I can store arbitrary, highly nested structs/embedded schemas in a single database field.

It works using a custom Ecto type that dumps using :erlang.term_to_binary and loads using :erlang.binary_to_term

Of course it’s not JSON, so you can’t run PostgreSQL queries against its contents, but I don’t need that in my use case.
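A minimal sketch of such a type, assuming the four callback names that Ecto.Type expects (type, cast, load, dump). The module name Sketch.ETF is hypothetical, and `use Ecto.Type` is left out so that the snippet runs without Ecto installed; in a real project you would add it.

```elixir
defmodule Sketch.ETF do
  # Hypothetical module name. In an Ecto project, add `use Ecto.Type` here
  # so that the functions below become the Ecto.Type callbacks.

  # The underlying database type: bytea maps to :binary.
  def type, do: :binary

  # No validation in this sketch: any term is accepted as-is.
  def cast(term), do: {:ok, term}

  # Serialize any Erlang/Elixir term to compressed External Term Format.
  def dump(term), do: {:ok, :erlang.term_to_binary(term, [:compressed])}

  # Deserialize. The :safe option makes decoding fail instead of creating
  # new atoms, which matters if the stored bytes could ever be untrusted.
  def load(binary) when is_binary(binary) do
    {:ok, :erlang.binary_to_term(binary, [:safe])}
  rescue
    ArgumentError -> :error
  end

  def load(_other), do: :error
end
```

With this, `Sketch.ETF.dump/1` followed by `Sketch.ETF.load/1` gives back the original term, including nested maps, tuples and structs whose modules are loaded.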

alexcastano · January 15, 2020, 11:25am

Yeah, exactly. This was what I was looking for. I’ll take a look as soon as possible. Thank you.

The point is that I already have the embedded schema validated. The only thing I want to do is the dump and load part.

Awesome! I’ve never thought about this solution. It doesn’t work for my current case, but I find it very interesting.

tme_317 · January 15, 2020, 6:25pm

I just found this issue/conversation that seems to be exactly what you are trying to do… maybe this helps?

alexcastano · January 17, 2020, 10:00am

Yes! As far as I understand, they use EctoMorph to encode and decode the embedded schema in a JSON column. EctoMorph uses Ecto’s schema reflection to build this functionality. They also use an Ecto.Type to decide which embedded schema should be built. It is somewhat manual work, but it works :).

I think EctoPoly tried to do something very similar, as a more finished solution, but it does not fully work with the latest Ecto versions.
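To make the idea concrete, here is a rough sketch of the type-tagged dump/load step. It is not EctoMorph’s API: plain structs stand in for the embedded schemas, and every name (Sketch.Circle, Sketch.Rect, Sketch.Envelope, the "circle"/"rect" tags) is made up for illustration.

```elixir
defmodule Sketch.Circle do
  defstruct [:radius]
end

defmodule Sketch.Rect do
  defstruct [:width, :height]
end

defmodule Sketch.Envelope do
  # Hypothetical registry: the stored "type" tag => the struct to build.
  @types %{"circle" => Sketch.Circle, "rect" => Sketch.Rect}

  # Struct -> JSON-friendly map of the form {"type": ..., "data": ...}.
  def dump(%module{} = struct) do
    case Enum.find(@types, fn {_name, mod} -> mod == module end) do
      {name, _mod} ->
        data =
          struct
          |> Map.from_struct()
          |> Map.new(fn {key, value} -> {Atom.to_string(key), value} end)

        {:ok, %{"type" => name, "data" => data}}

      nil ->
        :error
    end
  end

  def dump(_other), do: :error

  # Decoded JSON (string keys) -> struct chosen by the "type" tag.
  def load(%{"type" => name, "data" => data}) when is_map(data) do
    case Map.fetch(@types, name) do
      {:ok, module} ->
        # Only copy the fields the struct declares; unknown keys are dropped.
        fields = module |> struct() |> Map.from_struct() |> Map.keys()
        attrs = for key <- fields, do: {key, Map.get(data, Atom.to_string(key))}
        {:ok, struct(module, attrs)}

      :error ->
        :error
    end
  end

  def load(_other), do: :error
end
```

A real version would also have to recurse into nested embeds and convert field types such as dates, which is the part the libraries take care of.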

mathieuprog · May 30, 2020, 4:15pm

I published a library that brings support for polymorphic embeds