Saving an Ecto embedded_schema without empty embeds_many values

I am using Ecto embeds to store multiple Ecto structs in a Postgres jsonb column.

My issue is that if an embeds_many model in an embedded_schema does not contain
any data, a json value of "model": [] is still inserted into the jsonb column.

For example, I have a Message model with an entities column, and this is
the data that is saved in the database as the entities:

{"id": "18ebed18-568f-4583-b154-4dc44497998b", "urls": [{"id": "c82369d3-de6b-4d39-83ef-b42cf3484603", "url": "example.com", "description": null}], "topics": []}

Whereas, I would like it to be saved as follows, (notice "topics": [] is not present)

{"id": "18ebed18-568f-4583-b154-4dc44497998b", "urls": [{"id": "c82369d3-de6b-4d39-83ef-b42cf3484603", "url": "example.com", "description": null}]}

Here is the Message model

defmodule App.Message do
  use App.Web, :model

  schema "messages" do
    field :text, :string
    embeds_one :entities, App.Entities

    timestamps()
  end

  def changeset(struct, params \\ %{}) do
    struct
    |> cast(params, [:text])
    |> cast_embed(:entities)
    |> validate_required([:text])
  end
end

The Entities model

defmodule App.Entities do
  use App.Web, :model

  embedded_schema do
    embeds_many :urls, App.UrlEntity
    embeds_many :topics, App.TopicEntity
  end

  def changeset(struct, params \\ %{}) do
    struct
    |> cast(params, [])
    |> cast_embed(:urls)
    |> cast_embed(:topics)
  end
end

The UrlEntity model

defmodule App.UrlEntity do
  use App.Web, :model

  embedded_schema do
    field :url, :string
    field :description, :string
  end

  def changeset(struct, params \\ %{}) do
    struct
    |> cast(params, [:url, :description])
    |> validate_required([:url])
  end
end

And the TopicEntity model

defmodule App.TopicEntity do
  use App.Web, :model

  embedded_schema do
    field :name, :string
  end

  def changeset(struct, params \\ %{}) do
    struct
    |> cast(params, [:name])
    |> validate_required([:name])
  end
end

Running the following function will create the message with the entities saved as shown above.

%Message{}
|> Message.changeset(%{
  text: "example.com",
  entities: %{urls: [%{url: "example.com"}]}
})
|> Repo.insert!

Removing |> cast_embed(:topics) from the Entity changeset results in the same outcome.

There are two reasons it would be desirable for the saved value to not contain "topics": []:

  1. Save database disk space

  2. Postgres JSONB queries using NOT NULL

Ideally we could query SELECT * FROM messages WHERE entities->'topics' IS NOT NULL;.
But this query will return results if "topics": [] is present.

I would appreciate if anyone had thoughts on how to accomplish this using Ecto embeds, or perhaps a custom Poison encoder.

2 Likes

Did you find a solution for this? I am currently wondering how to do it too.

To achieve this I think there’s no way around creating a custom or parametrized type. See: Ecto.Type — Ecto v3.7.0