Ecto map/jsonb persisting keys with null values

I have an embedded_schema that I am serializing in an embeds_one :map column (to jsonb).
The embedded_schema has a bunch of settings fields where only a small subset of them are used for a given record.
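
For reference, here is a minimal sketch of the kind of setup I mean (module and field names are just placeholders):

# Hypothetical setup: the parent schema embeds a settings struct
# into a jsonb-backed :map column via embeds_one.
defmodule MyApp.Record do
  use Ecto.Schema

  schema "records" do
    embeds_one :settings, MyApp.Settings
  end
end

defmodule MyApp.Settings do
  use Ecto.Schema

  embedded_schema do
    field :setting1, :string
    field :setting2, :string
    field :setting3, :string
    field :setting4, :string
  end
end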

When persisted, the JSONB column contains the entire struct serialized
{"id": "123", "setting1": "value set", "setting2": null, "setting3": null, "setting4": null, …etc…}

I looked around for some way to filter the keys persisted to just those with a value and thus only persist:
{"id": "123", "setting1": "value set"}

but I'm having some trouble figuring it out… (possibly via some @derive somewhere?)

Maybe I’m missing the purpose of persisting map keys with null values to storage?

Any advice is appreciated!

A key with a null value is distinctly different from the key being absent entirely, and thus means something different, which is why they have to be separate states. :slight_smile:

An example:

A key with a null value could mean an entry exists but has no associated data.
A key with some value could mean an entry exists and it has that data.
No key/value could mean that such an entry does not exist at all.

Distinctly different states. :slight_smile:
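
To make it concrete with a plain map (just an illustration):

entry_missing = %{}                        # no key at all
entry_nil = %{"setting2" => nil}           # key exists, explicitly no data
entry_set = %{"setting1" => "value set"}   # key exists with data

Map.has_key?(entry_missing, "setting2")    # => false
Map.has_key?(entry_nil, "setting2")        # => true
entry_nil["setting2"]                      # => nil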

That makes sense, thanks for the reply!

I guess I'm in a weird situation…maybe abusing the purpose of embedded schemas. I am only providing a subset of the entries/keys to the changeset (i.e. only providing %{"setting1" => "value set"} as the changes on an empty struct), but simply because other fields (e.g. "setting2") are defined on the schema/struct, they are all persisted to storage. That seems inefficient at scale: when only "setting1" was toggled, we persist setting2…N => null in every record. Seems like the only alternative would be an EAV structure.

In my case, I end up with every record in my PG table with a ‘metadata’ column looking like:
{
  "id": "124c595c-ef41-4963-b905-542e412637ff",
  "max": null,
  "min": null,
  "formula": null,
  "precision": null,
  "date_format": null,
  "default_text": null,
  "include_time": null,
  "allow_multiple": null,
  "allow_negative": null,
  "select_options": null,
  "currency_symbol": null,
  "duration_format": null,
  "require_checked": true
}

I was mostly curious whether this is intentional or just a side effect of serializing an empty Elixir struct to JSON.

You don’t know that for a fact and you should not try to optimize things prematurely.

However, if you do need arbitrary keys to be in there that vary between records then not persisting the nil values might be desirable.

What’s your scenario and desired outcome anyway?

In the application I'm currently building, I have the dreaded user-defined "custom fields" on system objects issue (think of a user defining a custom select field "Plan name" on a system "Customer" record).

So I have a user able to define a CustomField schema which may be of many different types (text, select, checkbox, phone number, …~15 others). I realize now, from reading the responses, that an alternative would be to have separate embedded schemas for each potential type of field…so I plan to investigate whether there is a clean polymorphic embedded record story.

In that case you should, IMO, just use a Postgres document storage extension with thin Ecto wrapping code in your Elixir app.

You’ll tear your hair out trying to fiddle with polymorphic embedded structures.

If you switch from using an embedded schema to just a :map field, Ecto will only persist exactly what you tell it to. If you really want a struct to wrap the data, you could write a custom Ecto data type that dumps to and loads from a map.
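
If it helps, here is a rough sketch of that custom-type idea (the Metadata struct and its fields below are hypothetical examples): it keeps a struct on the Elixir side but drops nil values when dumping to the :map column.

defmodule MyApp.Metadata do
  defstruct [:max, :min, :precision, :require_checked]
end

defmodule MyApp.MetadataType do
  use Ecto.Type

  def type, do: :map

  # Accept either the struct itself or a map of attributes.
  def cast(%MyApp.Metadata{} = meta), do: {:ok, meta}

  def cast(%{} = attrs) do
    # Assumes atom keys for brevity; real code would validate/normalize keys.
    {:ok, struct(MyApp.Metadata, attrs)}
  end

  def cast(_), do: :error

  # Dump only the keys that actually hold a value.
  def dump(%MyApp.Metadata{} = meta) do
    map =
      meta
      |> Map.from_struct()
      |> Enum.reject(fn {_key, value} -> is_nil(value) end)
      |> Map.new(fn {key, value} -> {Atom.to_string(key), value} end)

    {:ok, map}
  end

  def dump(_), do: :error

  # Rebuild the struct from whatever keys were stored.
  def load(%{} = data) do
    fields = Map.new(data, fn {key, value} -> {String.to_existing_atom(key), value} end)
    {:ok, struct(MyApp.Metadata, fields)}
  end
end

The parent schema would then declare the column as field :metadata, MyApp.MetadataType instead of using embeds_one.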
