Casting into maps with string keys in Ecto

Hey folks,

I’m currently working on a new feature for Keila that lets users define custom fields for their newsletter sign-up forms.
I want to use Ecto to cast and validate data according to the settings chosen by the user. Since the field names are user-defined, I can’t use atom keys, though.

The solution I’ve come up with works like this:

1) Take params with the custom field names (e.g. %{"data" => %{"myCustomField" => "abc"}})
2) Create a mapping of the string keys to generic atoms (e.g. :field_1 instead of "myCustomField")
3) Create a changeset with the generic atom keys and the mapped params
4) Map it back into its final form if it’s valid.
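
The four steps above can be sketched roughly as follows. This is not the linked EctoStringMap module, just a minimal illustration of the idea; the module name `StringKeyCastSketch`, the function `cast_custom/2`, and the fixed pool of 50 slots are all assumptions made for the example.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule StringKeyCastSketch do
  import Ecto.Changeset, only: [cast: 3]

  # Hypothetical fixed pool of generic atoms, built at compile time,
  # so no atom is ever created from user input.
  @max_fields 50
  @slots for i <- 1..@max_fields, do: :"field_#{i}"

  # params:      %{"myIntField" => "123", ...}   (string keys, raw values)
  # field_types: %{"myIntField" => :integer, ...} (user-defined names => Ecto types)
  def cast_custom(params, field_types) when map_size(field_types) <= @max_fields do
    # Step 2: pair every user-defined name with a generic atom.
    mapping = field_types |> Map.keys() |> Enum.sort() |> Enum.zip(@slots)

    types = Map.new(mapping, fn {name, slot} -> {slot, Map.fetch!(field_types, name)} end)

    mapped_params =
      for {name, slot} <- mapping, Map.has_key?(params, name), into: %{} do
        {Atom.to_string(slot), Map.fetch!(params, name)}
      end

    # Step 3: a schemaless changeset over the generic atom keys.
    changeset = cast({%{}, types}, mapped_params, Map.keys(types))

    # Step 4: map the casted values back to the original string keys.
    if changeset.valid? do
      slot_to_name = Map.new(mapping, fn {name, slot} -> {slot, name} end)
      {:ok, Map.new(changeset.changes, fn {slot, value} -> {Map.fetch!(slot_to_name, slot), value} end)}
    else
      {:error, changeset}
    end
  end
end

StringKeyCastSketch.cast_custom(
  %{"myIntField" => "123", "myBoolField" => "true"},
  %{"myIntField" => :integer, "myBoolField" => :boolean}
)
|> IO.inspect()
```

Validations would slot in between the cast and the `valid?` check, operating on the generic atom keys.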

I’ve posted the module I’ve built here: Ecto StringMap · GitHub
This is an example using the module: Ecto StringMap example · GitHub

I’m quite happy with this solution and thought I’d share it in case anyone else is looking to achieve something similar.

There’s one thing I’m not happy about, though: I haven’t found a way to automatically hook into the functions that take care of casting regular embeds. So if you want to actually insert your StringMap changeset into your repo (instead of just using it in a Phoenix form), you have to call EctoStringMap.finalize_string_map/1 on the parent changeset.

I’d be happy about any comments or suggestions for a more elegant solution :)

What does converting to indexed atom keys afford you? Couldn’t you just embed the whole custom map under the :data key?

The casting casts the values to the right types (e.g. boolean, integer, etc) and allows the use of validations.

Edit:
Essentially, it converts this:

%{"data" => %{"myIntField" => "123", "myBoolField" => "true"}}

into

%SomeStruct{
  data: %{
    "myIntField" => 123,
    "myBoolField" => true
  }
}

So it tries to emulate an embedded field but with dynamic string keys instead of a fixed embedded schema.

But you don’t need random atom keys to do that, do you? Can’t you use a schemaless changeset to do your casting and validation?

As far as I can tell, it’s not possible to use schemaless changesets with string keys.

Edit:

types = %{"name" => :string, "email" => :string, "age" => :integer}
params = %{name: "Callum", email: "callum@example.com", age: 27}
changeset = {%{}, types} |> Ecto.Changeset.cast(params, Map.keys(types))
# => (ArgumentError) cast/3 expects a list of atom keys, got key: `"age"`

Did you infer the types somehow? Did the user provide a proper mapping of fields and types?

If the user provided the list of fields, I guess you’re constrained enough to convert the field names to atoms and use them for the schemaless changeset.

No, because users can theoretically create an infinite number of field names that would never get garbage collected, I can’t use atom keys. If you take a look at the example linked in my first post, you’ll see how the fields are defined.

My point is that there is a practical constraint: the fields have to be defined beforehand. You’re not turning arbitrary keys into atoms, only the ones you already know.
The problem with turning strings into atoms is not garbage collection but the limit on the number of atoms the VM supports, and this limit can be increased if you need to.

I’d rather monitor the number of atoms a running system has, trigger a warning when it reaches 10% of the VM limit, and then raise the limit by 30%. That way you deal with the atoms in the system in a more controlled manner and avoid breaking it, because you are monitoring your limits.
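
A minimal sketch of such a check, using the `:atom_count` and `:atom_limit` items of `:erlang.system_info/1`. The module name `AtomUsageSketch` and the return shapes are made up for illustration; the 10% threshold is the one suggested above.

```elixir
defmodule AtomUsageSketch do
  # Fraction of the VM atom table currently in use (between 0.0 and 1.0).
  def usage do
    :erlang.system_info(:atom_count) / :erlang.system_info(:atom_limit)
  end

  # Returns :ok while usage is below the threshold,
  # {:warning, percent_used} once it has been reached.
  def check(threshold \\ 0.10) do
    ratio = usage()

    if ratio >= threshold do
      {:warning, Float.round(ratio * 100, 2)}
    else
      :ok
    end
  end
end

IO.inspect(AtomUsageSketch.check())
```

The limit itself is fixed when the VM starts; it can be raised with the `+t` emulator flag, for example `elixir --erl "+t 2000000"`.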

The fields are not previously defined in code; any user can create as many fields as they like at any time. That’s why converting them to atoms would be a bad idea.

This can be done for sure. I don’t think Ecto’s limitation of only being able to deal with atom keys should be the reason to go that route, though. Using a tool that is not limited like that makes more sense imo. Yes, changesets are everywhere, but I don’t think they’re particularly well suited for validating dynamically defined schemas. Not because of their validations, but because so much of the code in and around them expects fields to be atoms.
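
For a sense of what casting outside of changesets could look like, here is a small plain-Elixir sketch that keeps string keys throughout. It is not an existing library; the module `PlainCastSketch`, the three supported types, and the "is invalid" message are assumptions chosen for the example.

```elixir
defmodule PlainCastSketch do
  # Casts a string-keyed params map against a string-keyed type map.
  # Returns {:ok, casted} or {:error, errors}; both maps keep string keys.
  def cast(params, types) do
    {casted, errors} =
      Enum.reduce(types, {%{}, %{}}, fn {name, type}, {ok, errs} ->
        case Map.fetch(params, name) do
          # Fields missing from the params are simply skipped.
          :error ->
            {ok, errs}

          {:ok, raw} ->
            case cast_value(type, raw) do
              {:ok, value} -> {Map.put(ok, name, value), errs}
              :error -> {ok, Map.put(errs, name, "is invalid")}
            end
        end
      end)

    if map_size(errors) == 0, do: {:ok, casted}, else: {:error, errors}
  end

  defp cast_value(:string, v) when is_binary(v), do: {:ok, v}
  defp cast_value(:integer, v) when is_integer(v), do: {:ok, v}

  defp cast_value(:integer, v) when is_binary(v) do
    # Only accept strings that are an integer with nothing left over.
    case Integer.parse(v) do
      {int, ""} -> {:ok, int}
      _ -> :error
    end
  end

  defp cast_value(:boolean, v) when is_boolean(v), do: {:ok, v}
  defp cast_value(:boolean, "true"), do: {:ok, true}
  defp cast_value(:boolean, "false"), do: {:ok, false}
  defp cast_value(_type, _value), do: :error
end

PlainCastSketch.cast(
  %{"myIntField" => "123", "myBoolField" => "true"},
  %{"myIntField" => :integer, "myBoolField" => :boolean}
)
|> IO.inspect()
```

What this leaves out is exactly the part changesets provide for free: the integration with Phoenix forms.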

My understanding is that runtime limits are arbitrary, and if your use case requires it, it’s OK to adjust them to your usage. Things like the atom limit or ulimit in the OS are there to prevent something unexpected from happening, but you can change them if you expect it to happen.

My reasoning is that every time we add a layer of abstraction (the StringMap thing presented here) on top of another abstraction (Ecto changesets), you either need to control both abstractions or the differences between them are going to come back and bite you.

I don’t think Ecto restricts keys to atoms for any reason other than as a side effect of how the library developed. Initially all changesets required a schema; since a schema is a struct, the keys would always be atoms. Now that schemaless changesets are common, some edge cases may appear. Maybe it’s even a good case for proposing string-key support.

Even a higher arbitrary limit doesn’t help if the number of items can grow without bound. If the tradeoff would involve a lot of work on the other end, then sure, go for it. If it doesn’t, why add this footgun?

Yes. Hence my suggestion to not try to bolt things on top of changesets in the first place. Doing validation (especially as dynamically as suggested here) doesn’t really benefit from 80% of what they do (from Ecto’s end). Imo the integration of changesets with Phoenix forms is really what makes using other libraries for validation such a problem.

Pure string keys might work, but I’d imagine it would be a huge refactor. It’s not just the data that changes: errors and everything else referring to keys would no longer be able to default to atoms, keyword lists would no longer work, and a lot more. For example, I once asked if the field(binding, :field) macro on queries could support strings, and the answer was that it would be a huge change because of all the places that depend on it being an atom, and that the core team won’t do that.

Mixed keys are not allowed and never should be. They’re just bound to cause issues.

That’s really ultimately why I went with building this small layer on top of Ecto changesets instead of building something separate. Of course it’s possible to implement the Phoenix.HTML.FormData protocol for a custom data structure, but that would probably have been even more work.

The other reason for building this on top of Ecto changesets is that the data does end up in an Ecto changeset anyways, so it’s not like I wouldn’t be using Ecto otherwise.