What’s the safe way to decode a JSON string into a struct? I want to avoid calling String.to_atom. Jason.decode can give me a map with string keys, but struct() expects atom keys.
Is using Poison an option?
Quick example
defmodule Foo do
  @derive [Poison.Encoder]
  defstruct [:bar]
end
defmodule Example do
  def test do
    s = "{\"bar\":\"baz\"}"
    Poison.decode!(s, as: %Foo{})
  end
end
iex(1)> Example.test
%Foo{bar: "baz"}
There are probably more modern ways to do it, but this is what I have been using. You can also nest structs this way.
The number of JSON encoders/decoders seems to grow every day. https://package-rank.com/wp/hex/poison/-vs-/hex/jason
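# A struct is not enumerable, so go through Map.from_struct/1 and rebuild the struct afterwards
struct(Foo, Map.new(Map.from_struct(%Foo{}), fn {key, _} -> {key, json[Atom.to_string(key)]} end))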
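The same idea (iterate over the struct's own keys and look each one up in the decoded map) can be wrapped in a helper. This is a minimal sketch, assuming the JSON has already been decoded into a string-keyed map by whichever library you use; the names StructFromMap and from_map/2 are made up for illustration.

```elixir
defmodule Foo do
  defstruct [:bar, :baz]
end

defmodule StructFromMap do
  # Hypothetical helper: builds `module`'s struct from a string-keyed map.
  # Only the keys declared in defstruct are looked up, so unknown keys in
  # the input are ignored and no atoms are created from input data.
  def from_map(module, json) when is_atom(module) and is_map(json) do
    defaults = Map.from_struct(struct(module))

    fields =
      Map.new(defaults, fn {key, default} ->
        {key, Map.get(json, Atom.to_string(key), default)}
      end)

    struct(module, fields)
  end
end

json = %{"bar" => "abc", "baz" => 42, "unexpected" => true}
IO.inspect(StructFromMap.from_map(Foo, json))
# prints %Foo{bar: "abc", baz: 42}
```

Keys missing from the input keep the struct's default value, which may or may not be what you want; a stricter variant could return an error instead.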
You can also always decode manually and completely explicitly. It is more typing but it also allows you to change the struct or the parameters independently. It is not always the best way but sometimes is.
defmodule Foo do
  defstruct [:bar, :baz]
end
defmodule Example do
  def test do
    # ~s builds a binary; single quotes would create a charlist instead
    s = ~s({"bar":"abc", "baz": 42})
    json = Poison.decode!(s)
    %Foo{
      bar: json["bar"],
      baz: json["baz"]
    }
  end
end
Thanks for the suggestions. I was considering switching from Poison to Jason, hence the question. Looks like I’ll stay with Poison. I’d consider the manual decoding if this was something bigger, but in this case it’s just a very simple one-to-one mapping.
As of Jason version 1.2.0, decode/2 now supports the keys option.
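It’s worth mentioning, though, that keys: :atoms can lead to a DoS attack when the JSON data is user-controlled: every previously unseen key becomes a new atom, and atoms are never garbage collected.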
I don’t know when it was introduced, but I guess it is somewhat relevant if someone stumbles upon this: the :keys option offers :atoms!, which only converts to already known atoms, therefore mitigating the DoS risk.
that’s nice.
Say I have this typespec:
@type foo :: :bar | :baz
Is :bar an existing atom now? Yes, it is:
_a = :foo
String.to_existing_atom("foo")
String.to_existing_atom("bar")
String.to_existing_atom("non_exsisting")
* 1st argument: not an already existing atom
:erlang.binary_to_existing_atom("non_exsisting", :utf8)
(temp_atomtest 0.1.0) lib/temp_atomtest.ex:10: TempAtomtest.test/0
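Because String.to_existing_atom/1 raises for unknown names, untrusted input usually needs a wrapper that turns the exception into a return value. A minimal sketch; SafeAtom and to_known_atom/1 are hypothetical names, not part of any library.

```elixir
defmodule SafeAtom do
  # Mentioning the atoms in code is what makes them "exist" once this
  # module has been compiled and loaded.
  @known [:bar, :baz]

  def known, do: @known

  # Returns {:ok, atom} for names already in the atom table and :error
  # otherwise, instead of raising ArgumentError.
  def to_known_atom(name) when is_binary(name) do
    {:ok, String.to_existing_atom(name)}
  rescue
    ArgumentError -> :error
  end
end

IO.inspect(SafeAtom.to_known_atom("bar"))
# prints {:ok, :bar}
IO.inspect(SafeAtom.to_known_atom("surely_not_an_atom_anywhere_1234"))
# prints :error
```

Note that "existing" means existing anywhere in the running VM, not just in your module, so this protects the atom table but is not a whitelist of allowed values.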
You can also use an Ecto schema and then cast to the struct. You can also use helpers to make the casting more generic.
That means the atoms will always exist because they’ll be in the schema definition, but it requires that you know the shape of the JSON ahead of time, which may or may not work for your use case.
Just tried that and noticed that Ecto does not know an :atom type (see the Ecto.Schema docs, Ecto v3.11.1).
Also, I can’t explicitly set an :id field (:id is already set on schema), but each element in my JSON collection has an id.
Never really used Ecto, so I’m a little lost here.
But it seems like the right approach, e.g. I can load the JSON objects in a changeset and perform some extra checks that JSON Schema can’t, e.g. whether a reference is a dead link.
I think that @primary_key false would do the job. By default, every schema has an :id primary key.
If you’re casting JSON into something like the @type foo :: :bar | :baz from a previous post, a specific Ecto.Type will be safer and clearer than the nonexistent :atom field type.
Thanks guys, works like a charm.
Am I doing it right?
In my data I have static stuff (here: hobbies and jobs which can be referenced in a list or as a single atom, this works). Also I have data (here person) that has an integer ID. These objects may
- reference each other (here friends)
- reference other objects (say we’d have a pets field that references pets by [Pet.id_t()])
I see how I could first load all pets and then check in a person-changeset-validation if the referenced pets exist. But I can’t do that with persons referencing other persons, because they may not be loaded yet. So I’d need a second run, right?
defmodule Person do
  use TypedEctoSchema
  @type id_t() :: non_neg_integer()
  @primary_key false
  typed_schema "person" do
    field(:id, :integer)
    field(:name, :string, null: false)
    field(:age, :integer) :: non_neg_integer()
    field(:job, EctoAtom) :: Job.id_t()
    field(:hobbies, {:array, EctoAtom}) :: [Hobby.id_t()]
    field(:friends, {:array, :integer}) :: [Person.id_t()]
  end
end
defmodule EctoAtom do
  use Ecto.Type
  def type, do: :atom
  def cast(data), do: {:ok, String.to_atom(data)}
  def load(data), do: {:ok, String.to_atom(data)}
  def dump(atom), do: {:ok, Atom.to_string(atom)}
end
defmodule Job do
  @type id_t() :: :job_mechanic | :job_doc | :job_programmer
end
defmodule Hobby do
  @type id_t() :: :hobby_painting | :hobby_freeclimbing | :hobby_stampcollecting
end
iex> data = %{id: 1, name: "Bob", age: "18", job: "job_programmer", friends: [2, 4711], hobbies: ["hobby_freeclimbing", "hobby_painting"]}
...
iex> p =
...> Ecto.Changeset.cast(%Person{}, data, Map.keys(data))
...> |> Ecto.Changeset.apply_changes()
%Person{
  __meta__: #Ecto.Schema.Metadata<:built, "person">,
  age: 18,
  friends: [2, 4711],
  hobbies: [:hobby_freeclimbing, :hobby_painting],
  id: 1,
  job: :job_programmer,
  name: "Bob"
}
EDIT: I created a behaviour for the atom-types and I like it. I think I’ll use this.
Quick question: Why not use Ecto.Enum for the job and {:array, Ecto.Enum} for the hobbies?
I need a real atom type (which Ecto does not offer). In another context, hobbies may be used as a single atom (like job).
Conceptually, you can’t determine if a field like friends has valid values in it without a larger context than the single Person.
When persisting things to the database, the database acts as that context.
If you aren’t persisting the data, then that context is the whole group of Person structs being decoded.
That’s not exactly a “second run”, but it’s similar:
- decode each Person
- pass the whole list to a function that checks friends for internal consistency: every id refers to a person that exists, and the relation is properly symmetric (if desired)
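The check described in the second step could look roughly like this. It is only a sketch: FriendCheck, check/1 and the shape of the error tuples are invented for illustration, and it assumes each decoded person has an :id and a :friends list of ids, as in the Person schema earlier in the thread.

```elixir
defmodule FriendCheck do
  # Hypothetical helper. `people` is a list of maps or structs with an :id
  # and a :friends list of ids. Returns :ok, or {:error, problems} listing
  # every inconsistency found.
  def check(people) do
    by_id = Map.new(people, fn person -> {person.id, person} end)

    problems =
      for person <- people,
          friend_id <- person.friends || [],
          # a nil result acts as a filter and drops consistent pairs
          problem = find_problem(person.id, friend_id, by_id),
          do: problem

    if problems == [], do: :ok, else: {:error, problems}
  end

  defp find_problem(id, friend_id, by_id) do
    case by_id do
      %{^friend_id => %{friends: back}} ->
        # the friend exists; check that the friendship is mutual
        if id in (back || []), do: nil, else: {:not_mutual, id, friend_id}

      _ ->
        {:unknown_friend, id, friend_id}
    end
  end
end

people = [
  %{id: 1, friends: [2, 4711]},
  %{id: 2, friends: [1]},
  %{id: 3, friends: [1]}
]

IO.inspect(FriendCheck.check(people))
# prints {:error, [{:unknown_friend, 1, 4711}, {:not_mutual, 3, 1}]}
```

Drop the :not_mutual branch if one-way friendships are fine for your data.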
Re: EctoAtom - String.to_atom still makes people pretty worried. What about specific types for the various things:
defmodule Job do
  @type id_t() :: :job_mechanic | :job_doc | :job_programmer
  use Ecto.Type
  def type, do: :atom
  def cast("job_mechanic"), do: {:ok, :job_mechanic}
  def cast("job_doc"), do: {:ok, :job_doc}
  def cast("job_programmer"), do: {:ok, :job_programmer}
  def cast(_), do: :error
  def load("job_mechanic"), do: {:ok, :job_mechanic}
  def load("job_doc"), do: {:ok, :job_doc}
  def load("job_programmer"), do: {:ok, :job_programmer}
  def load(_), do: :error
  def dump(:job_mechanic), do: {:ok, "job_mechanic"}
  def dump(:job_doc), do: {:ok, "job_doc"}
  def dump(:job_programmer), do: {:ok, "job_programmer"}
  def dump(_), do: :error
end
Then the fields are more specific about what they contain:
field(:job, Job) :: Job.id_t()
field(:hobbies, {:array, Hobby}) :: [Hobby.id_t()]
I addressed that with the EctoAtomID-behaviour: Create a behaviour that uses Ecto.Type - #3 by Sebb.
Would be nice if I could generate the id-type:
@type id_t() :: :job_mechanic | :job_doc | :job_programmer
from the ids list in the macro also. I’ll give that a try.
I’ll try to abstract the reference-integrity-checks also. Maybe I’ll just dump everything into a database to get better checks, but this seems weird because I don’t really need that, I just want a safe way to get some related collections from json into Elixir structs and then torture that data with some pipes.
This is a great start!
If you keep what you have, I would change your EctoAtom module to use String.to_existing_atom, though, to avoid the risk of atom table exhaustion as an attack vector (especially relevant as we are dealing with JSON parsing).
As mentioned rather than having an EctoAtom type you could leverage the Ecto.Enum type like so:
field(:job, Ecto.Enum, values: [:job_mechanic, :job_doc, :job_programmer])
field(:hobbies, {:array, Ecto.Enum},
  values: [
    :hobby_painting,
    :hobby_freeclimbing,
    :hobby_stampcollecting
  ]
)
Possibly even calling out to a canonical list somewhere if you want a Hobby module:
field(:hobbies, {:array, Ecto.Enum}, values: Hobby.values)
This will successfully cast to atom values if given a string that matches, but the cast will fail if the value is not a string or atom in the given values list. It also means you don’t risk the whole “atom exhaustion” thing, because the atoms are declared as the enum values and String.to_existing_atom is used internally.
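If you want the same whitelist behaviour outside of an Ecto schema, the idea fits in a few lines of plain Elixir. This is only an illustration of the technique, not how Ecto.Enum is implemented; the contents of the Hobby module below (values/0, cast/1) are assumptions made up for this sketch.

```elixir
defmodule Hobby do
  @values [:hobby_painting, :hobby_freeclimbing, :hobby_stampcollecting]

  # Built once at compile time:
  # %{"hobby_painting" => :hobby_painting, ...}
  @by_name Map.new(@values, fn value -> {Atom.to_string(value), value} end)

  # Canonical list, usable as `values: Hobby.values()` in a schema.
  def values, do: @values

  # Accepts an allowed atom or its string form; everything else is :error.
  # Input strings are only looked up in a map, never converted to atoms.
  def cast(value) when is_atom(value) and value in @values, do: {:ok, value}
  def cast(value) when is_binary(value), do: Map.fetch(@by_name, value)
  def cast(_other), do: :error
end

IO.inspect(Hobby.cast("hobby_painting"))
# prints {:ok, :hobby_painting}
IO.inspect(Hobby.cast("hobby_hacking"))
# prints :error
```

Unlike a String.to_existing_atom wrapper, this rejects atoms that exist elsewhere in the VM but are not valid hobbies.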
Finally you might look at ecto_morph as sugar around casting too.
I didn’t know about values, maybe I really should RTFM.
EDIT: OK I did RTFM, but I can’t find the values option for field