JSON extensions such as NaN handling

json

#1

I recently submitted a proposal for Jason to add an option that allows decoding and encoding of NaN (not a number) and ±infinity. This would allow decoding strings such as

{ "v1": NaN, "v2": Infinity }

So far the discussion has mainly been between @OvermindDL1 and myself, and I would like to get more feedback on this.

Here is the gist of it.

Fact: The JSON spec doesn’t support it. Furthermore, it explicitly prohibits it.

Numeric values that cannot be represented in the grammar below (such as Infinity and NaN) are not permitted. (RFC 7159)

Fact: Quite a few parsers out there have the option to deal with NaNs. Many of these even have this option enabled by default (which leads to quite a bit of “JSON” out there which is not spec compliant):

  • Python enables NaNs by default and provides an option to be spec compliant

    >>> import json
    >>> data = [float('nan'), float('inf'), float('-inf')]
    >>> data
    [nan, inf, -inf]
    >>> json.dumps(data)
    '[NaN, Infinity, -Infinity]'
    
  • GSON (JVM) provides an option to deal with NaNs but is spec compliant by default

  • Julia provides an option to deal with NaNs but is spec compliant by default

  • OCaml Yojson enables NaNs by default and offers a flag to be spec compliant
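To make the Python bullet above concrete, here is a minimal sketch using only the standard library `json` module. It shows the lenient default on both encoding and decoding, and the `allow_nan=False` option on the encoder. The variable names are only for illustration.

```python
import json
import math

data = [float("nan"), float("inf"), float("-inf")]

# Default: non-finite floats are written as bare tokens, which is not valid JSON.
lenient = json.dumps(data)

# Strict encoding: allow_nan=False makes the encoder raise instead.
try:
    json.dumps(data, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

# Decoding accepts the same non-standard tokens by default.
decoded = json.loads('{ "v1": NaN, "v2": Infinity }')

print(lenient)
print(type(strict_error).__name__)
print(math.isnan(decoded["v1"]), decoded["v2"] == math.inf)
```

Note that `allow_nan` only affects encoding. On the decoding side, strictness has to be enforced through the `parse_constant` hook of `json.loads`.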

Fact: Various parsers do not allow NaNs. These include:

  • JavaScript
  • Go
  • Rust
  • Currently all Elixir JSON libs

Question: What is an appropriate choice for a JSON parser?

  1. Be spec compliant and do not provide options for deviating from that.
  2. Allow NaN parsing on the basis that JSON in the wild may not be spec compliant, and the only way to properly deal with such strings is a JSON parser that accepts them.

Looking forward to feedback.


#2

What would be the expected behaviour in Elixir when parsing a JSON document with NaNs? The BEAM supports neither NaN nor infinities, so I would expect to get back an error saying the input can’t be decoded.

But if we’re talking about an extension to the parser, then it should probably define its own representation for such invalid values, e.g. parse NaNs into :NaN, and user code will have to depend on this particular representation. Moreover, I assume there’s no universal representation for encoding such values across different languages (they’re not part of the spec and there’s no separate spec for handling NaNs and infinities).

These observations lead me to the conclusion that it won’t be sufficient to add an option to a JSON library to enable support for non-conformant values because there may be different representations for such values in the wild.

A proper solution from a JSON library perspective would be to allow the user to hook into the parser and extend it in the desired way. So I see it as an extension mechanism rather than a single configuration option.
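As an illustration of what such a hook can look like, Python's standard `json.loads` accepts a `parse_constant` callback that is invoked with the token `'NaN'`, `'Infinity'` or `'-Infinity'`. The sketch below is only an analogy in Python, not a proposed API for Jason or any Elixir library. The helper names `to_tagged` and `reject`, and the string representations they return, are hypothetical choices.

```python
import json

# Hypothetical user-chosen representation for the non-standard tokens.
TAGS = {"NaN": ":NaN", "Infinity": ":Infinity", "-Infinity": ":NegInfinity"}

def to_tagged(token):
    # Called by the decoder for each NaN / Infinity / -Infinity token.
    return TAGS[token]

def reject(token):
    # Alternative policy: stay spec compliant and refuse the document.
    raise ValueError("non-standard JSON constant: " + token)

doc = '{ "v1": NaN, "v2": Infinity, "v3": -Infinity, "v4": 1.5 }'

# Policy 1: map the tokens to a custom representation.
custom = json.loads(doc, parse_constant=to_tagged)

# Policy 2: treat the tokens as a decoding error.
try:
    json.loads(doc, parse_constant=reject)
    rejected = False
except ValueError:
    rejected = True

print(custom)
print(rejected)
```

The same entry point covers both policies: the caller decides whether the tokens become a custom value or an error, so the library does not have to pick one representation for everyone.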