JSON serialization performance benchmarks

aseigo · December 19, 2017, 2:00pm

Our application does a fair amount of work with JSON documents that range from 500KB to 200MB in size. So the performance of the JSON serializer matters here, more so than in most web applications passing small bits of JSON around an API exposed over HTTP. So I sat down and measured some (most?) of the JSON libraries available.

The test consisted of taking a smallish file, 1.5MB in size, with a bit fewer than 2600 JSON objects in it, each of which has a dozen or so fields, including some further objects. The data in this file was then parsed a number of times (6 by default) in serial and then again in parallel by Poison, Jiffy, JSON, JSX, and ExJSX.

Here is an Elixir module containing the benchmark, and to use it you will need the following in your mix.exs file:

  {:jiffy, "~> 0.14.11"},
  {:json, "~> 1.0"},
  {:jsx, "~> 2.8"},
  {:exjsx, "~> 4.0"},
  {:benchee, "~> 0.9.0"},
  {:benchee_html, "~> 0.3"},
  {:poison, "~> 3.1"}

You can see typical results here. I have run this benchmark several times now and the results are rather consistent. It is impressive how much faster Jiffy is, which can be explained by it being implemented as a NIF, but also how well Poison does given it is written in pure Elixir.

Note that the parallel and sequential versions are not directly comparable with each other, as the parallel runs carry extra overhead in the form of a fairly naive message passing mechanism to wait on the returns. As that overhead is the same across all parallel runs, though, the parallel results are comparable with one another. It is interesting to see how the different libraries experience different amounts of speed-up, ranging from ~3.5x to ~4.5x, with Jiffy getting the least benefit, again probably due to being a NIF.
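For readers who want a picture of what "naive message passing to wait on the returns" can look like, here is a minimal sketch. It is not the benchmark module linked above: the module name `ParallelRunSketch` and its functions are made up for illustration, and `fun` stands in for whatever decode call is being measured.

```elixir
defmodule ParallelRunSketch do
  # Hypothetical helper, not the benchmark module from the thread.

  # Runs `fun` `times` times, one call after another.
  def serial(fun, times) do
    for _ <- 1..times, do: fun.()
  end

  # Spawns one process per run, then waits for each result message in turn.
  # The send/receive round trip is the extra overhead the parallel runs pay.
  def parallel(fun, times) do
    parent = self()

    refs =
      for _ <- 1..times do
        ref = make_ref()
        spawn_link(fn -> send(parent, {ref, fun.()}) end)
        ref
      end

    for ref <- refs do
      receive do
        {^ref, result} -> result
      end
    end
  end

  # Wall-clock time of `fun` in milliseconds, together with its result.
  def time_ms(fun) do
    {micros, result} = :timer.tc(fun)
    {micros / 1000, result}
  end
end
```

Something like `ParallelRunSketch.time_ms(fn -> ParallelRunSketch.parallel(decode, 6) end)` would then time six concurrent runs, where `decode` is a zero-arity function wrapping the library call under test.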

Thought it might be of interest to others. I’ll be looking at memory consumption next, as well as running with larger files …

Cheers!

gdub01 · December 19, 2017, 2:16pm

@michalmuskala has written https://github.com/michalmuskala/antidote and it would be interesting to see it compared.

NobbZ · December 19, 2017, 2:42pm

It seems as if you have missed tiny as well.

aseigo · December 19, 2017, 2:51pm

Will add both and update …

Updated

Antidote is really, really impressive… Tiny is a small step up over Poison, but Antidote really does keep up with Jiffy, and catches up as it moves into parallel territory.

… now if Antidote gets a release and replaces Poison … (appropriate naming and all) … given they have the exact same public API, would be amazing to just see Poison literally replaced with Antidote under the name Poison so everyone gets the upgrade for ‘free’. Perhaps that’s even the plan?

michalmuskala · December 19, 2017, 3:41pm

That’s exactly the plan - to follow the poison interface, but exclude some features that make the implementation slow (particularly pretty printing). The plan is also to later have a _native package with implementation in C/C++/Rust - by just adding it as a dependency, all calls (in your application and dependencies) would use the native implementation.

One last thing to do before the release is naming. I was advised to avoid the name antidote (to avoid looking like hostility towards poison and also because there’s already a project called antidote within the broad BEAM ecosystem). The candidates right now are xson and xjson (with the first one winning for me).

If everything goes right, the package will be released as xson before Christmas. I planned to release today, but I found some bugs in the unicode escaping thanks to property testing (they are already fixed).

Also, here are my latest benchmarks. They show my implementation outperforming jiffy when used with HiPE - unfortunately, there are some downsides to running it, so it’s kind of cheating.

mischov · December 19, 2017, 4:24pm

Oh yeah, I’ve heard of xson! That’s json in xml, right?!?!

dom · December 20, 2017, 12:47am

Would it be possible to allow users to choose HiPE on/off by gating the @compile directive with config?

(I think ERL_COMPILER_OPTIONS="[native,{hipe, [o3]}]" mix deps.compile antidote should also work, although it’s a bit different since all modules will be hipe’d)

aseigo · December 20, 2017, 9:11am

First: really nice to hear your plans. They sound really good, and hearing that you are dropping some features like pretty print means it can’t / won’t be a rename-and-succeed-Poison. Hopefully projects will move to the new library as soon as possible though, given the speed difference.

And on that note! Here we are parsing a real-world 349MB JSON document, using the same regime as before: 6 times in serial, then 6 parallel runs. Again, antidote does really well, and I wonder if a native implementation is really needed?

I also looked at how fast Python is at this and it is about the same speed as Poison.

michalmuskala · December 20, 2017, 9:46am

I’m not opposed to having a pretty: true option. What I mean, though, is that it has to be implemented outside the main generator - the easiest solution probably would be with a separate “prettify” pass after the json is already generated (this way, it doesn’t matter how custom encoder implementations are done). This will be slow, but you don’t use pretty-printing when you need top speed anyway (at least that’s my impression). The implementation should be also pretty straightforward - a simple state machine counting the numbers of {}[]" characters it saw in the stream.
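To make the "separate prettify pass" idea concrete, here is a rough sketch of such a state machine. It is not code from antidote or poison: the module name `PrettifySketch` is invented, and it assumes the input is already valid, compact JSON (as an encoder would emit). It tracks only two pieces of state, the nesting depth and whether it is currently inside a string.

```elixir
defmodule PrettifySketch do
  # Hypothetical post-processing pass; assumes valid, compact JSON input.
  @indent "  "

  def prettify(json) when is_binary(json) do
    json |> walk(0, false, []) |> IO.iodata_to_binary()
  end

  # End of input: the accumulator was built in reverse.
  defp walk(<<>>, _depth, _in_string, acc), do: Enum.reverse(acc)

  # Inside a string, an escape sequence is copied through untouched,
  # so an escaped quote never toggles the string state.
  defp walk(<<?\\, c, rest::binary>>, depth, true, acc),
    do: walk(rest, depth, true, [[?\\, c] | acc])

  # An unescaped quote enters or leaves a string.
  defp walk(<<?\", rest::binary>>, depth, in_string, acc),
    do: walk(rest, depth, not in_string, [?\" | acc])

  # Any other byte inside a string is copied as-is.
  defp walk(<<c, rest::binary>>, depth, true, acc),
    do: walk(rest, depth, true, [c | acc])

  # Empty containers stay on one line.
  defp walk(<<"{}", rest::binary>>, depth, false, acc),
    do: walk(rest, depth, false, ["{}" | acc])

  defp walk(<<"[]", rest::binary>>, depth, false, acc),
    do: walk(rest, depth, false, ["[]" | acc])

  # Opening a container: go one level deeper and start a new line.
  defp walk(<<c, rest::binary>>, depth, false, acc) when c == ?{ or c == ?[,
    do: walk(rest, depth + 1, false, [[c, newline(depth + 1)] | acc])

  # Closing a container: new line at the outer level, then the bracket.
  defp walk(<<c, rest::binary>>, depth, false, acc) when c == ?} or c == ?],
    do: walk(rest, depth - 1, false, [[newline(depth - 1), c] | acc])

  defp walk(<<?,, rest::binary>>, depth, false, acc),
    do: walk(rest, depth, false, [[?,, newline(depth)] | acc])

  defp walk(<<?:, rest::binary>>, depth, false, acc),
    do: walk(rest, depth, false, [": " | acc])

  # Whitespace outside strings is dropped.
  defp walk(<<c, rest::binary>>, depth, false, acc) when c in [?\s, ?\t, ?\n, ?\r],
    do: walk(rest, depth, false, acc)

  # Numbers, true, false and null pass through byte by byte.
  defp walk(<<c, rest::binary>>, depth, false, acc),
    do: walk(rest, depth, false, [c | acc])

  defp newline(depth), do: [?\n, String.duplicate(@indent, depth)]
end
```

Because it only looks at the generated bytes, a pass like this works no matter how custom encoder implementations produce their output, at the cost of walking the whole document a second time.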

Yeah, I’m not sure I will actually get around to doing that. I have yet to see what would be possible by comparing with some of the fastest native implementations - rapidjson in C++ and serde in Rust.

Yeah, I think something like config :xjson, compile_native: true should be fine (and that’s also what poison does).
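As an illustration of gating the @compile directive on config, here is a small sketch. The application name `:my_json`, the module `HipeGateSketch` and the function in it are placeholders, not the real library; `Application.compile_env/3` needs Elixir 1.10 or later (older versions would read the flag with `Application.get_env/3` instead). HiPE itself was removed in OTP 24, so the flag only makes sense on older releases, and it defaults to off here.

```elixir
defmodule HipeGateSketch do
  # Placeholder module; :my_json is an assumed application name.
  # The flag is read once, when the module is compiled.
  @compile_native Application.compile_env(:my_json, :compile_native, false)

  if @compile_native do
    # Only reached when the user opts in via config.
    @compile [:native, {:hipe, [:o3]}]
  end

  # Reports whether native compilation was requested at compile time.
  def native?, do: @compile_native

  # Stand-in for the hot parsing code that would benefit from HiPE.
  def decode_digit(<<c>>) when c in ?0..?9, do: c - ?0
end
```

A user would opt in with `config :my_json, compile_native: true` in their config and recompile the dependency, since the value is baked in at compile time.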

aseigo · December 20, 2017, 10:12am

Yep, it’s for debugging, non-production logging and similar, and if you have GIANT amounts of data where that slowdown really will shine through, you are probably going to forgo it anyway …

rapidjson handles that same 349MB JSON file in ~1.8s (though that’s just the parsing, not actually creating any data structures with it) with warm file caches. There would be the usual NIF overhead, but still probably quite a bit faster … shrug For us, for now at least, it looks like Antidote will be enough™

Cheers!