JSON Library Benchmarks

sunny-g · June 13, 2019, 2:50am

In this same vein, I recently published serde_rustler, which tl;dr lets you serialize and deserialize Elixir terms within your Rust NIFs. To benchmark it, I transcoded from Elixir term -> serde -> JSON string and vice versa, and compared it against the other JSON libraries I could find.

The results look, let’s say, promising. I’d appreciate it if anyone could tell me whether I did something wrong to get these results: encode, decode.

@michalmuskala can you explain why the parallel option would distort the results, when in production you’d be running your app with load from other parts of your code?

michalmuskala · June 13, 2019, 7:43am

Since serde_rustler is a NIF that doesn’t yield, it should be running on dirty schedulers. After a quick glance at the code, I think it doesn’t do that right now. In a production environment, NIFs that don’t yield can wreak havoc on the runtime. BEAM relies on the fact that scheduler threads checkpoint frequently with some VM services. Arranging NIF code so that it can yield properly usually has a noticeable runtime cost, and that’s probably the main source of the speed-up over jiffy for your code. Running on dirty schedulers, on the other hand, has the context switch overhead, but more importantly has the caveats of running on a thread pool: if you exhaust the thread pool, the operations become blocking in practice.

What I meant about the parallel option is that the variations in the results are going to be much higher - that’s expected and has various sources - the VM overhead, other processes in the OS, etc.

sunny-g · June 13, 2019, 2:56pm

I believe I am using the dirty scheduler for these functions, which I believe is why they run faster on larger inputs (Issue 90 and GovTrack), but I think I see your point - jiffy might be/is “pre-emptively” handling yielding (rather than what my first approach would be, which is to switch between the two functions depending on the size/complexity of the input).

And that’s good to know about the parallel option. I ought to add to or replace the existing benchmarks with non-parallel versions and publish the results. Thanks!

michalmuskala · June 13, 2019, 7:27pm

Ah, I must have missed it. It would be interesting to compare the dirty implementation, jiffy, and the Elixir implementations in a situation where you have more work than threads, e.g. with the parallel option set higher than the number of cores.
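
The "more work than threads" situation can be sketched outside the BEAM with an ordinary fixed-size thread pool. This is only a Python analogy, not how dirty schedulers are implemented: `POOL_SIZE`, `JOBS`, and `blocking_job` are made-up names, and `time.sleep` stands in for a long-running NIF call. The sketch records how many jobs run at once, to show that anything beyond the pool size has to wait in a queue.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2   # hypothetical stand-in for the number of dirty scheduler threads
JOBS = 6        # deliberately more work than threads

lock = threading.Lock()
running = 0     # jobs executing right now
peak = 0        # highest concurrency observed


def blocking_job(i):
    """Simulate a non-yielding call that occupies a pool thread until it is done."""
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.05)  # placeholder for the real encode/decode work
    with lock:
        running -= 1
    return i


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    results = list(pool.map(blocking_job, range(JOBS)))
elapsed = time.perf_counter() - start

print(f"peak concurrency: {peak} (pool size {POOL_SIZE})")
print(f"wall time: {elapsed:.2f}s for {JOBS} jobs")
```

Raising `JOBS` while keeping `POOL_SIZE` fixed makes the wall time grow roughly linearly, which is the queueing effect a benchmark with parallelism above the core count would be expected to expose.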

dch · June 24, 2019, 5:58am

I’m always interested that people only measure speed when benchmarking. Memory consumption throughout (de)marshalling is also significant. You will find that the NIF-based approaches win hands down here.

wrt @michalmuskala’s comments around jiffy & scheduler usage, this is worth understanding in detail. jiffy has had support for handling reductions for a while, but you may need to tune bytes_per_red to get the best performance, and see if using dirty schedulers makes a difference as well.
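
The trade-off behind a setting like bytes_per_red can be illustrated with a toy scanner that hands control back after every slice of input. This is a Python analogy and not jiffy's implementation: `scan_in_slices`, `drive`, and `bytes_per_slice` are hypothetical names, and the "work" is just counting JSON structural characters. Smaller slices mean more frequent checkpoints, at the cost of more bookkeeping per byte.

```python
STRUCTURAL = b"{}[],:"


def scan_in_slices(data: bytes, bytes_per_slice: int):
    """Count JSON structural bytes, yielding control after each slice."""
    total = 0
    for start in range(0, len(data), bytes_per_slice):
        chunk = data[start:start + bytes_per_slice]
        total += sum(chunk.count(b) for b in STRUCTURAL)
        yield  # checkpoint: a scheduler could run something else here
    return total


def drive(gen):
    """Run a generator to completion, counting how often it yielded."""
    yields = 0
    while True:
        try:
            next(gen)
            yields += 1
        except StopIteration as stop:
            return stop.value, yields


doc = b'{"a":[1,2,3]}'
for size in (4, 64):
    total, yields = drive(scan_in_slices(doc, size))
    print(f"slice={size:>2} bytes -> {yields} yields, {total} structural bytes")
```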

coby · June 24, 2019, 6:11am

I did add memory consumption to my benchmarks when I was testing, that was actually one of the big things that I was missing from a lot of other benchmarks as well. However, I didn’t tune anything in the bytes_per_red for jiffy, so I’m not sure if there are ways to make it even more efficient…
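
As a rough illustration of measuring memory alongside time, here is a minimal Python sketch using the standard library's `tracemalloc` and `json` modules. It is not the benchmark setup used in this thread: the `measure` helper and the payload are made up for illustration, and it only shows the idea of reporting peak allocation next to elapsed time.

```python
import json
import time
import tracemalloc


def measure(label, fn, arg):
    """Return (result, seconds, peak_bytes) for a single call of fn(arg)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(arg)
    seconds = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"{label:<7} {seconds * 1e3:8.2f} ms   peak {peak_bytes / 1024:8.1f} KiB")
    return result, seconds, peak_bytes


# Made-up payload; substitute a representative document from your own system.
payload = [{"id": i, "name": f"user-{i}", "tags": ["a", "b", "c"]} for i in range(5_000)]

encoded, _, encode_peak = measure("encode", json.dumps, payload)
decoded, _, decode_peak = measure("decode", json.loads, encoded)
```

Tracing allocations slows the code down, so timings taken this way are inflated; collecting time and memory in separate runs gives cleaner numbers.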

coby · June 24, 2019, 6:13am

We’ve been looking at using protobuf/grpc to try and avoid JSON APIs, but since JSON is a simple standard to easily get up and running with, that’s always where we tend to go.

keathley · June 24, 2019, 12:56pm

Yeah, we have a similar situation. Protobuf is fine, but gRPC adds a ton of additional complexity and operational costs. If the main goal is to have a more formal API definition and you don’t need the streaming and multiplexing from gRPC, then I would look at https://github.com/twitchtv/twirp. I’m exploring an Elixir implementation for work because it allows an easy on-ramp into our existing ecosystems and solves the main pain points we have.

coby · June 24, 2019, 1:19pm

I took a look at Twirp when Twitch put it out a bit ago, I remember being impressed, although I hadn’t started playing with Golang yet so I didn’t mess around with it at all. I’d love to see an implementation in Elixir for that.

What do you think about Apache Thrift? I know Pinterest uses it extensively, they’ve written a couple of Elixir libraries for it as well:

keathley · June 24, 2019, 2:09pm

I haven’t used thrift in anger. The general feeling I get when talking to people about it is that it doesn’t feel very well maintained and it’s less feature-rich than gRPC, but that’s all hearsay. If I was going to take on the operational burden of an RPC layer like that then I would probably just spend the time, albeit enormous, to get gRPC working. It seems to me like that’s where most people are moving.

But for us synchronous api calls over http 1.1 have worked OK thus far. We’re at pretty high scale and we’re still not at the scale where we need to be highly optimizing our transport layers. Our primary technical bottleneck is the encoding and decoding time and json is the slowest in that arena. The much larger problem is the ability to scale our teams, our knowledge, and the duplication of effort required to maintain a bunch of different clients. Something like twirp helps to alleviate those pain points and is easier to add to our existing stack because it works over http 1.1 and retains the existing semantics for sending requests. You could probably achieve the same thing with json schema or swagger. But my experience with those tools has been poor and I’m not sold on them as a sustainable solution.

dimitarvp · June 24, 2019, 8:14pm

Have you guys considered Google’s FlatBuffers? I’ve been eyeing it for a while now but haven’t had the time to properly try and benchmark the Elixir implementation.

OvermindDL1 · June 24, 2019, 8:36pm

Huh, that’s weird that it only handles the dynamic flatbuffers when flatbuffers is all about the static schema (which would be a great fit for macros to generate the module!).

keathley · June 24, 2019, 8:55pm

I haven’t personally. I typically go from json to msgpack if I just wanna get an encoding/decoding performance boost and smaller payloads. If I need schemas (which is rare) then I use protobuf or avro. I’ve been aware of flatbuffers but they’ve never really fit a use case or problem I had. My default is to use msgpack whenever possible as it’s widely supported and is a “free” upgrade over json for 98% of use cases.

OvermindDL1 · June 24, 2019, 9:09pm

The benefit of flatbuffers is that there are no encoding/decoding steps in most languages (and in fact barely one on the BEAM), and for languages that support it the data can be read directly with no translation at all. In addition, the schema means that useless data like named fields doesn’t need to be specified, as everything becomes positional and thus more tightly packed. Overall it is significantly faster than protobufs or msgpack (at least in C++, where I use flatbuffers). I’m curious how that would translate to the BEAM when using a well-made library for it.
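
The "positional, read in place" idea can be sketched with Python's `struct` module. This is not the FlatBuffers wire format, which has its own offset tables: the four-field record layout, `RECORD`, and `HP_OFFSET` are assumptions made up for illustration. It only shows that a fixed schema removes field names from the payload and lets one field be read at a known offset without decoding the rest.

```python
import json
import struct

# Hypothetical fixed schema: id (uint32), x (float32), y (float32), hp (uint16),
# little-endian with no padding. Field positions are implied by the schema.
RECORD = struct.Struct("<IffH")
HP_OFFSET = 4 + 4 + 4  # bytes that precede the hp field

record = {"id": 42, "x": 1.5, "y": -2.0, "hp": 100}

as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")
as_binary = RECORD.pack(record["id"], record["x"], record["y"], record["hp"])

# Read one field straight out of the buffer without decoding the others.
(hp,) = struct.unpack_from("<H", as_binary, HP_OFFSET)

print(len(as_json), "bytes as JSON")
print(len(as_binary), "bytes as a positional record")
print("hp read in place:", hp)
```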

keathley · June 24, 2019, 9:17pm

Sure, I get how they work and why they’re useful. But for all practical purposes msgpack tends to be fast enough for a lot of my use cases and much more widely supported. If you’re going to use json you can almost certainly drop in msgpack instead and it’s an overall win. I also don’t need to squeeze every last microsecond out of my response times. At some point there’s diminishing returns :).

OvermindDL1 · June 24, 2019, 9:25pm

Oh I don’t use it for web stuff, I use it mostly for network transmission of packets that I need to both be as fast and tiny as possible, not just on the transmission time but on actually (de)serializing the packet. ^.^

HTTP overhead will swamp any real gain from it anyway. However, not everyone uses Elixir for just http stuff (I use it for a lot of non-http stuff).

dimitarvp · June 24, 2019, 10:35pm

Do you have any benchmarks and/or observations on how much quicker msgpack is compared to JSON in your Elixir projects?

keathley · June 24, 2019, 10:52pm

With our payloads encoding and decoding with msgpack is about 2x-3x faster than json and there’s about a 15% reduction in payload size.
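
For anyone who wants to check the size difference on their own data, here is a minimal Python sketch. The `pack` function is a hand-written encoder for a small subset of MessagePack (nil, booleans, integers, 64-bit floats, strings, arrays, maps), written for illustration only; a real project would use an established MessagePack library. The payload is made up, and the saving depends heavily on the shape of the data, so the output says nothing about the figures quoted above.

```python
import json
import struct

# (tag, struct format, min, max), ordered so the first match is the smallest encoding.
INT_FORMATS = [
    (b"\xcc", ">B", 0, 0xFF),
    (b"\xcd", ">H", 0, 0xFFFF),
    (b"\xce", ">I", 0, 0xFFFFFFFF),
    (b"\xcf", ">Q", 0, 2**64 - 1),
    (b"\xd0", ">b", -(2**7), -1),
    (b"\xd1", ">h", -(2**15), -1),
    (b"\xd2", ">i", -(2**31), -1),
    (b"\xd3", ">q", -(2**63), -1),
]


def _length_header(n, fix_base, fix_max, tag16, tag32):
    """Header for a str/array/map of length n (fix, 16-bit, or 32-bit form)."""
    if n <= fix_max:
        return bytes([fix_base | n])
    if n <= 0xFFFF:
        return tag16 + struct.pack(">H", n)
    return tag32 + struct.pack(">I", n)


def pack(obj) -> bytes:
    """Encode None, bool, int, float, str, list, and dict as MessagePack."""
    if obj is None:
        return b"\xc0"
    if obj is True:
        return b"\xc3"
    if obj is False:
        return b"\xc2"
    if isinstance(obj, int):
        if -32 <= obj <= 0x7F:
            return struct.pack("b", obj)  # positive or negative fixint
        for tag, fmt, lo, hi in INT_FORMATS:
            if lo <= obj <= hi:
                return tag + struct.pack(fmt, obj)
        raise OverflowError("integer out of MessagePack range")
    if isinstance(obj, float):
        return b"\xcb" + struct.pack(">d", obj)  # always float 64
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        if 31 < len(raw) <= 0xFF:
            return b"\xd9" + bytes([len(raw)]) + raw  # str 8
        return _length_header(len(raw), 0xA0, 31, b"\xda", b"\xdb") + raw
    if isinstance(obj, list):
        head = _length_header(len(obj), 0x90, 15, b"\xdc", b"\xdd")
        return head + b"".join(pack(item) for item in obj)
    if isinstance(obj, dict):
        head = _length_header(len(obj), 0x80, 15, b"\xde", b"\xdf")
        return head + b"".join(pack(k) + pack(v) for k, v in obj.items())
    raise TypeError(f"unsupported type: {type(obj).__name__}")


# Made-up payload; replace with something representative of your own traffic.
payload = {
    "id": 90210,
    "user": {"name": "example", "active": True, "score": 12.5},
    "tags": ["alpha", "beta", "gamma"],
    "parent": None,
}

as_json = json.dumps(payload, separators=(",", ":")).encode("utf-8")
as_msgpack = pack(payload)
saving = 100 * (1 - len(as_msgpack) / len(as_json))
print(f"json: {len(as_json)} bytes, msgpack: {len(as_msgpack)} bytes, saving: {saving:.0f}%")
```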

dimitarvp · June 24, 2019, 11:10pm

Do you have a set of anonymised payloads we could make a GitHub repo to benchmark with? I’d contribute a FlatBuffers section to the initial JSON / msgpack benchmarks.

keathley · June 24, 2019, 11:23pm

I don’t, but I can try to put something together. I suspect our payloads aren’t that unusual though, so you could probably use any standard payloads for benchmarking.