Benchee and BencheeCSV - easy and extensible (micro) benchmarking

#1

Hi all,

I just built and released my first hex.pm package. It’s a (micro) benchmarking tool somewhat inspired by Ruby’s benchmark-ips. Its goal is to be easy to use, produce nice output, be extensible through plugins, and provide you with statistics so you can better see how reliable your results are. So far the statistics provided are average, iterations per second, standard deviation and median.
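
To give a feel for those four numbers, here is a minimal sketch of how they could be computed from a list of run times in microseconds. `StatsSketch` is a made-up module for illustration, not benchee's actual implementation, and the use of the sample standard deviation (dividing by n - 1) is an assumption.

```elixir
defmodule StatsSketch do
  # Hypothetical helper, not part of benchee.
  # Expects a non-empty list of run times in microseconds.
  def summarize(run_times) when run_times != [] do
    count = length(run_times)
    average = Enum.sum(run_times) / count

    # Sample variance (divides by count - 1); 0.0 when there is only one sample.
    variance =
      if count > 1 do
        run_times
        |> Enum.map(fn time -> (time - average) * (time - average) end)
        |> Enum.sum()
        |> Kernel./(count - 1)
      else
        0.0
      end

    %{
      average: average,
      # One second is 1_000_000 microseconds.
      ips: 1_000_000 / average,
      std_dev: :math.sqrt(variance),
      median: median(run_times)
    }
  end

  defp median(run_times) do
    sorted = Enum.sort(run_times)
    count = length(sorted)
    middle = div(count, 2)

    if rem(count, 2) == 1 do
      Enum.at(sorted, middle)
    else
      (Enum.at(sorted, middle - 1) + Enum.at(sorted, middle)) / 2
    end
  end
end

stats = StatsSketch.summarize([100, 200, 300, 400])
IO.inspect(stats)
```

A large standard deviation relative to the average is the hint that the measurements are noisy and the comparison should be taken with a grain of salt.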

It also comes with its first plugin, BencheeCSV, which formats output as CSV for easy use with spreadsheet tools so you can make pretty graphs etc.

Announcement blog post:

benchee

benchee_csv

Thanks, enjoy benchmarking and feedback welcome :slight_smile:

Tobi


#2

Hi everyone,

I didn’t want to “spam” the forum with release announcements after my initial post about benchee back in June, but I figured with all the changes and new features now might be a good time to write something again! If it’s too much, please tell me :slight_smile:

I just released new versions of my benchmarking library benchee along with benchee_csv, and I’m also introducing the new formatters benchee_json and, finally, benchee_html to create nice HTML reports with 4 different graphs that can also be exported as PNG images! And of course, there is also a blog post about all of it.

benchee has come a long way and I’m particularly excited about it supporting running your suite with different inputs, as different implementations may behave differently depending on input size or structure. I also changed the API of the main interface after a short but good discussion in this very forum. And I gotta say it looks way more Elixir now :slight_smile:

alias Benchee.Formatters.{Console, HTML}
map_fun = fn(i) -> i + 1 end
inputs = %{
  "Small (10 Thousand)"    => Enum.to_list(1..10_000),
  "Middle (100 Thousand)" => Enum.to_list(1..100_000),
  "Big (1 Million)"       => Enum.to_list(1..1_000_000),
}

Benchee.run %{
  "tail-recursive" =>
    fn(list) -> MyMap.map_tco(list, map_fun) end,
  "stdlib map" =>
    fn(list) -> Enum.map(list, map_fun) end,
  "body-recursive" =>
    fn(list) -> MyMap.map_body(list, map_fun) end,
  "tail-rec arg-order" =>
    fn(list) -> MyMap.map_tco_arg_order(list, map_fun) end
}, time: 10, warmup: 10, inputs: inputs,
   formatters: [&Console.output/1, &HTML.output/1],
   html: [file: "bench/output/tco_detailed.html"]
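
The `MyMap` module is not shown in the post. As a rough sketch of what the benchmarked functions might look like (the implementations, and the reading of "arg-order" as "accumulator passed first", are assumptions, not the author's actual code):

```elixir
defmodule MyMap do
  # Assumed implementations; the post does not include the real module.

  # Body-recursive: the list cell is built after the recursive call returns.
  def map_body([], _fun), do: []
  def map_body([head | tail], fun), do: [fun.(head) | map_body(tail, fun)]

  # Tail-recursive: results accumulate in reverse and are flipped at the end.
  def map_tco(list, fun), do: do_map_tco(list, fun, [])

  defp do_map_tco([], _fun, acc), do: Enum.reverse(acc)
  defp do_map_tco([head | tail], fun, acc), do: do_map_tco(tail, fun, [fun.(head) | acc])

  # Same algorithm as map_tco, but the helper takes the accumulator first.
  def map_tco_arg_order(list, fun), do: do_map_tco_arg_order([], list, fun)

  defp do_map_tco_arg_order(acc, [], _fun), do: Enum.reverse(acc)

  defp do_map_tco_arg_order(acc, [head | tail], fun) do
    do_map_tco_arg_order([fun.(head) | acc], tail, fun)
  end
end

IO.inspect(MyMap.map_tco([1, 2, 3], fn i -> i + 1 end))
```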

This then produces output through the HTML formatter, as you can see in this example report or in this preview image:

So yeah, I hope you like it. It would be great to hear what you like or, even better, what you are missing, what you don’t like, or any bugs you find, so that I can improve and extend benchee and its associated libraries :slight_smile:

Thanks!
Tobi


#3

Ooo, awesome! I love benchee and it just keeps getting better! ^.^


#4

It looks great!

I’m very interested in learning about benchmarking, since AFAIK it is a subject with a lot of depth to it. Is there anything in particular you need help with in terms of contributions?


#5

Thanks, that means a lot :relaxed:

Oh, there is a ton of depth to it and I hope the folks at ElixirLive will agree and find it as fascinating as I do :smiley:

I haven’t yet read too many papers in depth about it. There’s tons of room for contributions. I usually keep a good backlog of issues/features that have come to my mind so that I don’t forget them and so that possible contributors have a place to get started.

Off the top of my head, a couple of particularly interesting/important ones:

Of course there is much more, new statistics, providing more system information… there’s lots to do :slight_smile:


#6

In case you’re interested and from the greater Hamburg (Germany) area, I’ll be at hh.ex tomorrow talking about benchmarking and benchee, with the goal of also hacking on benchee together to implement some tiny features (some are specially tagged on GitHub for this).

So if you like, here is the meetup page :slight_smile:


#7

New tiny releases, benchee 0.7, benchee_html 0.2 and benchee_json 0.2, made their way to hex yesterday evening :slight_smile:

The biggest feature is that the benchee_html report is now properly split up. Other fixes include goodies like relaxing the Poison dependency as well as adjusting some outputs and parallel statistics generation.

More cool stuff to come, of course. Benchmarking talks will also be given:

In case anyone wants to hang out and hack :wink:


#8

I’ve used Benchee a couple of times, and it’s definitely a solid solution - probably my “go to” one for all my benchmarking needs.

That said, one thing that bothers me each time I look at the README is that the example benchmark is not inside a module. Code that is not inside modules is not compiled but interpreted. This gives vastly different performance characteristics and makes benchmarks pretty much useless.
This is not a huge problem in the example, since the functions immediately call a module (so only the initial, anonymous function call is interpreted), but can lead to false results with more complex things.
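
To make the point concrete, here is a rough sketch of moving the measured work into a module so that only the thin anonymous wrappers are evaluated outside of one. `FlatMapBench` and its function names are made up for illustration, and the `Benchee.run` call is left as a comment so the snippet runs without benchee installed.

```elixir
defmodule FlatMapBench do
  # Hypothetical module; everything defined in here is compiled.
  def flat_map(list), do: Enum.flat_map(list, &pair/1)

  def map_flatten(list), do: list |> Enum.map(&pair/1) |> List.flatten()

  defp pair(i), do: [i, i * i]
end

list = Enum.to_list(1..10_000)

# The wrappers stay tiny; the work being measured runs in compiled module code.
jobs = %{
  "flat_map" => fn -> FlatMapBench.flat_map(list) end,
  "map.flatten" => fn -> FlatMapBench.map_flatten(list) end
}

# With benchee available, the suite would then be started with:
# Benchee.run(jobs, time: 10)
IO.inspect(Map.keys(jobs))
```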


#9

Hey! Thanks for the input.

Yes, definitely - you should call functions that are defined in modules somewhere and compiled, and that’s how I do it. Maybe it should be made clearer. E.g. I usually just have my benchmarks call functions I have defined elsewhere or a couple of Ecto functions :wink:

Still, I’d like for the whole suite to be properly compiled as being close to production systems is super important and I can see how people would create erroneous benchmarks through this.

Still - I’d like to know more about how Elixir and the BEAM VM work internally, but information seems to be really sparse (I read the ELI5 for BEAM and some usual gotchas but it’s… not much). Pointers are very welcome.

My understanding is that .exs files are interpreted while the .ex are compiled - correct?

The only way that comes to mind, assuming my understanding is correct, is to have people define their code in a module in a .ex file and then have a script or some executable call it. Something like:

defmodule MyBenchmark do
  def benchmark do
    Benchee.run(...)
  end
end

And then in a script file just do:

MyBenchmark.benchmark()

Is that the only way? Is there a better way? Your input and/or pointers would be highly appreciated @michalmuskala (+ of course everyone else!) :slight_smile:


#10

Modules are compiled, no matter if in .exs or .ex file - the only difference is if there’s a compilation artefact (in the form of a .beam file produced). Code outside modules is not compiled.
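
A small sketch to illustrate this: `defmodule` returns the compiled bytecode as part of its result, even when it is evaluated from a script. The file name `bench.exs` and the module `ScriptDefined` are made up for the example.

```elixir
# bench.exs (hypothetical file name), run with `elixir bench.exs`
{:module, name, bytecode, _last_result} =
  defmodule ScriptDefined do
    def double(x), do: x * 2
  end

# The module was compiled to BEAM bytecode held in memory;
# running it as a script does not leave a .beam file behind.
IO.inspect(name)
IO.inspect(is_binary(bytecode))

# Calls into the module run compiled code, even from the script's top level.
IO.inspect(ScriptDefined.double(21))
```

So a benchmark script can keep its module definitions and the call that starts the benchmark in the same .exs file, as long as the code being measured lives inside a module.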

When it comes to some reference on BEAM internals, I guess the newly released “BEAM Book” is the best comprehensive source out there https://happi.github.io/theBeamBook/ - it’s not complete, but it’s an awesome resource nonetheless.


#11

Thanks a ton! Will try to get some of that information into the README of benchee and/or the wiki :slight_smile:

Also thanks for the link to the book - I had missed that until now. Looking forward to carving out the time to read it!


#12

Benchee 0.9.0 has been released (Changelog). The main features are gathering more system data and some compatibility for calling benchee from Erlang (sample project).

As a side note, I’m at Erlang User Conference atm and will also talk about benchmarking tomorrow. Feel free to come by and chat, love to do that (usually green t-shirt and/or Liefery pullover).


#13

Hello again beloved elixirforum! After quite the break, 0.10 is here with hooks and other great improvements especially in the HTML formatter!


#14

After long discussions about what the API of benchee ought to look like, we decided on a drastic change - we need to change the name to bunny!

You can check out the bunny repository or get it straight from hex.


#15

@PragTob will you be updating the Elixir School lesson on Benchee to reflect these changes? That would be wonderful :grin:

Realizing the date now.

Perhaps it would have been better to make a new post rather than to edit an existing post about a real package as a prank.


#16

Oh, sorry definitely will! Learners shouldn’t miss out on bunnytastic material :rabbit::rabbit::rabbit:

Maybe I can coax @devonestes into doing it though? :thinking:

Devon, pleeasseeeee!

:rabbit2::rabbit2::rabbit2:


#17

@PragTob is there an overview of the differences between the two APIs? They seem nearly identical to me.

From Benchee:

list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

Benchee.run(%{
  "flat_map"    => fn -> Enum.flat_map(list, map_fun) end,
  "map.flatten" => fn -> list |> Enum.map(map_fun) |> List.flatten end
}, time: 10)

From Bunny:

list = Enum.to_list(1..10_000)
map_fun = fn(i) -> [i, i * i] end

Bunny.eat(%{
  "flat_map"    => fn -> Enum.flat_map(list, map_fun) end,
  "map.flatten" => fn -> list |> Enum.map(map_fun) |> List.flatten end
})

Or is the difference in functions other than the main benchmarking function?


#18

This is a very obvious April Fools’ prank…

EDIT: it got me more or less until the code example that shows Bunny.eat(...).


#19

Haha, maybe I shouldn’t be reading the forum so early :sweat_smile:


#20

Sorry if I really fooled anyone :slight_smile:
Wanted to make it really clear through hyperbole, version numbers (1.4.2018) and great new features like “bunny assistant” that it’s a joke, so that nobody ends up changing all their code to use bunny.

If someone wants to continue using bunny - it’s totally usable. It’s only a thin wrapper around benchee with a couple of defdelegate's and I don’t expect to break it.
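
For anyone curious what such a thin wrapper looks like, here is a minimal sketch of the `defdelegate` technique. `FakeBenchee` is a stand-in so the snippet runs without dependencies, and `BunnySketch` is a made-up module, not the actual bunny source.

```elixir
defmodule FakeBenchee do
  # Stand-in for the real library; it only echoes what it was called with.
  def run(jobs, opts \\ []), do: {:ran, Map.keys(jobs), opts}
end

defmodule BunnySketch do
  # Hypothetical wrapper: each function simply forwards to FakeBenchee.run.
  defdelegate eat(jobs), to: FakeBenchee, as: :run
  defdelegate eat(jobs, opts), to: FakeBenchee, as: :run
end

IO.inspect(BunnySketch.eat(%{"flat_map" => fn -> :ok end}, time: 10))
```

Because the wrapper only forwards calls, it keeps working as long as the function it delegates to keeps its name and arity.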

To be doubly sure, I’m adding “this was a prank” notices everywhere applicable :slight_smile:

@doomspork I thought about making a new post, but I once did that for a new release or feature and the posts got merged together, so I figured that’s not how things are supposed to be done here :thinking:

edit: can someone from the admins please change the title back? Thanks! :slight_smile: