Benchee & formatters - easy and extensible (micro) benchmarking

This is why this is a bad idea. You’ve edited a real post and now require admins to come undo a prank.

I got April Fools’ when I was a child. As an adult it seems like a waste of everyone’s time, now including the forum admins who have to edit the post.

:-1:

Edit: Furthermore, you’ve parked a package on Hex, as a prank, that no one else can use now. If everyone starts doing this, Hex will be a wasteland.

This is truly in poor taste

1 Like

Sorry you don’t like this, I didn’t mean to offend anyone with it. It was meant to put a little smile on some people’s faces.

I didn’t edit the title of the post, it was done by an admin (I can’t edit it myself and wouldn’t have done so).

Regarding Hex, the package remains functional - it’s no more of a wasteland than the packages people build, push up for fun and then abandon.

2 Likes

That’s something we shouldn’t encourage people to do though and when respected members of the community start to do it, it signals to other people that it’s acceptable.

Just something to keep in mind.

Ninja Edit: You’re right though. There are a lot of abandoned packages and placeholders. It’s something we should actively work to fight against. I’m the first to comment on posts or threads in which people advocate parking names or pushing up empty/useless packages.

1 Like

I fell for it as well and changed the title of the thread to reflect the ‘new’ name :icon_redface: I have reverted the changes.

It would be good to free up the Hex name in case someone has a use for it.

4 Likes

I wonder if there are any policies in place to deal with unused packages?

In a time, long, long ago (maybe 1-2 years now?) I believe Eric actually went through Hex and booted off empty and abandoned packages. I’m not sure it’s actively enforced.

But it’s a real problem. I came across a blog post last week that was just the author encouraging people to push up Hex packages with hello world in them. :cold_sweat:

2 Likes

Wow, that sounds bad :frowning:

I just checked and there’s no way in the API that lets you delete a package. You can only “retire” package versions that shouldn’t be used anymore. But they’re still available.

I mean, I guess it’s good to avoid a “left_pad” situation… but if a package has all its versions retired and they have 0 “recent” downloads, maybe then the package can be fully retired and the namespace reclaimed?

edit: Also thanks @AstonJ and sorry - I really didn’t wanna fool anyone :icon_redface:

I didn’t want to be the first to say it, but I agree. Claiming a name on hex is bad. You should have been sure that it could be reversed before publishing it.

Regarding the prank, you should never assume people will see through a prank on the internet, especially when such people might come from a completely different culture or have a poor grasp of the language you’ve written in (I’m not saying any of these was at play here, this is just a general admonition).

In my opinion (and this is a very subjective topic), a new forum topic would have been acceptable, as would have been a new GitHub repo, but claiming a hex package name is in really bad taste. Changing the title of the topic was also kinda bad, but I can see it wasn’t your fault.

These kinds of pranks only work well in a broadband communication channel (like communication in real life) in a relatively homogeneous community. Communication through text with strangers on the internet is as far from the above as you can get…

1 Like

No problem about the title Toby - I was in ‘auto’ mode so didn’t really pay any more attention than I felt I had to.

I wonder if it’s worth bringing the wider issue - of unused or ‘throwaway’ uses of Hex - to the attention of @ericmj - Eric, we can split this into a dedicated thread or start a new one if you think it’s worth discussing further?

1 Like

Parking names or package squatting is against the code of conduct so if you notice it please let us know at support@hex.pm. The bunny package does not break this rule since it has functionality.

Unused or abandoned packages will always stay on hex.pm because there is no way of knowing if they are in fact unused in reality. We have never gone through and deleted unused packages, but we may delete packages with no functionality or with only “hello world” because they are against the squatting policy.

You can take over the name of an abandoned package, but all old versions will remain and you can only publish new versions.

4 Likes

benchee 0.13.0 is finally released and ready for you to try out brand new memory measurements :wink: changelog

4 Likes

And 0.13.1 is live for your pleasure - includes fixes to memory measurements that were exposed by @michalmuskala! Fun stuff like you know, heap_size in the garbage collection information isn’t actually the heap size but the size of the young generation apparently. Anyhow, we’re happy if more people try out memory measurements and break them.

Changelog

3 Likes

And 0.14.0 is finally live for your pleasure! The major feature is drastically increased precision when measuring: from microseconds to nanoseconds! There’s a blog post showing off the most important changes but you can also check out the Changelog with all the changes.

6 Likes

It has its own thread but I think it’s worth mentioning here as well: benchee_markdown - the first formatter not written by me/benchee maintainers which makes me very happy because that’s what I always wanted. :slight_smile:

Also we’re gearing up for 1.0 (finally!) - stay tuned :sunglasses: Not too many changes incoming but more the promise of a stable API/data structure until 2.0 which is hopefully a long way off.

In preparation for this, and to showcase that benchee is a project many people contribute to (be it code, reviews, input etc. @devonestes @wasnotrice or “just” ideas, testing and help with a lot of the really hard stuff or the occasional emoji react so one knows something is a good idea @OvermindDL1 @michalmuskala), benchee and related libraries were moved to https://github.com/bencheeorg :slight_smile: Another good side effect: people other than myself (namely devon and eric so far) can add people, manage things etc.

@mods / @AstonJ / ??? could we maybe rename this thread to just be “benchee & formatters - easy and extensible (micro) benchmarking”, because I mean there’s benchee_html, benchee_json we could also name them all in the title but that’d get long and just having benchee/benchee_csv there seems kinda irritating :slight_smile:

3 Likes

So here is what was promised! Benchee 0.99 and 1.0 (along with all plugins) are a reality :tada: :tada::tada:

There aren’t many super hardcore new features, except for showing the absolute difference in the average of measurements to help arguments & informed decisions. Otherwise it’s some polish, documentation, housekeeping etc.

0.99 introduces a bunch of deprecation warnings, when you eliminate them you should be good to upgrade to 1.0.

Changelog
Blog Post

I also want to thank everyone here for their input etc :green_heart:

Happy benchmarking

8 Likes

PSA for benchee users on Mac doing nano level benchmarks: There’s some weird behaviour in Erlang time measurement on macOS - if you do nanosecond/low microsecond level benchmarks on macOS this likely affects you and generates wrong results. More information here: https://github.com/bencheeorg/benchee/issues/313

2 Likes

Hello friends, long time no see. Enjoy a new Benchee version with a bugfix (finally!), reduction counting and profiler after benchmarking (yay!) as well as some musings on Open Source and why it took so long.

7 Likes

Hi @PragTob!

Thank you for your awesome work on benchee!

I have a question regarding std dev. When benchmarking a function that serializes some struct into a binary, I am noticing enormous std dev, like 21k%

Operating System: Linux
CPU Information: Intel(R) Core(TM) i5-9600K CPU @ 3.70GHz
Number of Available Cores: 6
Available memory: 15.55 GB
Elixir 1.14.2
Erlang 25.1

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 2 s
reduction time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 11 s

Benchmarking raw_attr.encode ...

Name                      ips        average  deviation         median         99th %
raw_attr.encode        5.17 M      193.34 ns ±21393.07%         137 ns         162 ns

Extended statistics: 

Name                    minimum        maximum    sample size                     mode
raw_attr.encode          132 ns    80013984 ns         8.88 M                   136 ns

Memory usage statistics:

Name               Memory usage
raw_attr.encode            32 B

**All measurements for memory usage were the same**

Reduction count statistics:

Name            Reduction count
raw_attr.encode               1

**All measurements for reduction count were the same**

This basically means that I shouldn’t look at the average result as it might not be reliable. What about adding outlier detection to remove such measurements? Is this something that is planned or welcomed as a contribution?

I am also curious what could be the reason for such a big std dev.

Heyo,

yes your average is skewed by a lot. I recommend looking at the median and mode which give you a realistic picture of what is happening.
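To illustrate why the median and mode hold up better here, a minimal sketch in Python (only because it is quick to run anywhere; this is not benchee code). The sample values are made up for illustration: nine "typical" run times plus one sample inflated by a pause.

```python
import statistics

# Hypothetical run times in nanoseconds (made-up values, not taken from
# the benchmark above): nine typical samples and one huge outlier.
samples_ns = [136, 136, 137, 137, 137, 138, 138, 139, 140, 80_000_000]

mean_ns = statistics.mean(samples_ns)      # dragged up by the single outlier
median_ns = statistics.median(samples_ns)  # middle of the sorted samples
mode_ns = statistics.mode(samples_ns)      # most frequent value

print(f"mean:   {mean_ns:.1f} ns")
print(f"median: {median_ns} ns")
print(f"mode:   {mode_ns} ns")
```

A single extreme sample moves the mean by orders of magnitude, while the median and mode stay with the bulk of the measurements.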

You could also use the HTML formatter - that one gives you graphs that show the distribution to get an even better view.

Essentially what you’re doing is a nano benchmark - most benchmarking tools (and normal measuring tools) don’t even allow that level of precision. For best results the benchmark should be the only thing running on your OS (not even a UI), but that won’t get rid of all of it either - if your OS decides not to schedule/run the Erlang process for even 1ms, that’s already an outlier of roughly 10_000x looking at your numbers there, and hence you get the big std dev.
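As a rough back-of-the-envelope check of that order of magnitude, here is a small Python sketch using the figures from the output posted above (median, maximum, average and the rounded "8.88 M" sample size). The lower bound assumes nothing beyond the definition of the population standard deviation: one sample alone contributes at least its squared distance from the mean, divided by the sample count.

```python
import math

# Figures copied from the benchmark output above.
median_ns = 137
mean_ns = 193.34
max_ns = 80_013_984
sample_size = 8_880_000  # "8.88 M", rounded in the output

# A hypothetical 1 ms scheduling pause, expressed in nanoseconds.
pause_ns = 1_000_000

# How many "typical" runs fit into the pause, and into the slowest sample.
pause_ratio = pause_ns / median_ns
max_ratio = max_ns / median_ns

# Lower bound on the population std dev caused by the slowest sample alone:
# variance >= (max - mean)^2 / n, since all other terms are non-negative.
single_outlier_std_ns = math.sqrt((max_ns - mean_ns) ** 2 / sample_size)
single_outlier_pct = single_outlier_std_ns / mean_ns * 100

print(f"1 ms pause is ~{pause_ratio:,.0f}x the median")
print(f"slowest sample is ~{max_ratio:,.0f}x the median")
print(f"std dev from that one sample alone: >= {single_outlier_std_ns:,.0f} ns "
      f"(~{single_outlier_pct:,.0f}% of the mean)")
```

So even a single very slow sample among millions of fast ones is enough to push the deviation into the thousands of percent.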

I thought there already was an issue for removing outliers as I had been thinking about it a lot, but apparently there is not (just benchmark until confidence value reached which is similar but different: Option to benchmark until confidence value is reached · Issue #9 · bencheeorg/benchee · GitHub )

I should create one, but I’m also unsure what method to use. IIRC there was a definition that I used to know and explained in one of my talks, but I forgot it. I should probably rewatch one of the talks to see how to implement it :joy:
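For reference, one widely used definition is Tukey’s fences: anything further than 1.5 × IQR below the first quartile or above the third quartile counts as an outlier. Whether that is the definition from the talk, or what benchee would end up using, is an assumption on my part. A minimal Python sketch (the function name `remove_outliers_iqr` and the sample values are made up for illustration):

```python
import statistics

def remove_outliers_iqr(samples, k=1.5):
    """Drop samples outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [x for x in samples if lower <= x <= upper]

# Hypothetical run times in nanoseconds with one extreme sample.
samples_ns = [136, 136, 137, 137, 137, 138, 138, 139, 140, 80_000_000]
filtered_ns = remove_outliers_iqr(samples_ns)

print(f"kept {len(filtered_ns)} of {len(samples_ns)} samples")
print(f"mean before: {statistics.mean(samples_ns):.1f} ns")
print(f"mean after:  {statistics.mean(filtered_ns):.1f} ns")
```

The factor `k=1.5` is the conventional default; a larger value keeps more of the tail, which may matter if the slow runs are something you actually want to see.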

2 Likes

Created an issue: Built in outlier detection · Issue #382 · bencheeorg/benchee · GitHub

Further thoughts: esp. due to outliers etc. average is often a terrible value but one we’re used to. I’d always look at all values, but also esp. the median. I talk about that a bit here if you wanna know more: Stop Guessing and Start Measuring - Benchmarking in Practice - Tobias Pfeiffer - EUC17 - YouTube

Another reason for the stddev is simply the GC running, the process memory growing etc.

1 Like