Hi @PragTob!
Thank you for your awesome work on benchee!
I have a question regarding std dev. When benchmarking a function that serializes a struct into a binary, I see an enormous std dev of about 21,000%:
Operating System: Linux
CPU Information: Intel(R) Core(TM) i5-9600K CPU @ 3.70GHz
Number of Available Cores: 6
Available memory: 15.55 GB
Elixir 1.14.2
Erlang 25.1
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 2 s
reduction time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 11 s
Benchmarking raw_attr.encode ...
Name                      ips        average    deviation         median         99th %
raw_attr.encode        5.17 M      193.34 ns  ±21393.07%         137 ns         162 ns
Extended statistics:
Name                  minimum          maximum    sample size           mode
raw_attr.encode        132 ns      80013984 ns         8.88 M         136 ns
Memory usage statistics:
Name               Memory usage
raw_attr.encode            32 B
**All measurements for memory usage were the same**
Reduction count statistics:
Name               Reduction count
raw_attr.encode                  1
**All measurements for reduction count were the same**
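As I understand it, this means I shouldn't rely on the average here: at 193.34 ns it sits above both the median (137 ns) and the 99th percentile (162 ns), so it seems to be driven by a few extreme samples such as the 80013984 ns maximum. What about adding outlier detection to exclude such measurements? Is this planned, or would it be welcome as a contribution?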
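To make the outlier detection idea concrete, here is a minimal sketch of one common approach, Tukey's fences (dropping samples outside Q1 - 1.5 * IQR and Q3 + 1.5 * IQR). The module `OutlierSketch`, its function names, and the sample values are all made up for illustration. This is not benchee's API, and I am not claiming this is how it should be implemented there.

```elixir
defmodule OutlierSketch do
  # Hypothetical helper module for illustration only, not part of benchee.

  # Nearest-rank percentile of an already sorted, non-empty list.
  def percentile(sorted, p) when p > 0 and p <= 100 do
    rank = ceil(p / 100 * length(sorted))
    Enum.at(sorted, max(rank - 1, 0))
  end

  # Drops samples outside the Tukey fences [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
  def without_outliers(samples) do
    sorted = Enum.sort(samples)
    q1 = percentile(sorted, 25)
    q3 = percentile(sorted, 75)
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    Enum.filter(samples, fn s -> s >= lower and s <= upper end)
  end

  # Arithmetic mean of a non-empty list.
  def mean(samples), do: Enum.sum(samples) / length(samples)
end

# Made-up run times in ns: most are close to the reported median,
# plus one spike equal to the reported maximum.
samples = [132, 135, 136, 136, 137, 137, 138, 140, 162, 80_013_984]
kept = OutlierSketch.without_outliers(samples)

IO.inspect(OutlierSketch.mean(samples), label: "mean with outliers")
IO.inspect(OutlierSketch.mean(kept), label: "mean without outliers")
```

With data this tightly clustered the fences are narrow, so the 162 ns sample is dropped along with the spike. A real implementation would probably want the 1.5 multiplier to be configurable and would report how many samples were excluded.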
I am also curious what could cause such a large std dev.






















