I noticed that the :rand module (standard library) is a bit slow when an algorithm relies heavily on the random number generator. I checked the source code and realized that it's written in Erlang instead of native code (e.g. BIFs). That led me to a question: if the :rand module were written in native code, how much performance could it gain?
Therefore, I wrote a NIF in Rust (via rustler) for :rand.uniform and benchmarked the :rand module against the newly created RandNif module. The benchmark results are as follows:
```
Compiling NIF crate :randnif (native/randnif)...
    Finished release [optimized] target(s) in 0.06s

Operating System: macOS
CPU Information: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.8.1
Erlang 21.2.4

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 10 s
memory time: 5 s
parallel: 1
inputs: none specified
Estimated total run time: 1.13 min

Benchmarking :rand.uniform/0...
Benchmarking :rand.uniform/1...
Benchmarking RandNif.uniform/0...
Benchmarking RandNif.uniform/1...

Name                        ips        average  deviation         median         99th %
RandNif.uniform/0      145.35 K        6.88 μs   ±192.62%        5.96 μs       14.30 μs
RandNif.uniform/1       71.77 K       13.93 μs    ±38.32%       12.55 μs       22.28 μs
:rand.uniform/0         44.99 K       22.23 μs    ±42.47%       20.53 μs       42.35 μs
:rand.uniform/1         37.81 K       26.45 μs    ±34.67%       24.31 μs       43.62 μs

Comparison:
RandNif.uniform/0      145.35 K
RandNif.uniform/1       71.77 K - 2.03x slower
:rand.uniform/0         44.99 K - 3.23x slower
:rand.uniform/1         37.81 K - 3.84x slower

Memory usage statistics:

Name                 Memory usage
RandNif.uniform/0         3.11 KB
RandNif.uniform/1         1.56 KB - 0.50x memory usage
:rand.uniform/0          11.97 KB - 3.85x memory usage
:rand.uniform/1          10.41 KB - 3.35x memory usage

**All measurements for memory usage were the same**
```
As we can see, :rand.uniform/0 is 3.23x slower than RandNif.uniform/0.
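For reference, a NIF like this can be quite small. Below is a hypothetical sketch of what RandNif might look like, assuming rustler 0.22+ (with the `#[rustler::nif]` attribute macro) and the rand 0.8 crate; it is not the actual code from the repository linked at the end.

```rust
// Hypothetical RandNif sketch (NOT the author's exact code).
// Assumes Cargo dependencies: rustler = "0.22+", rand = "0.8".
use rand::Rng;

// Counterpart of :rand.uniform/0 — a float in [0.0, 1.0).
#[rustler::nif]
fn uniform() -> f64 {
    // thread_rng() keeps its state in thread-local storage,
    // so there is no process-dictionary lookup on each call.
    rand::thread_rng().gen::<f64>()
}

// Counterpart of :rand.uniform/1 — an integer in 1..=n.
#[rustler::nif]
fn uniform_range(n: i64) -> i64 {
    rand::thread_rng().gen_range(1..=n)
}

rustler::init!("Elixir.RandNif", [uniform, uniform_range]);
```

On the Elixir side this would be loaded with `use Rustler` in a `RandNif` module; the function names here are illustrative.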
(I am not very confident about this part; please correct me if I am wrong.) But a NIF is not free: the call itself has a cost. So I ran another simple experiment, comparing a noop NIF function against RandNif.uniform/0, and measured that the NIF call overhead accounts for about 67% of RandNif.uniform/0's time. That may imply it would be roughly 3x faster as a BIF instead of a NIF.
Therefore, if :rand.uniform/0 were written in native code (as a BIF), my non-scientific estimate is that it would be 9.69x faster (3.23 × 3).
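That arithmetic can be sanity-checked directly from the measured averages above (6.88 μs for RandNif.uniform/0, 22.23 μs for :rand.uniform/0) together with the ~67% overhead figure. This is just my reading of the overhead model, not a measurement:

```rust
// Back-of-envelope check of the "3x faster as a BIF" estimate.

/// Time left for actual RNG work once the NIF call overhead
/// (a fraction of the total) is removed.
fn core_time_us(nif_avg_us: f64, overhead_fraction: f64) -> f64 {
    nif_avg_us * (1.0 - overhead_fraction)
}

fn main() {
    let core = core_time_us(6.88, 0.67); // ~2.27 us of real RNG work
    // A BIF would pay (almost) none of the NIF call overhead:
    println!("speedup vs NIF:   {:.2}x", 6.88 / core);  // ~3.03x
    println!("speedup vs :rand: {:.2}x", 22.23 / core); // ~9.79x
}
```

The ~9.79x it produces lines up with the rougher 9.69x (3.23 × 3) estimate above.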
Then I had a brief look at the source code of Erlang's :rand module and Rust's rand crate (the one used in the NIF), and I believe native code has the following advantages:
- Each random number generator needs to initialize a seed and save it somewhere for the next call. Erlang's :rand module saves it in the process dictionary, but native code can save it at the thread level, which means faster access/update, smaller memory usage, and better cache hits. Native code's advantage grows when there are a lot of Erlang processes.
- Native code is simply faster at the computation itself.
Given that :rand is a fairly commonly used module, maybe one day the BEAM core team would consider creating BIFs for it?
All code / benchmark / scripts are available at https://github.com/gyson/rand_nif.
Thanks for reading!