I noticed that the :rand module (standard library) is a little bit slow when the algorithm I'm using relies heavily on the random number generator. I checked the source code and realized that it's written in Erlang instead of native code (e.g. BIFs).
Then a question came up: if the :rand module were written in native code, how much performance could it gain?
Therefore, I wrote a NIF in Rust (via rustler) for :rand.uniform and benchmarked the :rand module against the newly created RandNif module. The benchmark result is as follows:
```
Compiling NIF crate :randnif (native/randnif)...
    Finished release [optimized] target(s) in 0.06s

Operating System: macOS
CPU Information: Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.8.1
Erlang 21.2.4

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 10 s
memory time: 5 s
parallel: 1
inputs: none specified
Estimated total run time: 1.13 min

Benchmarking :rand.uniform/0...
Benchmarking :rand.uniform/1...
Benchmarking RandNif.uniform/0...
Benchmarking RandNif.uniform/1...

Name                     ips        average  deviation         median         99th %
RandNif.uniform/0   145.35 K        6.88 μs   ±192.62%        5.96 μs       14.30 μs
RandNif.uniform/1    71.77 K       13.93 μs    ±38.32%       12.55 μs       22.28 μs
:rand.uniform/0      44.99 K       22.23 μs    ±42.47%       20.53 μs       42.35 μs
:rand.uniform/1      37.81 K       26.45 μs    ±34.67%       24.31 μs       43.62 μs

Comparison:
RandNif.uniform/0   145.35 K
RandNif.uniform/1    71.77 K - 2.03x slower
:rand.uniform/0      44.99 K - 3.23x slower
:rand.uniform/1      37.81 K - 3.84x slower

Memory usage statistics:

Name              Memory usage
RandNif.uniform/0      3.11 KB
RandNif.uniform/1      1.56 KB - 0.50x memory usage
:rand.uniform/0       11.97 KB - 3.85x memory usage
:rand.uniform/1       10.41 KB - 3.35x memory usage

**All measurements for memory usage were the same**
```
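For a sense of how little work a native uniform call does per invocation, here is a hypothetical sketch in plain Rust. This is not the rustler code from the repo (which uses the rand crate); the xorshift64* generator and all names here are made up for illustration:

```rust
// A toy xorshift64* generator: the per-call work is just a handful of
// integer shifts/xors and one multiply, which is why native code is fast.
fn xorshift64star(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    *state = x;
    x.wrapping_mul(0x2545F4914F6CDD1D)
}

// Map the top 53 bits to a float in [0.0, 1.0), analogous to :rand.uniform/0.
fn uniform(state: &mut u64) -> f64 {
    (xorshift64star(state) >> 11) as f64 * (1.0 / (1u64 << 53) as f64)
}

fn main() {
    let mut state = 0x9E3779B97F4A7C15; // any non-zero seed works
    for _ in 0..5 {
        let u = uniform(&mut state);
        assert!(u >= 0.0 && u < 1.0);
        println!("{}", u);
    }
}
```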
As we can see, :rand.uniform/0 is 3.23x slower than RandNif.uniform/0.
(I am not very confident about this part, please correct me if I am wrong.) But a NIF is not free; it has some cost as well. So I ran another simple experiment and, by comparing a no-op NIF function with RandNif.uniform/0, measured the overhead of the NIF call itself at about 67% of RandNif.uniform/0's total time. That implies the same code could be roughly 3x faster as a BIF, with no NIF-call overhead. So if :rand.uniform/0 were written in native code (BIF), by my non-scientific estimation it would be about 9.69x faster than it is today (3.23 x 3).
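The back-of-envelope arithmetic behind those numbers can be spelled out; the fractions below are just the measurements quoted above, not new data:

```rust
fn main() {
    // Measured via the no-op NIF: ~67% of RandNif.uniform/0's time is call overhead.
    let nif_overhead_fraction = 0.67;
    // Removing that overhead leaves ~33% of the time, i.e. roughly a 3x speedup.
    let bif_speedup = 1.0 / (1.0 - nif_overhead_fraction); // ~3.03x
    // Measured gap between :rand.uniform/0 and RandNif.uniform/0.
    let measured_gap = 3.23;
    // The post rounds 3.03 down to 3, giving 3.23 x 3 = 9.69.
    let estimated_total = measured_gap * 3.0;
    println!("BIF vs NIF speedup: {:.2}x", bif_speedup);
    println!("Estimated BIF vs :rand speedup: {:.2}x", estimated_total);
}
```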
Then I had a brief look at the source code of Erlang's :rand module and Rust's rand crate (the one used in the NIF), and I believe native code would have the following advantages:
- Each random number generator needs to initialize a seed and save it somewhere for the next call. Erlang's :rand module saves it in the process dictionary, but native code could save it at the thread level, which means faster access/updates, smaller memory usage, and better cache hits. Native code would have an even bigger advantage when there are lots of Erlang processes.
- Native code is simply faster at the computation itself.
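The "state at the thread level" idea can be sketched in Rust with thread-local storage: each thread keeps its own generator state, so a call needs no process-dictionary lookup and no locking. The generator is the same toy xorshift64* as before, purely for illustration; a real BIF would use :rand's actual algorithms:

```rust
use std::cell::Cell;

// Per-thread RNG state: initialized once per thread, then updated in place.
thread_local! {
    static RNG_STATE: Cell<u64> = Cell::new(0x9E3779B97F4A7C15);
}

// Returns a float in [0.0, 1.0) using this thread's private state.
fn uniform() -> f64 {
    RNG_STATE.with(|s| {
        let mut x = s.get();
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        s.set(x);
        (x.wrapping_mul(0x2545F4914F6CDD1D) >> 11) as f64
            * (1.0 / (1u64 << 53) as f64)
    })
}

fn main() {
    // Each thread gets its own independent state; no coordination needed.
    let worker = std::thread::spawn(uniform);
    let a = uniform();
    let b = worker.join().unwrap();
    assert!(a >= 0.0 && a < 1.0 && b >= 0.0 && b < 1.0);
    println!("main: {a}, worker: {b}");
}
```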
Given that :rand is a fairly commonly used module, maybe one day the BEAM core team would consider creating BIFs for it?
All code / benchmark / scripts are available at https://github.com/gyson/rand_nif.
Thanks for reading!