Xb5_elixir - B-tree ordered maps / sets / multisets

Hi all,

I’m pleased to announce xb5_elixir, a wrapper of xb5, providing B-tree (order 5) alternatives to OTP’s gb_trees and gb_sets, as well as a multiset with order-statistic operations.

If you need ordered collections, xb5 benchmarks show 1.2–2.5× faster execution compared to the OTP modules for most operations, and equal or lower heap usage.

The three Elixir modules are:

  • Xb5.Bag - ordered multiset with order-statistic O(log n) operations (rank, access by index, percentile)

  • Xb5.Set - ordered set, alternative to gb_sets

  • Xb5.Tree - ordered key-value store, alternative to gb_trees

Performance

The underlying Erlang modules were benchmarked together with gb_sets and gb_trees on two separate computers. All tests ran across three build types, and collection sizes from 0 through 15,000.

Two interactive performance reports are published:

Benchmarking code and raw data are available in the xb5_benchmark repo.


5 Likes

Wow, very good library. I did some benchmarks against maps and found that xb5 is slower than maps for put, but slightly faster for get and many times faster for conversion to ordered list

However, I find separate repo for benchmarks a little overengineered, given that I was able to reproduce your results with Benchee in like 10 minutes or so.

1 Like

Thank you @Asd! Map is generally faster at mutations and lookups - potentially much faster as collection size grows. Trees are useful when it’s a requirement to have the data sorted at all times with scalable performance.

Regarding benchmarks, I did use Benchee for a long time but struggled to get stable performance curves covering a wide range of collection sizes (0 to 15k) for the same test (the graphs on the report), or getting consistent results across runs without very long warm up times (and even then).

I solved this with a custom benchmarking framework that maximizes CPU cache pressure while keeping testing conditions overall uniform for all the cases, and that runs repeatedly until some semblance of stability has been achieved instead of running for a fixed amount of time. To be honest, a lot of the work in this project went into benchmarking alone.

2 Likes

I think that you should add GitHub - discord/sorted_set_nif: Elixir SortedSet backed by a Rust-based NIF · GitHub to your benchmarks. I assume that NIF will be just a little bit faster, which is a huge win for you

Again, congratulations on the initial release. Very solid work. I am definitely going to use it in my projects!

1 Like