Fast Mergeable Integer Maps: Maybe interesting to port this to Elixir?

Hello, everyone.

At the University I am regularly working in Haskell to make certain exercises for the Advanced Algorithms and Datastructures cource. A few days ago, I came across Haskell’s Data.IntMap and Data.IntSet containers. These turn out to be ridiculously fast, running most operations (insert, find, delete, merge) in O(min(n, W)) time, where W is 32 for 32-bit machines and 64 for 64-bit machines.

This sounded very intriguing, since it goes against the understanding I had thus far about Hashmaps, which is that they are usually implemented as Red-Black trees (or similar auto-balancing binary trees) or Finger trees (I believe this is what Erlang’s maps are built on?), which take O(log(n)) time for these operations.

Haskell’s IntMap and IntSet, however, are built on so-called Patrizia trees, which store, in their nodes, the binary representation of (a part of the prefix of) the given key. The ‘drawback’ is of course that you cannot have keys larger than 2^W, but for many applications, this is perfectly fine.

The original paper called Fast Mergeable Integer Maps, written by Chris Okasaki and Andy Gill can be found here.
I don’t have the time in the near future to build an implementation of these in Elixir, but I think it would be an intriguing project, since I expect that maps/sets built on these kinds of trees will be faster than Elixir/Erlang’s built-in maps (even though the latter can be optimized by the BEAM), because of the speed that restricting the key type will bring.

I wanted to share this here because I expect that I’m not the only one who finds this interesting, and maybe someone would like to port this to Elixir :slight_smile: .

10 Likes

Porting a solved problem you already understand is a wonderful exercise to learn more about elixir.

Having a goal like such a library available on hex afterwards will always be a benefit in the future.

How about you porting it to elixir? :slight_smile:

2 Likes

I might have a go at it myself, but as I already have ten different packages on Hex that I try to maintain and on the side make long days at the university and at work, I expect to have little time to do so in the near future. So if someone else wants to hack away on this: please go ahead! :smiley:

4 Likes

I would love to look at it, but i am a bit busy too. Additionaly, to really use Patrizia trees properly, i am affraid we would need to go down into NIFs. Otherwise you would need to use binaries to implement them.

Or i am missing something.

3 Likes

I have an implementation idea in my mind, I’m not quite sure how efficient it might be, it would be pure elixir though…

It’s a pity that I have a filled schedule currently, so I can’t implement it right away. I do hope I can get up a POC up and running this weekend…

2 Likes

I believe you can just use Integers and use Bitwise.band to check if the desired bit of the given node is 1 or 0, which would be faster than binaries (And hopefully the BEAM can optimize this to use word-sized ints).

2 Likes