InPlace - mutable data structures based on atomics

InPlace is an experimental library that has implementations of several ADTs based on atomics module

In short, the ‘atomics’ API allows operations on mutable arrays of integers.

What is implemented:

  • arrays
  • stacks
  • queues
  • heaps
  • priority queues
  • linked lists

Check out the implementation of X algorithm by Donald Knuth, accompanied by the implementation of Sudoku as an example of usage.

X algorithm is built on the idea of Dancing Links, which relies on the modification of linked lists in place.
The implementation of Dancing Links with immutable data is usually considered to be close to impossible :slight_smile:

5 Likes

lmao this is so unhinged I absolutely love it.

Does anyone know how the lookup from reference => atomic is actually done internally? Is there a table, or something more clever? I feel like you could do some sort of inline caching-style trick.

1 Like

This is awesome to see, although be careful about atomics as they can be slow, I had to completely avoid them for my SOTA pool implementation as they got extremely slow due to how schedulers work in conjunction with processes that can move around freely.

If you can keep your atomics on each scheduler then you can avoid global contention.

2 Likes

I just found out that Donald Knuth had his birthday yesterday.
I have to say that a sizeable part of my motivation to do this project was to have more of Knuth algos in Elixir :-).

2 Likes

Rediscovering Erlang from first principles lol.

But they should be faster than any other way to do shared memory across cores, right? I assume you have a use-case where you want shared memory on each core but not across cores? Like the ETS decentralized_counters trick?

1 Like

They are indeed faster, and my first attempt with atomics was extremely fast, I only saw the issues when I had to push harder for correctness and extreme concurrency where my pool ended up allowing multiple processes to use the same connection at the same time.

So I had to lean into the BEAM for correctness which also reduced cache misses, I could properly have kept the original design and just built a queue, but it got to a place where I had to trade simplicity vs complexity.

All that being said, I want to see more of this, since we can get bare metal performance on the BEAM without writing low level code.

3 Likes

I’m using :atomics in XKS (Hobbes’s storage engine) to synchronize the LSM epoch so the readers can quickly check the current epoch from the single writer without message passing. When they’re done with an epoch they can just message the writer what epoch they’re on and it can GC old tables. I’ll write more about it soon; I think it’s pretty optimal :atomics usage. I don’t see how anything else could be faster given that even ETS will take locks.

But I am interested in mutable data structures (like the OP) because I’ve found that mixing mutable and immutable state leads to bugs. E.g. I had code that would update an ETS table and then return a new struct, and because the mental model was that of mutable state (this is a database) I on multiple occasions forgot to keep the new struct, and this caused very subtle bugs. Thankfully I have aggressive correctness tests, but it’s easier if you don’t write bugs to begin with :slight_smile: I try to store metadata like that in the ETS table itself now so that it can’t get out of sync.

Immutability is great for preventing bugs, but not in every situation.

1 Like

Reference is literally a pointer into the memory to the refcounter. And the atomic is just a refcounted binary

2 Likes

Oh, that’s interesting.

References themselves are heap terms so we have to chase two pointers to get the atomic, correct? Or would that be three with the refcounting?

My understanding of the reference encoding comes from here, but it seems incomplete. I assume the atomics references take up the same space but have a pointer encoded in them? And I assume they must be tagged somehow?

Are other uses of reference overloaded in this way? E.g. an ETS table ref, or monitor ref?

I should really start reading the source :slight_smile: