PersistentVector - An array-like collection of values indexed by contiguous 0-based integer index

Persistent Vector for Elixir

PersistentVector is an array-like collection of values indexed by contiguous 0-based integer index.

PersistentVector optimizes the following operations:

  • Get element count
  • Lookup element by index
  • Update element by index
  • Adding new element to the end
  • Removing element from the end
  • Enumeration

Get count operation is O(1), most others are O(log32(N)).

PersistentVector is implemented as a trie with 32-way branching at each level and uses structural sharing for updates.
All ideas are borrowed directly from Clojure, yet the implementation (and all the bugs) are my own.

Benchmarks

Perf-wise, the sweet spot for PersistentVector is scenarios when vector needs to be built by repeatedly appending to the end AND random-access (get/set operations) are also used.

If random-access after building is not required, then building and reversing a List is more efficient.

If building speed is not important, but removal from the end happens often, then Erlang’s :array shows better performance.

get/set operations perform similar to :array:

  • PersistentVector.get/2 is slightly faster for larger collections compared to :array.
  • PersistentVector.set/3 is slightly faster for smaller collections compared to :array.

More info

See Full Documentation

See Benchmarks

8 Likes

Hey, this looks cool! Have you had a chance to benchmark it against a simple map where the keys are indices? Maps are also a trie underneath, I’d be curious to see how it does.

Hi, I’ve added perf comparison to Map with the following qualification:

Map is added only for a baseline. In a sense that if Map was to outperform PersistentVector then this library would not be needed.
This comparison is not fair to Map as it has much richer capabilities.
The fact that Map performs worse for bigger collections is not surprising and is not Map’s fault :-).

Short version is: PersistentVector outperforms Map for larger collections for most operations. And of course it has smaller memory footprint.

Full resutls at the same location: Benchmarks.

1 Like

Huh, that’s impressive considering map’s trie is implemented in C. :slight_smile:

1 Like

Thanks. As often happens, algorithmic efficiency outweighs low-level optimizations.

Another example: I’ve re-implemented Enumerable in v0.1.2, and now it’s 1.9 times faster than Map (it used to be 1.7 times slower). This is a 3x speed up. My previous attempt at low-level Enumerable optimizations gave me less than 10% improvement.

3 Likes