tl;dr: I challenge you: let us implement containers for Elixir, and use this topic to coordinate the effort.
Hello, dear fellow Elixir developers.
Recently, multiple discussions (like this and this) have popped up about ‘Elixir not having a built-in data type XXX’.
While for some part we can rely on things that are already part of Erlang (such as :queue
or :array
), there are of course drawbacks to using Erlang modules:
- No Protocol support.
- No documentation (that is part of IEx).
- No introspection, as they do not work on structs (structs are, after all, an Elixir invention).
- Most of these libraries were made before Maps were a thing (Maps were introduced in OTP 17), and using maps we might create more efficient data structures (in space/time) in some cases.
On the other hand, the Elixir standard library only contains the container types Lists, Maps, and MapSets (and Range, but it is clearly the ‘odd one out’). It definitely is not the role of the Elixir standard library to provide built-in datatypes for everything, however. That is something where FOSS libraries can shine.
I myself have until now built Tensor, which introduces sparse vectors, matrices and n-dimensional tensors (collections that allow for amortized constant-time element access/update as they are built on top of maps).
I am currently in the process of creating Okasaki, which is a library containing multiple different versions of Queues and Deques (double-ended queues) with different time/memory properties.
One thing I realized, which is partly why I am starting this topic, is that the Enum and the Enumerable protocol will throw away the structure of the container you had, always returning lists as result.
I therefore am going to implement a common interface, based on the actual properties that are required to perform certain operations.
I challenge you to start implementing container data structures and release them as FOSS libraries. I think it would be wonderful to start a semi-coordinated effort through this forum, to ensure that Elixir gets a nice zoo of data structures that can be used to solve a wide variety of problems.
Suggested Container datatypes (And what libraries are working on them)
(feel free to suggest one)
- Trees: (Many other container types can be built on top of trees)
- Binary trees
- Red-Black trees
- Rose trees
- 2-3 finger trees
- Patricia trees
- Queues: Okasaki tries to implement multiple variants of these.
- FIFO Queues
-
LIFO Stacks (List covers this somewhat, but maybe there are other ways to implement stacks with different guarantees as well? (And then have a unified interface for any kind of stack-like implementation?
)
- Deques (Double-Ended Queues)
- Priority Queues Prioqueue
- Heaps
- Sets: Implemented in Sets
- HashSet (Builtin)
- Tree-based Set
- Multisets and Bags
- Fast Random-Access Sequences
-
Arrays implements some of these:
- MapArray
-
:array
(heap-based array)
- Something akin to Haskell’s Data.Sequence that allows O(1) access to both extremes of the container.
- Immutable arrays (á la Haskell IArray? or Erlang’s
:array
?) - Someting working with DiffArrays?
-
Arrays implements some of these:
- Sparse Vectors/Matrices/Tensors: Tensor
-
Graphs
- Adjacency list
- Adjacency matrix
- Incidency matrix.