JordiPolo

Performant data structure for a table (think 2D array or list of lists)

I am writing a library to deal with dataframes. Dataframes can be thought of as Excel tables. The main data is a 2D table, and any operation can be row-based or column-based.
Operations include filtering by column or row, adding or deleting rows and columns, etc.

Right now, in a very naive way, I’ve implemented the main data as a row-based list of lists. But of course this makes column operations expensive (for instance, adding a new element at a certain position in each list).
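To make the cost concrete, here is a minimal sketch of the row-based layout described above. The module name `RowTable` and its functions are hypothetical, made up for illustration, and are not taken from the library being written.

```elixir
defmodule RowTable do
  # Hypothetical helper module for illustration only.
  # A table is a list of rows; each row is a list of cells.

  # Inserting a column touches every row. List.insert_at/3 walks the row
  # up to `index`, so the whole operation is O(rows * index).
  def insert_column(rows, index, column) do
    rows
    |> Enum.zip(column)
    |> Enum.map(fn {row, value} -> List.insert_at(row, index, value) end)
  end

  # Reading a column also has to walk every row.
  def column(rows, index), do: Enum.map(rows, &Enum.at(&1, index))
end

rows = [[1, 2], [3, 4]]

RowTable.insert_column(rows, 1, [:a, :b])
# => [[1, :a, 2], [3, :b, 4]]

RowTable.column(rows, 1)
# => [2, 4]
```

Row operations, by contrast, only touch the outer list, which is why the layout favours them.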

If I switch to a column-based list of lists, then row operations become expensive.
Tuples would not work, I believe, because they may get huge and there may be lots of modifications,
and maps are not ordered, and I need order.

Any advice? Are there any performance comparisons of the Elixir data structures out there?

17 comments

#performance

2016-10-31 14:38:38 UTC

Most Liked

Qqwy

TypeCheck Core Team

I think you might want to look at the Tensor library that I made a while back, which allows for vectors, matrices and any higher-order tensors as well. It works with sparse maps, making it more efficient when it is only partially filled. It probably provides the functionality you want (a multitude of different ways of filtering, adding/removing rows or columns, flipping or rotating, fast access to a single element, row or column, easy ways to sort rows or columns, etc.).

It is of course also available on Hex.pm.

Post #12

NobbZ

I think extracting a single row or column is most efficiently done with Enum.filter/2. That makes extracting a single row or column O(rows * columns), though.
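A rough sketch of what that could look like, assuming the cells are kept as a flat list of `{row, col, value}` tuples. That representation is an assumption made here for illustration; the post does not say which layout it has in mind.

```elixir
# Assumption: the table is a flat list of {row, col, value} tuples.
cells = [
  {0, 0, "a"}, {0, 1, "b"},
  {1, 0, "c"}, {1, 1, "d"}
]

# Every cell is inspected once, so this is O(rows * columns).
column_1 = Enum.filter(cells, fn {_row, col, _value} -> col == 1 end)
# => [{0, 1, "b"}, {1, 1, "d"}]

row_0 = Enum.filter(cells, fn {row, _col, _value} -> row == 0 end)
# => [{0, 0, "a"}, {0, 1, "b"}]
```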

If you really need O(1) random access to a complete row or column, I think a doubled data structure is the only way to go. It does double the amount of memory needed to hold the structure, though, and it makes writes expensive since they have to be done twice.

Basically you need two nested maps: %{row_id => %{col_id => data_sample}} and %{col_id => %{row_id => data_sample}}.
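The two-map layout could be sketched roughly like this. The module name `DoubleIndex` and its function names are hypothetical and only illustrate the idea; they do not come from an existing library.

```elixir
defmodule DoubleIndex do
  # Hypothetical module: keeps the same cells indexed by row and by column.
  def new, do: %{by_row: %{}, by_col: %{}}

  # Every write updates both indexes, which is the extra write cost
  # mentioned above.
  def put(%{by_row: by_row, by_col: by_col}, row, col, value) do
    %{
      by_row: Map.update(by_row, row, %{col => value}, &Map.put(&1, col, value)),
      by_col: Map.update(by_col, col, %{row => value}, &Map.put(&1, row, value))
    }
  end

  # Fetching a whole row or column is a single lookup in the outer map.
  def row(%{by_row: by_row}, row), do: Map.get(by_row, row, %{})
  def column(%{by_col: by_col}, col), do: Map.get(by_col, col, %{})
end

table =
  DoubleIndex.new()
  |> DoubleIndex.put(0, 0, "a")
  |> DoubleIndex.put(0, 1, "b")
  |> DoubleIndex.put(1, 1, "d")

DoubleIndex.row(table, 0)
# => %{0 => "a", 1 => "b"}

DoubleIndex.column(table, 1)
# => %{0 => "b", 1 => "d"}
```

Since maps do not guarantee key order, a caller that needs the cells in order would still have to sort the returned map by its keys, for example with `Enum.sort/1`.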

Post #4

benwilson512

Author of Craft GraphQL APIs in Elixir with Absinthe

Yeah the duplication is definitely a possibility. Fortunately this is less duplication than it may look at first glance because cell values are just pointed to by each map location, not actually duplicated.

:ets could be another possibility because it does support a {_, 1} style match operation.
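As a rough illustration of the :ets idea, assuming each cell is stored as `{{row, col}, value}`. The key layout and the table name are assumptions made for this sketch and are not prescribed in the thread.

```elixir
# Assumption: one ETS object per cell, keyed by {row, col}.
table = :ets.new(:cells, [:set])

:ets.insert(table, [
  {{0, 0}, "a"}, {{0, 1}, "b"},
  {{1, 0}, "c"}, {{1, 1}, "d"}
])

# Match every object whose key has column 1; :_ is the ETS wildcard.
# A :set table returns matches in no particular order, so sort them.
column_1 =
  table
  |> :ets.match_object({{:_, 1}, :_})
  |> Enum.sort()
# => [{{0, 1}, "b"}, {{1, 1}, "d"}]

# A single cell is a direct key lookup.
:ets.lookup(table, {1, 0})
# => [{{1, 0}, "c"}]
```

Note that with a `:set` table a pattern that leaves part of the key unbound still scans the whole table, so this is convenient rather than O(1).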

FWIW even traditional mutable 2D arrays suffer from this problem. If you make row access primary, then column access requires large jumps in memory, which eats away at any cache locality you may have gotten with row access.

Post #5
