ag154

ag154

Process design: Excel-like mutually dependent computations

Hello, I am fairly new to the Elixir ecosystem and I would like to know your opinion on what is the best OTP/process architecture for the following problem:

  1. Imagine a typical excel spreadsheet. There are lots of values in cells, some of the cells contain formulas (which may contain references to other cells, possibly also containing formulas). There is A LOT of cells, think hundreds of thousands to millions. Not all of them need to be “visible” at the same time though.

  2. Any value in cell can eventually change, however the frequency of changes spans several orders of magnitude. Most of the cells will update once per year randomly. Some cells will update once per day or hour, some cells are connected to some external data source and update several times per second.

  3. Assume that all cell formulas are representable as pre-defined parametrised functions (we do not care about formula parsing, UI etc.). However these functions may contain tasks of various complexity, for instance:

  • Combine several values from other cells in a simple math function
  • Compute an aggregation of values in a range of thousands of other cells
  • Take some values from other cells, make an external API call, process, output result
  • Dynamically decide on which other cells are needed for computation, compute a part, re-decide, recompute, output

Overall, the “value” in each cell doesn’t have to be a number, any term will do.

In general, the computation in the system is triggered by a change in value of some cell (typically ones connected to external data), which should trigger a re-computation of all cells that depend on it.

All of this seems to nicely fit into the whole process/functional/messaging paradigm, I have read something about actor model, heard something about C# orleans/grains (but don’t really understand it), and so far I have some ideas on how to design this:

Each cell is represented by genserver, that is created on-demand whenever its value is needed. Every cell should implement a function
compute( incoming_event, subscribed_values_map, state ) -> { list_of_actions, new_state }

During computations, the cell returns a list of actions, where each can be either broadcasting its value, broadcasting an event or subscribing to some other cell.

Above this is a linking layer that supervises all the cells, creates them on demand and handles pub/sub distribution of values and events.

The distinction between event and subscribed_values_map is just a convenience for the cell compute implementation. A subscribed value update is basically a :subcribed_value_changed event that would update internal map and recompute.

My main questions are:

  • This seems like a very generic problem that a lot of people must have been solving and maybe I am just missing the right keyword
  • It seems a bit wasteful to have a separate process for each and every cell, particularily ones with mostly-static data.
  • Aggregations of large numbers of cells seem to be hard, that’s why events are there, to support for “item added/updated/removed” messages, so that the aggregation cell can just update the value without need to reaggregate

Thanks in advance for any ideas, keywords and suggestions.

Most Liked

derek-zhou

derek-zhou

Mutually dependent computation leads to chaos. I would make sure the dependency is a tree, with no loops. Computation is then straightforward. To model loops, I would add state holding registers, to postpone some changes to the next round.

Simulation of a complex system is fun; abusing the actor model to have too many actors is not. Start from a single thread program just loop through all tree nodes. If you need parallelism for speed, you can partition the tree later.

al2o3cr

al2o3cr

Performance is certainly a concern, but there’s a bigger problem: how do you read a consistent state out of a spreadsheet where every cell is a separate process? What if some of the cells are doing a long-running operation? It’s possible, with additional machinery, to do this - but it winds up looking like a Paxos knockoff because it is.

I’d recommend focusing on the pure-function part you mention above (compute etc). Then those functions can either be composed together sequentially (iterate over a list of changes to produce new changes, repeat until fixed point) or concurrently (split into communicating processes), but that’s an implementation detail.

Where Next?

Popular in Questions Top

marius95
Hello everyone, I try to use an Javascript Event Handler in my root.html.leex file. Therefore I created a function in the app.js file: ...
New
_russellb
I want to try my hand at web scraping. What tools/libraries do I need to use. I’m hoping to turn this into something professional so don’...
New
qwerescape
Is there a way to get the call stack or stack trace at any point in the code? Not from exceptions, but an expression that returns how the...
New
fireproofsocks
I’m working on defining a simple Ecto schema for a table (in PostGres), but I don’t see where I can define a column as NOT NULL. Conside...
New
tduccuong
Hi, is there any work on GUI with Elixir, that is similar to Electron/Javascript? My idea is to bundle Phoenix and BEAM into a single se...
New
jononomo
I am trying to figure out how Mix knows whether the environment is test, dev, or prod – where is this set? Thanks.
New
dokuzbir
I want to highlight html closing tags when i click a html tag. That works in .html files but doesnt work for html.eex templates. How can...
New
vonH
When I run the Plug and I recompile I wind up having to use Ctrl C to quit iex and start again. Witht the help of rlwrap I can use the cu...
New
ashish173
I am using Ecto timestamps with postgres, I can see the timestamps() use the :naive_dateime but for my use case I wanted to store the ti...
New
PeterCarter
There are pre-rolled solutions for other frameworks that do work. However, Phoenix does not seem to have these. Have people had good expe...
New

Other popular topics Top

siddhant3030
Hi, I have to write a raw query for one of my project. But till now I have used ecto queries and don’t have much experience writing raw ...
New
TunkShif
This post is an instruction guide to help you setup your Neovim for Elixir development from scratch. It includes general information on h...
274 41539 114
New
shahryarjb
Hello, I have map which I want to convert it to string like this: the map: %{last_name: "tavakkoli", name: "shahryar"} the string I ne...
New
stefanchrobot
What’s the safe way to decode a JSON string into a struct? I want to avoid calling String.to_atom. Jason.decode can give me a map with st...
New
jay1
Why is it that the mnesia database isn’t the most preferred database for use in Elixir/Phoenix?
New
stefanluptak
Hello everybody, usually, I use a 29" ultra-wide monitor for VSCode which can easily accomodate explorer (files panel) + file with code ...
New
Emily
I have VueJS GUIs with the project generated using Webpack. I have Elixir modules that will need to be used by the VueJS GUIs. I forese...
New
nsuchy
Hi. I’ve noticed that Windows Powershell has it’s own IEX command and you cannot access Elixir’s IEX due to the conflict. This isn’t a cr...
New
openscript
Hello! Sorry for this astonishing simple question, but I’m really stuck. I try to set up the intellij-elixir plugin, but I don’t know ho...
New
lanycrost
Hi everyone! I need implement if…else if…else condition from my elixir code, and anymore of this control flow structures not work proper...
New

We're in Beta

About us Mission Statement