What is “distribution”?
That is an important question, because across this sprawling thread there are so many different uses of that term.
Does it mean to spread code around a bunch of VMs that are able to address each other? Is it a means of explicitly defining communication paths between instances of an application? Is it about application models? (As in: what makes an “application”; where are the dividing lines … which is not as trivial a question as it may seem at first …) Is it distributing individual computations / computational units across multiple systems? Is it doing that without configuration / elastically? Is it about sharding and replicating data so that it is later retrievable from arbitrary other “locations” in the application? …
These are not the same things, and as such they need different sets of solutions. Kubernetes, riak-core, Raft (p.s. check out CASPaxos if you are into Raft) … they are (potentially) complementary solutions that address very different parts of “distribution”.
When people come together to talk about a topic but use the same word in different ways, it is unrealistic to expect useful results (as in: something that resembles an actionable vision / plan) to emerge.
Probably requires different solutions for different scales, and certainly different aspects of distributed computing. I would worry less about winning the race upwards, and think more about compatibility between the various solutions: the interfaces and protocols that may bind them together when used to a single purpose.
In some cases, yes. In others, absolutely not. For the “yes” cases, it is worth developing and making easy to accomplish. It should also be easy to define another application topology (which is what this question is about) without moving to an entirely foreign(-feeling) solution.
The real questions
Between nodes? Not really. term_to_binary remains a horrible bottleneck. I am running ERTS 20.0 and Elixir 1.6.6 and we continue to have messages that nodes simply can not serialize (!) because of the inherent inefficiencies in the external term format. I actually came to the forum this morning to post about something I’ve been working on that we will be using at work to get around this specific issue.
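To make the serialization cost concrete: the external term format tags every term individually, so structurally rich data balloons relative to its raw payload. A minimal sketch of the effect (byte counts are approximate and depend on the ERTS version):

```elixir
# The same 100 KB of data, encoded two ways by the external term format.
payload = :binary.copy(<<0>>, 100_000)

# BINARY_EXT: a few bytes of framing plus the raw data -- roughly 100 KB.
as_binary = :erlang.term_to_binary(payload)

# LIST_EXT: a per-element type tag on top of every byte -- roughly 200 KB.
as_list = :erlang.term_to_binary(:binary.bin_to_list(payload))

IO.inspect({byte_size(as_binary), byte_size(as_list)})
```

Nothing here is specific to my workload; it is just the per-term tagging cost that any structured message pays, and it is paid again on every hop between nodes.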
These issues still don’t address “distribution” … they do, importantly, make it more possible to build those solutions.
IMHO, the minimum working start point is a system that provides:
- A coherent, built-in, and common-tasks-easy/uncommon-needs-possible deployment solution with distribution in mind; something that can work next to docker containers, but which is not one (because we already have something much better with releases; docker is for applications that were built for the OS-centric model)
- A distributed configuration system that is not based on file-system artifacts extant on the nodes of a cluster
- “Zero”-configuration cluster building that works with gossip, Kubernetes, AWS, etc. There are a few Elixir/Erlang implementations of such things, but in the end the cluster configuration can not realistically be shipped with the application, even as configuration
- Service discovery within and across clusters
- An implementation of a performant, reliable, and easy-to-use consensus system (I recommend CASPaxos, for which there is the start of an Elixir implementation; the author seems to not have the time to continue it, as my 4 pull requests have sat for 2 months now. We exchanged emails and he just seems exceedingly busy …)
- An efficient message-passing system, which probably means one approach for small messages and another (side-channel?) for more complex/large structured data
- A definition of, though not necessarily an implementation of, how application units [can|should] be arranged. “Microservices”, “serverless” (winner of “most ridiculous name in computing”), and “cloud computing” are so poorly defined in the first place, and do not map cleanly to how the BEAM, or applications written on it, can be (should be? ought to be? are?) implemented and distributed
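On the “zero”-configuration clustering point: libcluster is one of the community implementations I alluded to; it discovers peers via pluggable strategies (gossip, Kubernetes, and others). A sketch of what its configuration looks like, with illustrative topology names and selector values (not from a real deployment):

```elixir
# config/config.exs -- illustrative libcluster topologies.
use Mix.Config

config :libcluster,
  topologies: [
    # Multicast gossip on a local network; nodes find each other with
    # no per-node configuration.
    local_gossip: [
      strategy: Cluster.Strategy.Gossip
    ],
    # Kubernetes: discover peer pods by label selector. The selector and
    # node basename below are placeholders.
    k8s: [
      strategy: Cluster.Strategy.Kubernetes,
      config: [
        kubernetes_selector: "app=myapp",
        kubernetes_node_basename: "myapp"
      ]
    ]
  ]
```

Note that this still ships discovery configuration inside the release, which is exactly the limitation above: the strategy can live with the application, but the cluster definition itself should not have to.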
Without that, it is really hard to get started on the (admittedly perhaps more interesting) higher-level parts such as data storage and retrieval on distributed systems …