Elixir processes and no shared heap memory

Elixir processes have their own heap. If a process wants to share a data structure with another process, how could that be possible? One answer that comes to my mind is that the process sends a message to the other process containing the data structure. Does that mean that the entire data structure is copied from one heap to the other? And if this is true, isn’t it inefficient?

Yes and yes, but the copying gives you isolation guarantees, which is why each process can be garbage collected on its own instead of requiring a stop-the-world GC.
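To make the "send it as a message" idea concrete, here is a minimal sketch. The `data` map and the message tags (`:data`, `:got`) are made up for illustration. The receiver works on its own copy of the term, which lives on its own heap.

```elixir
# Hypothetical data structure to hand over to another process.
data = %{id: 1, tags: ["a", "b"], payload: Enum.to_list(1..1_000)}

parent = self()

# The receiver waits for the data, then reports the payload length back.
receiver =
  spawn(fn ->
    receive do
      {:data, received} ->
        send(parent, {:got, length(received.payload)})
    end
  end)

# The term is copied into the receiver's heap as part of the send.
send(receiver, {:data, data})

payload_length =
  receive do
    {:got, n} -> n
  after
    1_000 -> :timeout
  end

IO.inspect(payload_length)
```

The sender keeps using `data` afterwards without any locking, because nothing the receiver does can touch the sender's copy.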


There are trade-offs either way. The disadvantage of individual heaps is, as you note, that you have to copy.

The advantages, however, are numerous. When a process exits, all of its heap can be trivially cleaned up without jeopardizing the heap of any other process. This is useful (and fast) for intentional exits, and even more useful for resiliency in the face of unexpected errors.

As mentioned, isolated heaps mean that each process can be garbage collected independently. Most processes end up with very small heaps, which can often fit entirely within the CPU cache, so while that process is running all of its memory is readily available and very fast to access.
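A rough sketch of what that looks like from the outside, assuming a throwaway worker process and made-up message names (`:ready`, `:stop`). It reads the worker's memory while it is alive, garbage collects just that one process, and then checks that nothing is left to inspect once it has exited.

```elixir
parent = self()

# Worker builds a list on its own heap, then waits to be told to stop.
pid =
  spawn(fn ->
    list = Enum.to_list(1..100_000)
    send(parent, :ready)

    receive do
      :stop -> length(list)
    end
  end)

# Wait until the worker has built its list.
receive do
  :ready -> :ok
after
  1_000 -> :timeout
end

# Memory (in bytes) used by this one process, including its heap.
{:memory, bytes_while_alive} = Process.info(pid, :memory)

# Garbage collect only this process; other processes are not paused.
:erlang.garbage_collect(pid)

# Let the worker exit and wait for confirmation.
ref = Process.monitor(pid)
send(pid, :stop)

receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
after
  1_000 -> :timeout
end

# A dead process has no heap left to report on.
info_after_exit = Process.info(pid, :memory)

IO.inspect({bytes_while_alive, info_after_exit})
```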


If a process wants to share a data structure with another process, how could that be possible?

There are cases when sharing data between processes is exactly what you need. For those you can use ETS (Erlang's ets module).

Reading from and writing to ETS still copies the whole data structures involved.
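For reference, a minimal sketch of sharing through ETS. The table name `:shared_demo` and the `:config` entry are hypothetical. One process writes, another reads, and the reader gets its own copy of the stored term.

```elixir
# A public table can be read and written by any process that has the table id.
table = :ets.new(:shared_demo, [:set, :public, read_concurrency: true])

# Inserting copies the term from the caller's heap into the table.
:ets.insert(table, {:config, %{retries: 3, hosts: ["a.example", "b.example"]}})

parent = self()

# Another process looks the entry up; the lookup copies it onto that process's heap.
spawn(fn ->
  [{:config, config}] = :ets.lookup(table, :config)
  send(parent, {:read, config.retries})
end)

retries =
  receive do
    {:read, n} -> n
  after
    1_000 -> :timeout
  end

IO.inspect(retries)
```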

Just to add a little bit to what’s already been said: the cost of copying is a very transparent cost, as opposed to a more subtle one. Although it may be more expensive, you at least know up front that the cost exists, which makes it easier to acknowledge and reason about. You might be able to identify places where you are sending large data structures unnecessarily, when the receiving process only needs a small piece, and work around this in an approachable way. Costs that arise from garbage collecting a massive heap can be a lot trickier to deal with.
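As a small illustration of "send only the piece that is needed", here is a sketch with a made-up `big` map. It uses `:erts_debug.flat_size/1`, an undocumented debugging helper that reports a term's size in words, only to give a rough idea of how much would be copied in each case.

```elixir
# Hypothetical large term held by the current process.
big = %{id: 7, email: "someone@example.com", history: Enum.to_list(1..100_000)}

parent = self()

# Costly: the closure captures `big`, so the whole map is copied to the new process.
spawn(fn -> send(parent, {:email_from_whole, big.email}) end)

# Cheaper: pull out the needed field first, so only that field is captured and copied.
email = big.email
spawn(fn -> send(parent, {:email_from_field, email}) end)

received =
  for tag <- [:email_from_whole, :email_from_field] do
    receive do
      {^tag, value} -> value
    after
      1_000 -> :timeout
    end
  end

# Rough sizes in words of what each closure has to carry along.
whole_words = :erts_debug.flat_size(big)
field_words = :erts_debug.flat_size(email)

IO.inspect({received, whole_words, field_words})
```

Both versions produce the same result; the difference is only in how much data crosses the process boundary.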


Reading from and writing to ETS still copies the whole data structures involved.

That depends. If the structure is “structured enough” to have uniform parts under separate keys, then these parts can be written and read separately, possibly by different processes.

I can’t imagine how this should work, since A reads the data, B alters it, then A starts to process the data, or was even halfway through before B altered it… How is this supposed to work without copying?

Anyway, regardless of optimisations that may only apply under special conditions, it is safe to assume “always copy”. Under this assumption there will be no bad surprises when you start communicating with processes on other nodes.


I can’t imagine how this should work, since A reads the data, B alters it, then A starts to process the data, or was even halfway through before B altered it… How is this supposed to work without copying?

If the data structure is, for example, a list of structs, each of those can have a separate entry in an ets table and be dealt with separately, so only the needed pieces of data are actually copied around. The whole thing can be tuned by setting read / write concurrency or, if needed, by serializing access via a single process.
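A minimal sketch of that layout, with plain maps standing in for the structs and a hypothetical table name `:users_demo`. Each element of the list gets its own row keyed by id, so a reader copies out one row instead of the whole collection.

```elixir
# Hypothetical collection: plain maps standing in for a list of structs.
users =
  for id <- 1..1_000 do
    %{id: id, name: "user#{id}", bio: String.duplicate("x", 100)}
  end

table =
  :ets.new(:users_demo, [
    :set,
    :public,
    read_concurrency: true,
    write_concurrency: true
  ])

# One row per element, keyed by id.
:ets.insert(table, Enum.map(users, fn user -> {user.id, user} end))

# Only the row for id 42 is copied onto the caller's heap.
[{42, user}] = :ets.lookup(table, 42)

# Updating one row does not touch the others.
:ets.insert(table, {42, %{user | name: "renamed"}})
[{42, updated}] = :ets.lookup(table, 42)

IO.inspect({user.name, updated.name, :ets.info(table, :size)})
```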

Here’s a short read on the topic, if you’re interested: Yariv's Blog: Erlang does have shared memory

Thanks for the read, I hope I’ll get some time to read through.

For the record, you can have a shared structure with no copying, but it has to be static at compile time: you bake it into the module source, for example directly into a function body. There are even a couple of libraries that do this compilation at runtime, which gives you a store that is slow to update but super fast to read.
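A rough sketch of both variants. All module names here (`BakedDemo.Static`, `BakedDemo.Baker`, `BakedDemo.Runtime`) are made up, and the runtime part is only an illustration built on `Module.create/3`, not how any particular library does it. The idea, as far as I know, is that a literal in loaded code lives with the module's constants, so returning it from a function does not copy it onto the caller's heap.

```elixir
defmodule BakedDemo.Static do
  # Hypothetical static data, fixed at compile time.
  @lookup %{"de" => "Germany", "fr" => "France"}

  # Every process calling this reads the same baked-in literal.
  def lookup, do: @lookup
end

defmodule BakedDemo.Baker do
  # Compiles a new module at runtime whose get/0 returns `term` as a literal.
  # Slow to update (a full module compile), fast to read afterwards.
  def bake(module, term) do
    contents =
      quote do
        def get, do: unquote(Macro.escape(term))
      end

    {:module, ^module, _binary, _result} =
      Module.create(module, contents, Macro.Env.location(__ENV__))

    :ok
  end
end

:ok =
  BakedDemo.Baker.bake(BakedDemo.Runtime, %{
    "de" => "Germany",
    "fr" => "France",
    "it" => "Italy"
  })

static_country = BakedDemo.Static.lookup()["de"]
runtime_country = BakedDemo.Runtime.get()["it"]

IO.inspect({static_country, runtime_country})
```

Calling `bake/2` again with the same module name replaces the module (with a redefinition warning), which is the "slow update" half of the trade-off.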


But the objects still get copied to and from ets, don’t they?

From the article

Objects are copied when inserted into and looked-up from ets tables.

Yes they do.

Yes. If your data structure consists of many objects, you can use ets to copy only the ones that you need, which covers most cases with really big data structures.

If the structure only translates to an ets table with a single entry, the whole approach would not make much sense.
