Need help understanding memory allocations in this code

dan-cooke · October 10, 2024, 5:34pm

Hi all,

I’m just learning Elixir for fun, and I would love to understand a little more about the memory allocations made by the following code snippet, taken from the Mix and OTP tutorial:

defp loop_acceptor(socket) do
  {:ok, client} = :gen_tcp.accept(socket)
  {:ok, pid} = Task.Supervisor.start_child(KVServer.TaskSupervisor, fn -> serve(client) end)
  :ok = :gen_tcp.controlling_process(client, pid)
  loop_acceptor(socket)
end

Here’s what I am thinking so far:

1. The loop_acceptor function is called and socket is placed onto the stack for this function call.
2. :gen_tcp.accept(socket) is called, which creates a socket resource called client. This is currently owned by the process that called loop_acceptor.
3. Task.Supervisor.start_child is called. The anonymous function creates a closure which captures the serve function and the client.

That’s where I get stuck.

1. Does this closure, which runs in a new process, result in the variables it captures being “copied” into the new process heap/stack?
2. If so… how is it possible to “copy” a resource like a socket without creating multiple OS sockets for the same connection?
3. As the serve function belongs to the surrounding KVServer module, does this mean that the supervised task gets a copy of the KVServer module too?

If anyone has any good documentation for learning more about BEAM memory, or some tips on how I can debug these allocations myself, I would love to hear about it.

I’m very green with BEAM

Any help is greatly appreciated, thanks!

derek-zhou · October 10, 2024, 6:04pm

The authoritative source on BEAM internals is The BEAM Book:

https://blog.stenmans.org/theBeamBook/

It really depends on how deep you want to go. From 10,000 feet, you can consider all values to be allocated on the per-process heap and immutable. So there will be no cyclic references, GC is straightforward, and closures are easy to implement.

Of course there are exceptions, but the native code hides the ugly parts and maintains the happy illusion.
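
For debugging allocations by hand, a minimal sketch along these lines may help. It assumes a plain script or IEx session and uses :erts_debug.size/1 (an undocumented debugging helper that reports a term's size in machine words) together with Process.info/2. The variable names are illustrative only.

```elixir
# Build a term at runtime so that it lives on the current process's heap.
list = Enum.to_list(1..10_000)

# Size of the term in machine words, as reported by the debug helper.
words = :erts_debug.size(list)

# Every process has its own heap; its total size is also reported in words.
{:total_heap_size, parent_heap} = Process.info(self(), :total_heap_size)

# A freshly spawned process gets a separate heap of its own.
child = spawn(fn ->
  receive do
    :stop -> :ok
  end
end)

{:total_heap_size, child_heap} = Process.info(child, :total_heap_size)
send(child, :stop)

IO.inspect(words, label: "list size (words)")
IO.inspect(parent_heap, label: "parent total heap (words)")
IO.inspect(child_heap, label: "child total heap (words)")
```

The exact numbers depend on the runtime and on when garbage collection last ran, so treat them as a rough guide rather than a precise accounting.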

al2o3cr · October 10, 2024, 10:31pm

#1: yes, the closure captures its environment. Rebinding client after the call to Task.Supervisor.start_child won’t change the value the child process sees.
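
The capture behaviour described in #1 can be checked with a small sketch. It uses Task.async instead of the tutorial's Task.Supervisor so that it runs on its own, and the tuple bound to client is a hypothetical stand-in for the real socket.

```elixir
# A stand-in value; in the tutorial this would be the accepted socket.
client = {:fake_client, 1}

# The anonymous function captures the value `client` has at this point.
# The task runs in a separate process and works with that captured value.
task = Task.async(fn -> client end)

# Rebinding only affects the name in this scope, not the captured value.
client = {:fake_client, 2}

seen_by_child = Task.await(task)

IO.inspect(client, label: "parent sees")
IO.inspect(seen_by_child, label: "child saw")
```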

#2: the thing that’s returned from :gen_tcp.accept is usually a “port”. The short, short version is that it’s a way to name the thing that’s actually holding the OS socket, so copying that name into another process does not duplicate the socket itself. Calling controlling_process tells the port that it should send messages etc. to the newly-started task instead of the process that originally called :gen_tcp.accept.
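
The ownership handoff in #2 can be observed with a sketch like the one below. It assumes the default inet backend, where the accepted socket is a port, and it connects to itself over the loopback interface purely for illustration. The spawned process is a hypothetical placeholder for the supervised task.

```elixir
# Listen on an OS-assigned port (0) on the loopback interface.
{:ok, listen} = :gen_tcp.listen(0, [:binary, active: false, ip: {127, 0, 0, 1}])
{:ok, port_number} = :inet.port(listen)

# Connect to ourselves so that accept/1 has a connection to return.
{:ok, peer} = :gen_tcp.connect({127, 0, 0, 1}, port_number, [:binary, active: false])
{:ok, client} = :gen_tcp.accept(listen)

# With the default backend the socket is a small port identifier,
# not the OS socket itself.
IO.inspect(is_port(client), label: "is_port?")

# The port is initially connected to the process that accepted it.
{:connected, owner_before} = Port.info(client, :connected)

# Placeholder for the task that would serve the client.
pid = spawn(fn ->
  receive do
    :stop -> :ok
  end
end)

# Hand the same port over to the new process; no second socket is opened.
:ok = :gen_tcp.controlling_process(client, pid)
{:connected, owner_after} = Port.info(client, :connected)

IO.inspect(owner_before == self(), label: "owned by acceptor before")
IO.inspect(owner_after == pid, label: "owned by child after")

# Clean up: the child exits and the remaining sockets are closed.
send(pid, :stop)
:gen_tcp.close(peer)
:gen_tcp.close(listen)
```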

#3: there’s no state attached to the module KVServer, so asking whether there’s a second “copy” of it isn’t particularly meaningful. There’s some very tricky corner-cases around hot-code reloading that make calling serve(client) different from calling KVServer.serve(client) (even from inside a function defined in KVServer!) but most people will never encounter them.