Question about spawn/1

mattfara50 · June 18, 2022, 2:29pm

I’m reading through Elixir in Action. The chapter on concurrency primitives mentions that when you call send/2, the second argument, a term, gets placed into the receiving process’s mailbox. So here’s the question: when you run spawn/1, a new process is created, and the argument is a 0-arity lambda. Is the lambda term also placed into the new process’s mailbox, like a startup message?

Secondarily, how important is understanding these low-level details for the day-to-day programming of Elixir? I’m just starting up, so I’d rather be useful before I’m knowledgeable, if that’s possible.

lud · June 18, 2022, 2:48pm

I guess it depends on the lambda. If it closes over variables ( fn -> do_stuff(some_var) end ), then those values are copied into the new process. If the lambda is like &Module.fun/0, then it will not be copied because the new process can point directly to the code. If the lambda does not close over anything ( fn -> do_stuff() end ), then I don’t know.

I think you must know how processes and messages work at the code level. The VM level (that is, what is copied and what is not) is not so important for day-to-day programming. It becomes more important when you want to optimize (typically by avoiding copying too much data between processes when sending messages, spawning, or providing data to supervisor child specs).

Oh, and welcome to Elixir!

Edit: sorry, I did not understand your question correctly. Some data is copied into the new process’s memory space, but the lambda itself is not sent as a message in any case. The lambda represents the code that the process will run. If the lambda were sent as a message, what code would run to receive it? (It could in principle be both executed and sent, but that is not the case.)

Regarding copying data between processes, the general rule of thumb is to send references to things (file names, IDs, names) instead of the data itself. But do not think about it too much: copying is fine when you need it, or when it makes the code much simpler.

amarandon · June 18, 2022, 2:50pm

Is the lambda term also placed into the new process’s mailbox, like a startup message?

Let’s find out:

iex(12)> pid = spawn fn -> :timer.sleep(10_000); IO.puts("Messages I've got so far:"); flush(); end
#PID<0.130.0>
iex(13)> send pid, :hello
:hello
Messages I've got so far:
:hello

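So the answer is no: the only message in the mailbox is the :hello we sent, not the lambda.
flush is an IEx helper that prints and empties the mailbox of the current process, which makes it handy for checking what a process has received.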
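
The same check can be done outside IEx, where the flush helper is not available. This is a minimal sketch: the message tag :mailbox_at_start and the variable names are made up for illustration. It asks the new process to inspect its own mailbox with Process.info/2 before anything has been sent to it, and to report the result back to the parent.

```elixir
# The parent's pid must be captured here; inside the lambda, self() is the new process.
parent = self()

spawn(fn ->
  # Process.info/2 with :messages returns {:messages, list_of_pending_messages}
  {:messages, at_start} = Process.info(self(), :messages)
  send(parent, {:mailbox_at_start, at_start})
end)

mailbox_at_start =
  receive do
    {:mailbox_at_start, msgs} -> msgs
  after
    1_000 -> :timeout
  end

IO.inspect(mailbox_at_start, label: "mailbox at start")
# prints: mailbox at start: []
```

The mailbox is empty when the lambda starts running, which matches the result of the flush experiment.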

how important is understanding these low-level details for the day-to-day programming of Elixir?

If you’re going to do any concurrent programming with Elixir, it’s very important to go through these details to build a solid mental model of how processes work in Elixir. If you just want to build a traditional webapp with Phoenix, you can probably ignore them for now.

caleb-bb · June 18, 2022, 3:47pm

No, the lambda does not go into the new process’s mailbox. It is, rather, executed right away. The purpose of the mailbox is communication between processes. The purpose of the lambda is to define what the process created by spawn/1 actually does. That lambda takes no arguments, so the way you make it dynamic is by writing a function that builds the lambda when it runs and passes that lambda to spawn/1. In Elixir in Action, they define async_query/1 immediately after they introduce spawn/1 (pg. 137 of my copy). async_query/1 takes an argument that winds up being part of the lambda definition, which is how you get a 0-arity function that is nonetheless dynamic (because it closes over that argument).
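
Here is a rough sketch of that idea. It is not the book’s exact code: the QueryDemo module, the fake run_query/1, and the {:query_result, ...} message are assumptions made for illustration. The point is that async_query/1 builds a 0-arity lambda that closes over its argument, so each spawned process runs with a different value.

```elixir
defmodule QueryDemo do
  # Hypothetical stand-in for a slow query
  def run_query(query_def) do
    Process.sleep(100)
    "#{query_def} result"
  end

  # Builds a 0-arity lambda that closes over query_def and caller, then spawns it.
  # caller is captured outside the lambda because self() inside it is the new process.
  def async_query(query_def) do
    caller = self()
    spawn(fn -> send(caller, {:query_result, run_query(query_def)}) end)
  end
end

Enum.each(1..3, fn n -> QueryDemo.async_query("query #{n}") end)

# Collect one reply per spawned process; arrival order is not guaranteed.
results =
  for _ <- 1..3 do
    receive do
      {:query_result, result} -> result
    after
      1_000 -> :timeout
    end
  end

IO.inspect(Enum.sort(results))
# prints: ["query 1 result", "query 2 result", "query 3 result"]
```

The values query_def and caller reach the new process as part of the closure, not through its mailbox. Only the reply travels as a message.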

Think of it like this: you need some kind of mutable state in order to really do anything with a computer program. But shared mutable state is a huge source of problems. OOP deals with this by encapsulating state in objects. Functional languages (like Elixir) tend to have immutable data structures, which is nice and clean and mathematical. But when the time comes to do things in the real world, you need mutable state. The purpose of having things like GenServers and message passing is to give you a way to deal with mutable state while still remaining functional; instead of holding state in an object, you hold it in a GenServer process. It’s actually kind of similar to OOP in that respect.
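
To make that concrete, here is a minimal sketch of state held in a GenServer process. The Counter module and its function names are hypothetical. The callbacks never mutate anything; each one returns the next state, and the process loop carries it forward.

```elixir
defmodule Counter do
  use GenServer

  # Client API: these run in the caller's process and only send messages
  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)
  def increment(pid), do: GenServer.cast(pid, :increment)
  def value(pid), do: GenServer.call(pid, :value)

  # Server callbacks: these run inside the GenServer process
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}

  @impl true
  def handle_call(:value, _from, count), do: {:reply, count, count}
end

{:ok, counter} = Counter.start_link(0)
Counter.increment(counter)
Counter.increment(counter)
final = Counter.value(counter)
IO.inspect(final, label: "count")
# prints: count: 2
```

The pid plays roughly the role of an object reference here, and the client functions play the role of methods.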

Do you need it for a simple LiveView webapp? Not at the beginning, no. As time goes by, though, it will help to know the underlying OTP voodoo because that’s what LiveView is built on. It will pay off because you’ll be able to figure things out more quickly as a result of understanding the underlying theory.

If, on the other hand, you’re doing stuff that actually requires knowledge of concurrency, e.g. an IoT app that processes a gazillion messages per second, then yeah, knowledge of concurrency fundamentals is important.