Recently, I’ve been covering my bases by familiarizing myself with the Erlang ecosystem. While reading “Designing for Scalability”, I came across this line:
spawn_link/3 has the same effect as calling spawn/3 followed by link/1, except that it is executed atomically, eliminating the race condition where a process terminates between the spawn and the link.
Cesarini, Francesco; Vinoski, Steve. Designing for Scalability with Erlang/OTP: Implement Robust, Fault-Tolerant Systems (Kindle Locations 956-958). O’Reilly Media. Kindle Edition.
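To make the race in that quote concrete, here is a minimal sketch in Elixir. The module `LinkRaceDemo` and its function names are hypothetical, made up for illustration. It forces the losing outcome of the race by waiting (via a monitor) until the child is dead before linking, and it assumes the caller traps exits so that exit signals arrive as messages.

```elixir
defmodule LinkRaceDemo do
  # Hypothetical demo module, not part of any library.

  # spawn followed by link: the child is already gone when we link,
  # so the real exit reason (:boom) is lost and we only learn :noproc.
  def spawn_then_link do
    trapping_exits(fn ->
      pid = spawn(fn -> exit(:boom) end)

      # Wait until the child has definitely terminated. This makes the
      # "process terminates between the spawn and the link" case deterministic.
      ref = Process.monitor(pid)

      receive do
        {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
      end

      Process.link(pid)

      receive do
        {:EXIT, ^pid, reason} -> reason
      end
    end)
  end

  # spawn_link: the link exists from the moment the child exists,
  # so the parent always sees the child's real exit reason.
  def spawn_link_atomically do
    trapping_exits(fn ->
      pid = spawn_link(fn -> exit(:boom) end)

      receive do
        {:EXIT, ^pid, reason} -> reason
      end
    end)
  end

  # Turns on trap_exit for the duration of `fun`, then restores the old value.
  defp trapping_exits(fun) do
    previous = Process.flag(:trap_exit, true)

    try do
      fun.()
    after
      Process.flag(:trap_exit, previous)
    end
  end
end

IO.inspect(LinkRaceDemo.spawn_then_link())       # :noproc
IO.inspect(LinkRaceDemo.spawn_link_atomically()) # :boom
```

If the caller does not trap exits, linking to a local process that is already dead raises a `:noproc` error instead of delivering a message.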
Next, I looked through the source code:
I don’t see the “atomic execution”. Could someone explain how that is achieved?
You’ve got the wrong function: spawn_link/4 is for spawning a process on a remote node, so the code you’re seeing with net_kernel and call and so forth is about talking to the remote node to tell it to spawn something.
The spawn_link/1 for doing spawn_link(fn -> do_something() end) is really just the same as spawn_link(:erlang, :apply, [fn -> do_something() end, []]), so what we want is spawn_link/3.
The spawn_link/3 function for doing it mod/fun/args style is here: https://github.com/erlang/otp/blob/master/erts/preloaded/src/erlang.erl#L1749 However, it is implemented natively (a BIF), so you’ll need to dig into the C code if you want to go further.
That code is in erts/emulator/beam/bif.c, which in turn calls into erts/emulator/beam/erl_process.c with the SPO_LINK option to tell it to set up bidirectional links between parent and child. This is the only difference between spawn and spawn_link.
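As a small illustration of the equivalence between the two call styles, the sketch below starts the same anonymous function once through spawn_link/1 and once through spawn_link/3 with :erlang.apply/2. The module name `SpawnForms` and the `{:ran_in, pid}` message are made up for this example.

```elixir
defmodule SpawnForms do
  # Hypothetical demo module: runs the same fun through both call styles.
  def run do
    parent = self()
    work = fn -> send(parent, {:ran_in, self()}) end

    # Anonymous-function style.
    a = spawn_link(work)

    # mod/fun/args style: :erlang.apply(work, []) calls the fun with no arguments.
    b = spawn_link(:erlang, :apply, [work, []])

    # Both children should report back from their own pid.
    for pid <- [a, b] do
      receive do
        {:ran_in, ^pid} -> :ok
      after
        1_000 -> :timeout
      end
    end
  end
end

IO.inspect(SpawnForms.run()) # [:ok, :ok]
```

Both children exit normally here, so the links do not take the caller down.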
There is a reason you actually NEED spawn_link: without it, it is possible for one of the processes to die before the link is set up, and the other process never finds out about it. For example, with a spawn followed by link sequence we could get:
pid = spawn(...)
**<EXIT we crash>**
and the link is never set up. Unlikely, yes, but it will happen.
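The scenario above can be reproduced with a rough sketch like the following. `OrphanDemo` is a hypothetical module written for this post. It runs a throwaway parent that starts a child and then crashes before any link call could run, and it reports whether the child outlived the parent. The 200 ms wait is an arbitrary assumption, chosen only to give the exit signal time to arrive.

```elixir
defmodule OrphanDemo do
  # Hypothetical demo module. `start_child` is either &spawn/1 or &spawn_link/1.
  def run(start_child) do
    caller = self()

    {parent, parent_ref} =
      spawn_monitor(fn ->
        child = start_child.(fn -> Process.sleep(:infinity) end)
        send(caller, {:child, child})
        # Simulated crash: with plain spawn, a later Process.link(child)
        # would never get the chance to run.
        exit(:crash)
      end)

    child =
      receive do
        {:child, pid} -> pid
      end

    # Wait until the parent is really gone.
    receive do
      {:DOWN, ^parent_ref, :process, ^parent, _reason} -> :ok
    end

    # If the child was linked, the parent's exit signal takes it down too.
    child_ref = Process.monitor(child)

    receive do
      {:DOWN, ^child_ref, :process, ^child, _reason} -> :terminated_with_parent
    after
      200 ->
        # Nobody is supervising the child any more; clean it up by hand.
        Process.demonitor(child_ref, [:flush])
        Process.exit(child, :kill)
        :orphaned
    end
  end
end

IO.inspect(OrphanDemo.run(&spawn/1))      # :orphaned
IO.inspect(OrphanDemo.run(&spawn_link/1)) # :terminated_with_parent
```

With spawn_link there is no window in which the parent can die while the child is unlinked, so the orphan case cannot occur.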
We found out about this the hard way!
Good point; this case is also mentioned in Programming Erlang (2nd edition).