Where can I find an explanation of closures on the BEAM?

In several languages closures capture what’s in scope when they’re defined and then see changes to those variables. This is a simple example in Ruby, but it works the same way in JavaScript, Go and others:

# this is Ruby! Not Elixir!
a = "hello"
f = proc { puts a * 2 }

f.()
# prints hellohello

a = "nope"
f.()
# prints nopenope

In Elixir, closures capture what’s in scope when they’re defined but they don’t see changes to those variables. For example:

# this is Elixir
a = "hello"
f = fn -> IO.puts(a <> a) end

f.()
# prints hellohello

a = "nope"
f.()
# prints hellohello

Now, I know that a = "nope":

  • is a pattern match operation, not assignment
  • rebinds the variable rather than modifying the value itself

And I think I can make sense of the behaviour.

I’m wondering how it works under the hood though.

I don’t think that it has much to do with the fact that data types are immutable. Some Ruby objects are immutable too, and a = "nope" in Ruby and other imperative languages doesn’t modify anything, it just assigns a new value to the variable. So I think this has more to do with scoping rules than with immutability. Or possibly with how values vs references are handled.

I know that in Erlang variable rebinding is not possible, so I’m guessing that this is the BEAM not allowing a fn to see the new value of the variable. Possibly because of how Elixir implements rebinding of variables.

If anyone could explain how this works and point me to some documentation it would be great.

Thanks!

p. 59: Elixir in Action 2e:

A closure always captures a specific memory location.

So the function’s a is bound to the initial memory location. When you change the binding of the outside a to a new memory location, the function’s a is still bound to the initial memory location. And the value at that location isn’t going to change, due to immutability.


That Elixir code is slightly misleading because the BEAM VM has no idea the two a bindings share a name. To the VM it looks like this:

# this is Elixir
x = "hello"
f = fn -> IO.puts(x <> x) end

f.()
# prints hellohello

y = "nope"
f.()
# prints hellohello

I compiled this Elixir code

# this is Elixir code
  def test do
    a = "hello"
    f = fn -> IO.puts(a <> a) end

    f.()
    # prints hellohello

    a = "nope"
    f.()
    # prints hellohello
  end

and extracted the abstract Erlang code from the .beam file and pretty-printed it:

% This is Erlang code reconstructed from the compiled abstract code
test() ->
    _a@1 = <<"hello">>,
    _f@1 = fun () ->
                   'Elixir.IO':puts(<<_a@1/binary, _a@1/binary>>)
           end,
    _f@1(),
    _a@2 = <<"nope">>,
    _f@1().

As you can see, the Elixir variable a is compiled into two different variables: _a@1 and _a@2.
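
A rough way to check this from the Elixir side: `:erlang.fun_info/2` with the `:env` item returns the environment a fun carries around with it. It is meant for debugging, and the exact contents of the environment are an implementation detail, so treat this as an illustrative sketch rather than something to rely on. The module `ClosureEnv` and the function `make/1` are made up for this example; the value is passed in as an argument so that the compiler cannot simply fold the literal into the fun.

```elixir
defmodule ClosureEnv do
  # Hypothetical helper: builds a closure over its argument `a`.
  def make(a) do
    fn -> a <> a end
  end
end

f = ClosureEnv.make("hello")

# The captured value travels with the fun itself, not with the name `a`.
{:env, captured} = :erlang.fun_info(f, :env)
IO.inspect(captured)

# The closure only ever uses what it captured.
IO.puts(f.())
```

On a typical setup the inspected environment should contain "hello", and the last line prints hellohello.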


Elixir doesn’t have variables, it has bindings, maybe that’s what you meant? Variables are mutable, bindings are not. :slight_smile:

In the backend it’s basically transforming this:

a = 1
fn -> a end
a = 2
fn -> a end
a = 3

Into this:

a@0 = 1
fn -> a@0 end
a@1 = 2
fn -> a@1 end
a@2 = 3

I.E. each ‘rebinding’ gets a new and unique name. You can see this by decompiling elixir output. :slight_smile:
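
Here is the same idea as a small runnable sketch (the names `first` and `second` are just for illustration): two closures created around a rebinding each keep the value that was bound when they were created.

```elixir
a = 1
first = fn -> a end

# Rebinding creates a new binding; `first` still refers to the old one.
a = 2
second = fn -> a end

# A third rebinding affects neither closure.
a = 3

IO.inspect({first.(), second.(), a})
# {1, 2, 3}
```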

Oh hey, yeah, exactly what @alco shows here (his is in Erlang syntax because it’s decompiled, but similar enough)!


To extend a bit on what @alco and @OvermindDL1 have said with some more internal detail.

So, as has already been pointed out, Erlang/Elixir (and many other functional languages) don’t really have variables in the classic imperative/OO sense. A classic variable is basically a hole, and the name is a reference to that hole, so reassigning a variable fills the hole with a new value which will be seen by everyone. A functional variable is just a reference to data.

Also, in Erlang/Elixir all data is passed by value, so when you call a function you are logically passing the value into the function, not a reference to the data. [*] The same happens when you create a closure: all data the closure references through variables and arguments is “copied” into the closure, and the code references that data.

[*] While this may sound inefficient with lots of copying of data it is only logically passing the data by value. As all data, and I mean ALL data, is immutable I can get the same effect as passing by value by passing in references. The data cannot be changed.
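
A small sketch of why the two are indistinguishable in practice (the map and the names here are invented for the example): “updating” a map returns a new map, so a closure that captured the original keeps seeing the original, whether the VM copied it or shared a reference.

```elixir
config = %{retries: 1}
read = fn -> config.retries end

# Map.put/3 does not modify `config`; it returns a new map.
updated = Map.put(config, :retries, 5)

IO.inspect(read.())          # 1
IO.inspect(updated.retries)  # 5
IO.inspect(config.retries)   # 1
```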

Final note: because Elixir allows rebinding variables (a = {1,2} ; a = {3,4}), it easily gives the erroneous impression that they behave like “normal” imperative variables, which they don’t. They are just references to the data.


Thank you all for the answers.

I thought it was because of how variable rebinding is implemented in Elixir vs how it’s handled in the BEAM – I assumed it was a sort of facade in front of how the VM sees them – and all the examples made it very clear.

Also thank you for the explanation of the difference between “binding” and “variable assignment”. That really helped me understand better how it’s different from other languages.


Actually, binding is a fairly common thing in languages: take the whole Prolog or ML family, or Haskell and many others, even Rust, Kotlin, Scala and more. Here it is in OCaml ((*...*) is a comment):

let a_binding = "Hi" in (* bind a_binding to string *)
let a () = print_endline a_binding in
let a_variable = ref "Bye" in (* make string variable bound to name a_variable *)
let b () = print_endline !a_variable in
let () = a () in (* print "Hi" *)
let () = b () in (* print "Bye" *)
let a_binding = "Yo" in  (* rebind a_binding, like in elixir *)
let () = a_variable := "Vwoop" in (* Redefine the variable pointed to by name a_variable *)
let () = a () in (* print "Hi", no change from prior *)
let () = b () in (* print "Vwoop" *)
let a_variable = ref "Blorp" in (* Rebind a_variable *)
let () = b () in (* print "Vwoop", notice it used the old binding to the variable *)
()

And running that in the ocaml REPL outputs:

Warning 26: unused variable a_binding.
Warning 26: unused variable a_variable.
Hi
Bye
Hi
Vwoop
Vwoop
- : unit = ()

Languages with only variables and no bindings conflate the two. C++ distinguishes them the other way around (something like int blah makes a variable and const int blah makes a binding). Rust does it like OCaml but without needing the variable setter syntax, so let x = 42 is a binding and let mut x = 42 is a variable. Etc…

In the Elixir world an OCaml style ‘variable’ would probably just be an Agent:

a_binding = "Hi"
a = fn -> IO.puts(a_binding) end
# Agent.start_link/1 returns {:ok, pid}, so match on the tuple to get the pid
{:ok, a_variable} = Agent.start_link(fn -> "Bye" end)
b = fn -> IO.puts(Agent.get(a_variable, & &1)) end
a.() # Hi
b.() # Bye
a_binding = "Yo"
Agent.update(a_variable, fn _ -> "Vwoop" end)
a.() # Hi
b.() # Vwoop
{:ok, a_variable} = Agent.start_link(fn -> "Blorp" end)
b.() # Vwoop

Which in iex gives:

╰─➤  iex
Erlang/OTP 21 [erts-10.1.1] [source] [64-bit] [smp:6:6] [ds:6:6:10] [async-threads:1] [hipe]

Interactive Elixir (1.7.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> a_binding = "Hi"
"Hi"
iex(2)> a = fn -> IO.puts(a_binding) end
#Function<20.128620087/0 in :erl_eval.expr/5>
iex(3)> {:ok, a_variable} = Agent.start_link(fn -> "Bye" end)
{:ok, #PID<0.109.0>}
iex(4)> b = fn -> IO.puts(Agent.get(a_variable, & &1)) end
#Function<20.128620087/0 in :erl_eval.expr/5>
iex(5)> a.() # Hi
Hi
:ok
iex(6)> b.() # Bye
Bye
:ok
iex(7)> a_binding = "Yo"
"Yo"
iex(8)> Agent.update(a_variable, fn _ -> "Vwoop" end)
:ok
iex(9)> a.() # Hi
Hi
:ok
iex(10)> b.() # Vwoop
Vwoop
:ok
iex(11)> {:ok, a_variable} = Agent.start_link(fn -> "Blorp" end)
{:ok, #PID<0.118.0>}
iex(12)> b.() # Vwoop
Vwoop
:ok

So as you can see, the binding/variable distinction is actually pretty common in languages. Some focus more on the binding (Elixir, MLs, Haskell, Rust, Scala, Kotlin, etc…) and some focus more on the variable (C++, Java, Ruby, etc…), and that choice often drives a lot of the rest of the language design. Overall, the reason most ‘new’ languages are choosing to focus on bindings is that bindings tend to be a LOT more maintainable and easier to reason about, as you always know a binding’s value and that it won’t ever change once set. :slight_smile:


Is it better if I say “…from other languages I’m familiar with”? :slight_smile:
