(forking off from this thread because this is 1000% off-topic from where that thread started)
Here, I’m just curious about something. I wonder wether there is a performance gain between getting “sub-parameters” in a function head instead of doing it inside its body.
And more specifically:
a = param.a
b = param.b
vs
%{a: a, b: b} = param
So I got curious about this, and it’s quite a journey through BEAM assembly-land. The BEAM Book will be valuable to understand what’s going on here.
This module will be our testbed to see what these different styles of parameter access compile to:
defmodule Foo do
@compile :S
def in_args(%{a: a, b: b}) do
IO.inspect(a)
IO.inspect(b)
end
def in_body(%{} = params) do
a = params.a
b = params.b
IO.inspect(a)
IO.inspect(b)
end
def in_body_exact(%{a: _, b: _} = params) do
a = params.a
b = params.b
IO.inspect(a)
IO.inspect(b)
end
def in_body_short(%{} = params) do
IO.inspect(params.a)
IO.inspect(params.b)
end
def in_body_no_guard(params) do
IO.inspect(params.a)
IO.inspect(params.b)
end
end
The @compile :S
attribute tells the compiler to emit a .S
file and then give up. Putting this code in a file foo.ex
and then compiling it with elixirc foo.ex
will produce an error from the compiler:
$ elixirc foo.ex
== Compilation error in file foo.ex ==
** (CompileError) foo.ex: could not compile module Foo. We expected the compiler to return a .beam binary but got something else. This usually happens because ERL_COMPILER_OPTIONS or @compile was set to change the compilation outcome in a way that is incompatible with Elixir
That last part was us There should be a foo.ex.S
file now.
Skimming past the module metadata etc, we find the first function (annotations inline):
{function, in_args, 1, 10}.
{label,9}.
{line,[{location,"foo.ex",4}]}.
{func_info,{atom,'Elixir.Foo'},{atom,in_args},1}.
{label,10}.
# checks the shape of the input, bails with a FunctionClauseError if not a map
{test,is_map,{f,9},[{x,0}]}.
# loads {x,1} and {x, 2} with a and b, bails with a FunctionClauseError if not found
{get_map_elements,{f,9},
{tr,{x,0},{t_map,any,any}},
{list,[{atom,b},{x,2},{atom,a},{x,1}]}}.
# some register/stack shuffling. The input map in {x, 0} is no longer needed
{allocate,1,3}.
{move,{x,2},{y,0}}.
{move,{x,1},{x,0}}.
{line,[{location,"foo.ex",5}]}.
{call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
{move,{y,0},{x,0}}.
{line,[{location,"foo.ex",6}]}.
{call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.
Farther down is in_body
:
{function, in_body, 1, 12}.
{label,11}.
{line,[{location,"foo.ex",9}]}.
{func_info,{atom,'Elixir.Foo'},{atom,in_body},1}.
{label,12}.
# same as in_args
# checks the shape of the input, bails with a FunctionClauseError if not a map
{test,is_map,{f,11},[{x,0}]}.
# loads {x, 1} and {x,2} with a and b, with different error destinations
{get_map_elements,{f,14},
{tr,{x,0},{t_map,any,any}},
{list,[{atom,a},{x,1}]}}.
{get_map_elements,{f,13},
{tr,{x,0},{t_map,any,any}},
{list,[{atom,b},{x,2}]}}.
# identical code until label 13
{allocate,1,3}.
{move,{x,2},{y,0}}.
{move,{x,1},{x,0}}.
{line,[{location,"foo.ex",12}]}.
{call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
{move,{y,0},{x,0}}.
{line,[{location,"foo.ex",13}]}.
{call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.
{label,13}.
# error handling for when b is missing
{test_heap,4,1}.
{put_tuple2,{x,0},{list,[{atom,badkey},{atom,b},{x,0}]}}.
{line,[{location,"foo.ex",11}]}.
{call_ext_only,1,{extfunc,erlang,error,1}}.
{label,14}.
# error handling for when a is missing
{test_heap,4,1}.
{put_tuple2,{x,0},{list,[{atom,badkey},{atom,a},{x,0}]}}.
{line,[{location,"foo.ex",10}]}.
{call_ext_only,1,{extfunc,erlang,error,1}}.
The code difference here is because in_args
and in_body
have different behavior for badly-shaped input:
iex(2)> Foo.in_args(%{})
** (FunctionClauseError) no function clause matching in Foo.in_args/1
The following arguments were given to Foo.in_args/1:
# 1
%{}
iex:4: Foo.in_args/1
vs
iex(2)> Foo.in_body(%{})
** (KeyError) key :a not found in: %{}
iex:10: Foo.in_body/1
in_body_exact
is a hybrid of the two approaches:
{function, in_body_exact, 1, 16}.
{label,15}.
{line,[{location,"foo.ex",16}]}.
{func_info,{atom,'Elixir.Foo'},{atom,in_body_exact},1}.
{label,16}.
# same as the other two, gives a FunctionClauseError on non-maps
{test,is_map,{f,15},[{x,0}]}.
# only loads {x,1} with a, but gives FunctionClauseError for a missing key
{get_map_elements,{f,15},
{tr,{x,0},{t_map,any,any}},
{list,[{atom,a},{x,1}]}}.
# raise a FunctionClauseError if b is missing
{test,has_map_fields,{f,15},{x,0},{list,[{atom,b}]}}.
# load {x, 2} with b, with error handling like in_body
{get_map_elements,{f,17},
{tr,{x,0},{t_map,any,any}},
{list,[{atom,b},{x,2}]}}.
# same function body as before
{allocate,1,3}.
{move,{x,2},{y,0}}.
{move,{x,1},{x,0}}.
{line,[{location,"foo.ex",19}]}.
{call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
{move,{y,0},{x,0}}.
{line,[{location,"foo.ex",20}]}.
{call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.
{label,17}.
{test_heap,4,1}.
{put_tuple2,{x,0},{list,[{atom,badkey},{atom,b},{x,0}]}}.
{line,[{location,"foo.ex",18}]}.
{call_ext_only,1,{extfunc,erlang,error,1}}.
I believe the error handling code in label 17 is unreachable because of the test in the preceding instruction; perhaps a subsequent optimization pass will spot that.
Digging in the source may yield more insight; I could see an optimization for the “match a single field from the map” being useful since patterns like def some_func(%Domain.Struct{} = s)
are common.
The only difference between in_body
and in_body_short
is the order of operations:
{function, in_body_short, 1, 27}.
{label,26}.
{line,[{location,"foo.ex",23}]}.
{func_info,{atom,'Elixir.Foo'},{atom,in_body_short},1}.
{label,27}.
# same pattern-match on a map as before
{test,is_map,{f,26},[{x,0}]}.
# load {x, 1} with a
{get_map_elements,{f,29},
{tr,{x,0},{t_map,any,any}},
{list,[{atom,a},{x,1}]}}.
# slightly different register shuffling
# note that b isn't loaded anywhere yet
{allocate,1,2}.
{move,{x,0},{y,0}}.
{move,{x,1},{x,0}}.
{line,[{location,"foo.ex",24}]}.
{call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
# load {x, 0} with b, raise a key error if missing
{get_map_elements,{f,28},
{tr,{y,0},{t_map,any,any}},
{list,[{atom,b},{x,0}]}}.
{line,[{location,"foo.ex",25}]}.
{call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.
{label,28}.
{test_heap,4,0}.
{put_tuple2,{x,0},{list,[{atom,badkey},{atom,b},{y,0}]}}.
{call_ext_last,1,{extfunc,erlang,error,1},1}.
{label,29}.
{test_heap,4,1}.
{put_tuple2,{x,0},{list,[{atom,badkey},{atom,a},{x,0}]}}.
{line,[{location,"foo.ex",24}]}.
{call_ext_only,1,{extfunc,erlang,error,1}}.
Finally, check out in_body_exact_short
- you’ve seen all the pieces before!
The most different function of the set is the one that gives the compiler the least information - in_body_no_guard
:
{function, in_body_no_guard, 1, 22}.
{label,21}.
{line,[{location,"foo.ex",33}]}.
{func_info,{atom,'Elixir.Foo'},{atom,in_body_no_guard},1}.
{label,22}.
{allocate,1,1}.
{move,{x,0},{y,0}}.
# similar test to before, but a different destination
{test,is_map,{f,24},[{x,0}]}.
{get_map_elements,{f,23},
{tr,{x,0},{t_map,any,any}},
{list,[{atom,a},{x,0}]}}.
{jump,{f,25}}.
{label,23}.
{test_heap,4,0}.
{put_tuple2,{x,0},{list,[{atom,badkey},{atom,a},{y,0}]}}.
{line,[{location,"foo.ex",34}]}.
{call_ext_last,1,{extfunc,erlang,error,1},1}.
{label,24}.
# if params isn't a map, maybe it's a module to pass to Kernel.apply
{move,{atom,a},{x,1}}.
{apply,0}.
{label,25}.
{call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
{test,is_map,{f,27},[{y,0}]}.
{get_map_elements,{f,26},
{tr,{y,0},{t_map,any,any}},
{list,[{atom,b},{x,0}]}}.
{jump,{f,28}}.
{label,26}.
{test_heap,4,0}.
{put_tuple2,{x,0},{list,[{atom,badkey},{atom,b},{y,0}]}}.
{line,[{location,"foo.ex",35}]}.
{call_ext_last,1,{extfunc,erlang,error,1},1}.
{label,27}.
{move,{atom,b},{x,1}}.
{move,{y,0},{x,0}}.
{init_yregs,{list,[{y,0}]}}.
{apply,0}.
{label,28}.
{call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.
Since the compiler doesn’t know that params
is a map, it could also be an atom to pass to Kernel.apply
- the extra code handles that situation.
in_body_with_guard
confirms this, because it’s identical to in_body_no_guard
apart from the when is_map
- and produces identical assembly to in_body_short
.
In summary, the major difference between pattern-matching in the args and accessing in the body is the error handling - accesses in the body generate specific messages instead of the catch-all FunctionClauseError
. This produces slightly shorter functions for pattern-matching.