Way too much detail about matching in the head vs accessing maps in the body

(forking off from this thread because this is 1000% off-topic from where that thread started)

Here, I’m just curious about something. I wonder wether there is a performance gain between getting “sub-parameters” in a function head instead of doing it inside its body.

And more specifically:

a = param.a
b = param.b

vs

%{a: a, b: b} = param

So I got curious about this, and it’s quite a journey through BEAM assembly-land. The BEAM Book will be valuable to understand what’s going on here.

This module will be our testbed to see what these different styles of parameter access compile to:

defmodule Foo do
  @compile :S

  def in_args(%{a: a, b: b}) do
    IO.inspect(a)
    IO.inspect(b)
  end

  def in_body(%{} = params) do
    a = params.a
    b = params.b
    IO.inspect(a)
    IO.inspect(b)
  end

  def in_body_exact(%{a: _, b: _} = params) do
    a = params.a
    b = params.b
    IO.inspect(a)
    IO.inspect(b)
  end

  def in_body_short(%{} = params) do
    IO.inspect(params.a)
    IO.inspect(params.b)
  end

  def in_body_no_guard(params) do
    IO.inspect(params.a)
    IO.inspect(params.b)
  end
end

The @compile :S attribute tells the compiler to emit a .S file and then give up. Putting this code in a file foo.ex and then compiling it with elixirc foo.ex will produce an error from the compiler:

$ elixirc foo.ex

== Compilation error in file foo.ex ==
** (CompileError) foo.ex: could not compile module Foo. We expected the compiler to return a .beam binary but got something else. This usually happens because ERL_COMPILER_OPTIONS or @compile was set to change the compilation outcome in a way that is incompatible with Elixir

That last part was us :stuck_out_tongue: There should be a foo.ex.S file now.

Skimming past the module metadata etc, we find the first function (annotations inline):

{function, in_args, 1, 10}.
  {label,9}.
    {line,[{location,"foo.ex",4}]}.
    {func_info,{atom,'Elixir.Foo'},{atom,in_args},1}.
  {label,10}.
    # checks the shape of the input, bails with a FunctionClauseError if not a map
    {test,is_map,{f,9},[{x,0}]}.
    # loads {x,1} and {x, 2} with a and b, bails with a FunctionClauseError if not found
    {get_map_elements,{f,9},
                      {tr,{x,0},{t_map,any,any}},
                      {list,[{atom,b},{x,2},{atom,a},{x,1}]}}.
    # some register/stack shuffling. The input map in {x, 0} is no longer needed
    {allocate,1,3}.
    {move,{x,2},{y,0}}.
    {move,{x,1},{x,0}}.
    {line,[{location,"foo.ex",5}]}.
    {call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
    {move,{y,0},{x,0}}.
    {line,[{location,"foo.ex",6}]}.
    {call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.

Farther down is in_body:

{function, in_body, 1, 12}.
  {label,11}.
    {line,[{location,"foo.ex",9}]}.
    {func_info,{atom,'Elixir.Foo'},{atom,in_body},1}.
  {label,12}.
    # same as in_args
    # checks the shape of the input, bails with a FunctionClauseError if not a map
    {test,is_map,{f,11},[{x,0}]}.
    # loads {x, 1} and {x,2} with a and b, with different error destinations
    {get_map_elements,{f,14},
                      {tr,{x,0},{t_map,any,any}},
                      {list,[{atom,a},{x,1}]}}.
    {get_map_elements,{f,13},
                      {tr,{x,0},{t_map,any,any}},
                      {list,[{atom,b},{x,2}]}}.
    # identical code until label 13
    {allocate,1,3}.
    {move,{x,2},{y,0}}.
    {move,{x,1},{x,0}}.
    {line,[{location,"foo.ex",12}]}.
    {call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
    {move,{y,0},{x,0}}.
    {line,[{location,"foo.ex",13}]}.
    {call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.
  {label,13}.
    # error handling for when b is missing
    {test_heap,4,1}.
    {put_tuple2,{x,0},{list,[{atom,badkey},{atom,b},{x,0}]}}.
    {line,[{location,"foo.ex",11}]}.
    {call_ext_only,1,{extfunc,erlang,error,1}}.
  {label,14}.
    # error handling for when a is missing
    {test_heap,4,1}.
    {put_tuple2,{x,0},{list,[{atom,badkey},{atom,a},{x,0}]}}.
    {line,[{location,"foo.ex",10}]}.
    {call_ext_only,1,{extfunc,erlang,error,1}}.

The code difference here is because in_args and in_body have different behavior for badly-shaped input:

iex(2)> Foo.in_args(%{})
** (FunctionClauseError) no function clause matching in Foo.in_args/1    
    
    The following arguments were given to Foo.in_args/1:
    
        # 1
        %{}
    
    iex:4: Foo.in_args/1

vs

iex(2)> Foo.in_body(%{})
** (KeyError) key :a not found in: %{}
    iex:10: Foo.in_body/1

in_body_exact is a hybrid of the two approaches:

{function, in_body_exact, 1, 16}.
  {label,15}.
    {line,[{location,"foo.ex",16}]}.
    {func_info,{atom,'Elixir.Foo'},{atom,in_body_exact},1}.
  {label,16}.
    # same as the other two, gives a FunctionClauseError on non-maps
    {test,is_map,{f,15},[{x,0}]}.
    # only loads {x,1} with a, but gives FunctionClauseError for a missing key
    {get_map_elements,{f,15},
                      {tr,{x,0},{t_map,any,any}},
                      {list,[{atom,a},{x,1}]}}.
    # raise a FunctionClauseError if b is missing
    {test,has_map_fields,{f,15},{x,0},{list,[{atom,b}]}}.
    # load {x, 2} with b, with error handling like in_body
    {get_map_elements,{f,17},
                      {tr,{x,0},{t_map,any,any}},
                      {list,[{atom,b},{x,2}]}}.
    # same function body as before
    {allocate,1,3}.
    {move,{x,2},{y,0}}.
    {move,{x,1},{x,0}}.
    {line,[{location,"foo.ex",19}]}.
    {call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
    {move,{y,0},{x,0}}.
    {line,[{location,"foo.ex",20}]}.
    {call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.
  {label,17}.
    {test_heap,4,1}.
    {put_tuple2,{x,0},{list,[{atom,badkey},{atom,b},{x,0}]}}.
    {line,[{location,"foo.ex",18}]}.
    {call_ext_only,1,{extfunc,erlang,error,1}}.

I believe the error handling code in label 17 is unreachable because of the test in the preceding instruction; perhaps a subsequent optimization pass will spot that.

Digging in the source may yield more insight; I could see an optimization for the “match a single field from the map” being useful since patterns like def some_func(%Domain.Struct{} = s) are common.


The only difference between in_body and in_body_short is the order of operations:

{function, in_body_short, 1, 27}.
  {label,26}.
    {line,[{location,"foo.ex",23}]}.
    {func_info,{atom,'Elixir.Foo'},{atom,in_body_short},1}.
  {label,27}.
    # same pattern-match on a map as before
    {test,is_map,{f,26},[{x,0}]}.
    # load {x, 1} with a
    {get_map_elements,{f,29},
                      {tr,{x,0},{t_map,any,any}},
                      {list,[{atom,a},{x,1}]}}.
    # slightly different register shuffling
    # note that b isn't loaded anywhere yet
    {allocate,1,2}.
    {move,{x,0},{y,0}}.
    {move,{x,1},{x,0}}.
    {line,[{location,"foo.ex",24}]}.
    {call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
    # load {x, 0} with b, raise a key error if missing
    {get_map_elements,{f,28},
                      {tr,{y,0},{t_map,any,any}},
                      {list,[{atom,b},{x,0}]}}.
    {line,[{location,"foo.ex",25}]}.
    {call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.
  {label,28}.
    {test_heap,4,0}.
    {put_tuple2,{x,0},{list,[{atom,badkey},{atom,b},{y,0}]}}.
    {call_ext_last,1,{extfunc,erlang,error,1},1}.
  {label,29}.
    {test_heap,4,1}.
    {put_tuple2,{x,0},{list,[{atom,badkey},{atom,a},{x,0}]}}.
    {line,[{location,"foo.ex",24}]}.
    {call_ext_only,1,{extfunc,erlang,error,1}}.

Finally, check out in_body_exact_short - you’ve seen all the pieces before!


The most different function of the set is the one that gives the compiler the least information - in_body_no_guard:

{function, in_body_no_guard, 1, 22}.
  {label,21}.
    {line,[{location,"foo.ex",33}]}.
    {func_info,{atom,'Elixir.Foo'},{atom,in_body_no_guard},1}.
  {label,22}.
    {allocate,1,1}.
    {move,{x,0},{y,0}}.
    # similar test to before, but a different destination
    {test,is_map,{f,24},[{x,0}]}.
    {get_map_elements,{f,23},
                      {tr,{x,0},{t_map,any,any}},
                      {list,[{atom,a},{x,0}]}}.
    {jump,{f,25}}.
  {label,23}.
    {test_heap,4,0}.
    {put_tuple2,{x,0},{list,[{atom,badkey},{atom,a},{y,0}]}}.
    {line,[{location,"foo.ex",34}]}.
    {call_ext_last,1,{extfunc,erlang,error,1},1}.
  {label,24}.
    # if params isn't a map, maybe it's a module to pass to Kernel.apply
    {move,{atom,a},{x,1}}.
    {apply,0}.
  {label,25}.
    {call_ext,1,{extfunc,'Elixir.IO',inspect,1}}.
    {test,is_map,{f,27},[{y,0}]}.
    {get_map_elements,{f,26},
                      {tr,{y,0},{t_map,any,any}},
                      {list,[{atom,b},{x,0}]}}.
    {jump,{f,28}}.
  {label,26}.
    {test_heap,4,0}.
    {put_tuple2,{x,0},{list,[{atom,badkey},{atom,b},{y,0}]}}.
    {line,[{location,"foo.ex",35}]}.
    {call_ext_last,1,{extfunc,erlang,error,1},1}.
  {label,27}.
    {move,{atom,b},{x,1}}.
    {move,{y,0},{x,0}}.
    {init_yregs,{list,[{y,0}]}}.
    {apply,0}.
  {label,28}.
    {call_ext_last,1,{extfunc,'Elixir.IO',inspect,1},1}.

Since the compiler doesn’t know that params is a map, it could also be an atom to pass to Kernel.apply - the extra code handles that situation.

in_body_with_guard confirms this, because it’s identical to in_body_no_guard apart from the when is_map - and produces identical assembly to in_body_short.


In summary, the major difference between pattern-matching in the args and accessing in the body is the error handling - accesses in the body generate specific messages instead of the catch-all FunctionClauseError. This produces slightly shorter functions for pattern-matching.

14 Likes

Nice research! This aligns with what I recall José saying in the past: use pattern matching in the head only for function clause selection. Use destructuring in the function body for everything else. Following that pattern makes the code more visually pleasing. And your research say it does so without material performance penalty.

10 Likes

Thanks for digging into this @al2o3cr!

2 Likes