Benchmarking Elixir Argument Call Order

benchmarking

#1

I was just curious how the recent OTP versions changed argument processing speed, it used to be that changing the ‘last’ argument but keeping the early ones the same was the most efficient but it seems indications are a bit more… odd now, so a quick benchmark!

First, changing all the arguments:

iex(1)> defmodule Testering do
...(1)>   def recurse1([]=l), do: l
...(1)>   def recurse1([_h|t]), do: recurse1(t)
...(1)> 
...(1)>   def recurse2f([], acc), do: acc
...(1)>   def recurse2f([h|t], acc), do: recurse2f(t, acc+h)
...(1)> 
...(1)>   def recurse2b(acc, []), do: acc
...(1)>   def recurse2b(acc, [h|t]), do: recurse2b(acc+h, t)
...(1)> 
...(1)>   def recurse5f([], a0, a1, a2, a3), do: {a0, a1, a2, a3}
...(1)>   def recurse5f([h|t], a0, a1, a2, a3), do: recurse5f(t, a0+h, a1+h+1, a2+h+2, a3+h+3)
...(1)> 
...(1)>   def recurse5b(a0, a1, a2, a3, []), do: {a0, a1, a2, a3}
...(1)>   def recurse5b(a0, a1, a2, a3, [h|t]), do: recurse5b(a0+h, a1+h+1, a2+h+2, a3+h+3, t)
...(1)> 
...(1)>   def recurse9f([], a0, a1, a2, a3, a4, a5, a6, a7), do: {a0, a1, a2, a3, a4, a5, a6, a7}
...(1)>   def recurse9f([h|t], a0, a1, a2, a3, a4, a5, a6, a7), do: recurse9f(t, a0+h, a1+h+1, a2+h+2, a3+h+3, a4+h+4, a5+h+5, a6+h+6, a7+h+7)
...(1)> 
...(1)>   def recurse9b(a0, a1, a2, a3, a4, a5, a6, a7, []), do: {a0, a1, a2, a3, a4, a5, a6, a7}
...(1)>   def recurse9b(a0, a1, a2, a3, a4, a5, a6, a7, [h|t]), do: recurse9b(a0+h, a1+h+1, a2+h+2, a3+h+3, a4+h+4, a5+h+5, a6+h+6, a7+h+7, t)
...(1)> end
{:module, Testering,
 <<70, 79, 82, 49, 0, 0, 11, 12, 66, 69, 65, 77, 65, 116, 85, 56, 0, 0, 0, 194,                                                       
   0, 0, 0, 20, 16, 69, 108, 105, 120, 105, 114, 46, 84, 101, 115, 116, 101,                                                          
   114, 105, 110, 103, 8, 95, 95, 105, 110, 102, ...>>, {:recurse9b, 9}}
iex(2)> 
nil
iex(3)> inputs = %{
...(3)>   "10" => :lists.seq(1, 10),
...(3)>   "1000" => :lists.seq(1, 1000),
...(3)>   "100000" => :lists.seq(1, 100000),
...(3)> }
%{
  "10" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
  "1000" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
   20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
   39, 40, 41, 42, 43, 44, 45, 46, 47, 48, ...],                                                                                      
  "100000" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
   19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
   38, 39, 40, 41, 42, 43, 44, 45, 46, 47, ...]
}
iex(4)> 
nil
iex(5)> actions = %{
...(5)>   "recurse1" => &Testering.recurse1(&1),
...(5)>   "recurse2f" => &Testering.recurse2f(&1, 0),
...(5)>   "recurse2b" => &Testering.recurse2b(0, &1),
...(5)>   "recurse5f" => &Testering.recurse5f(&1, 0, 1, 2, 3),
...(5)>   "recurse5b" => &Testering.recurse5b(0, 1, 2, 3, &1),
...(5)>   "recurse9f" => &Testering.recurse9f(&1, 0, 1, 2, 3, 4, 5, 6, 7),
...(5)>   "recurse9b" => &Testering.recurse9b(0, 1, 2, 3, 4, 5, 6, 7, &1),
...(5)> }
%{
  "recurse1" => &Testering.recurse1/1,
  "recurse2b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse2f" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse5b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse5f" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse9b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse9f" => #Function<6.128620087/1 in :erl_eval.expr/5>                                                                         
}
iex(6)> 
nil
iex(7)> results = Benchee.run(actions, inputs: inputs, time: 5, warmup: 5, print: %{fast_warning: false}); :ok
Operating System: Linux
CPU Information: Blah
Number of Available Cores: 6
Available memory: 16.430148 GB
Elixir 1.7.4
Erlang 21.1.1
Benchmark suite executing with the following configuration:
warmup: 5.00 s
time: 5.00 s
parallel: 1
inputs: 10, 1000, 100000
Estimated total run time: 3.50 min



Benchmarking with input 10:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2f...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse9b...
Benchmarking recurse9f...

Benchmarking with input 1000:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2f...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse9b...
Benchmarking recurse9f...

Benchmarking with input 100000:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2f...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse9b...
Benchmarking recurse9f...

Result:

##### With input 10 #####
Name                ips        average  deviation         median
recurse1     10508.51 K      0.0952 μs   ±165.42%      0.0900 μs
recurse2b      394.27 K        2.54 μs  ±1538.40%        2.00 μs
recurse2f      393.46 K        2.54 μs  ±1504.22%        2.00 μs
recurse5b      240.24 K        4.16 μs   ±784.57%        4.00 μs
recurse5f      219.89 K        4.55 μs   ±977.01%        4.00 μs
recurse9f      174.10 K        5.74 μs   ±638.64%        5.00 μs
recurse9b      167.04 K        5.99 μs   ±726.56%        5.00 μs

Comparison: 
recurse1     10508.51 K
recurse2b      394.27 K - 26.65x slower
recurse2f      393.46 K - 26.71x slower
recurse5b      240.24 K - 43.74x slower
recurse5f      219.89 K - 47.79x slower
recurse9f      174.10 K - 60.36x slower
recurse9b      167.04 K - 62.91x slower

##### With input 1000 #####
Name                ips        average  deviation         median
recurse1       203.42 K        4.92 μs    ±11.57%        4.80 μs
recurse2f       92.04 K       10.86 μs   ±178.86%       10.00 μs
recurse2b       78.58 K       12.73 μs   ±190.28%       12.00 μs
recurse5b       20.49 K       48.79 μs    ±16.04%       48.00 μs
recurse5f       17.35 K       57.65 μs     ±8.96%       56.00 μs
recurse9b       12.76 K       78.39 μs     ±7.43%       77.00 μs
recurse9f       11.16 K       89.63 μs     ±6.63%       88.00 μs

Comparison: 
recurse1       203.42 K
recurse2f       92.04 K - 2.21x slower
recurse2b       78.58 K - 2.59x slower
recurse5b       20.49 K - 9.93x slower
recurse5f       17.35 K - 11.73x slower
recurse9b       12.76 K - 15.95x slower
recurse9f       11.16 K - 18.23x slower

##### With input 100000 #####
Name                ips        average  deviation         median
recurse1        1909.38        0.52 ms     ±6.17%        0.51 ms
recurse2f       1125.82        0.89 ms    ±12.92%        0.87 ms
recurse2b        908.18        1.10 ms    ±11.61%        1.08 ms
recurse5b        212.14        4.71 ms     ±5.44%        4.65 ms
recurse5f        182.85        5.47 ms     ±5.60%        5.40 ms
recurse9b        130.85        7.64 ms     ±7.98%        7.55 ms
recurse9f        113.08        8.84 ms     ±5.17%        8.75 ms

Comparison: 
recurse1        1909.38
recurse2f       1125.82 - 1.70x slower
recurse2b        908.18 - 2.10x slower
recurse5b        212.14 - 9.00x slower
recurse5f        182.85 - 10.44x slower
recurse9b        130.85 - 14.59x slower
recurse9f        113.08 - 16.88x slower

So a baseline, now testing with only mutating the front or back of the arguments only:

iex(1)> defmodule Testering do
...(1)>   def recurse1([]=l), do: l
...(1)>   def recurse1([_h|t]), do: recurse1(t)
...(1)> 
...(1)>   def recurse2f([], acc), do: acc
...(1)>   def recurse2f([h|t], acc), do: recurse2f(t, acc+h)
...(1)> 
...(1)>   def recurse2fn([], acc), do: acc
...(1)>   def recurse2fn([h|t], acc), do: recurse2fn(t, acc)
...(1)> 
...(1)>   def recurse2b(acc, []), do: acc
...(1)>   def recurse2b(acc, [h|t]), do: recurse2b(acc+h, t)
...(1)> 
...(1)>   def recurse2bn(acc, []), do: acc
...(1)>   def recurse2bn(acc, [h|t]), do: recurse2bn(acc, t)
...(1)> 
...(1)>   def recurse5f([], a0, a1, a2, a3), do: {a0, a1, a2, a3}
...(1)>   def recurse5f([h|t], a0, a1, a2, a3), do: recurse5f(t, a0+h, a1, a2, a3)
...(1)> 
...(1)>   def recurse5b(a0, a1, a2, a3, []), do: {a0, a1, a2, a3}
...(1)>   def recurse5b(a0, a1, a2, a3, [h|t]), do: recurse5b(a0, a1, a2, a3+h, t)
...(1)> 
...(1)>   def recurse5m(a0, a1, a2, a3, []), do: {a0, a1, a2, a3}
...(1)>   def recurse5m(a0, a1, a2, a3, [h|t]), do: recurse5m(a0+h, a1, a2, a3, t)
...(1)> 
...(1)>   def recurse9f([], a0, a1, a2, a3, a4, a5, a6, a7), do: {a0, a1, a2, a3, a4, a5, a6, a7}
...(1)>   def recurse9f([h|t], a0, a1, a2, a3, a4, a5, a6, a7), do: recurse9f(t, a0+h, a1, a2, a3, a4, a5, a6, a7)
...(1)> 
...(1)>   def recurse9b(a0, a1, a2, a3, a4, a5, a6, a7, []), do: {a0, a1, a2, a3, a4, a5, a6, a7}
...(1)>   def recurse9b(a0, a1, a2, a3, a4, a5, a6, a7, [h|t]), do: recurse9b(a0, a1, a2, a3, a4, a5, a6, a7+h, t)
...(1)> 
...(1)>   def recurse9m(a0, a1, a2, a3, a4, a5, a6, a7, []), do: {a0, a1, a2, a3, a4, a5, a6, a7}
...(1)>   def recurse9m(a0, a1, a2, a3, a4, a5, a6, a7, [h|t]), do: recurse9m(a0+h, a1, a2, a3, a4, a5, a6, a7, t)
...(1)> end
warning: variable "h" is unused
  iex:9

warning: variable "h" is unused
  iex:15

{:module, Testering,
 <<70, 79, 82, 49, 0, 0, 11, 60, 66, 69, 65, 77, 65, 116, 85, 56, 0, 0, 0, 236,                                                       
   0, 0, 0, 24, 16, 69, 108, 105, 120, 105, 114, 46, 84, 101, 115, 116, 101,                                                          
   114, 105, 110, 103, 8, 95, 95, 105, 110, 102, ...>>, {:recurse9m, 9}}
iex(2)> 
nil
iex(3)> inputs = %{
...(3)>   "10" => :lists.seq(1, 10),
...(3)>   "1000" => :lists.seq(1, 1000),
...(3)>   "100000" => :lists.seq(1, 100000),
...(3)>   "10000000" => :lists.seq(1, 10000000),
...(3)> }
%{
  "10" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
  "1000" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
   20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
   39, 40, 41, 42, 43, 44, 45, 46, 47, 48, ...],                                                                                      
  "100000" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
   19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
   38, 39, 40, 41, 42, 43, 44, 45, 46, 47, ...]
}
iex(4)> 
nil
iex(5)> actions = %{
...(5)>   "recurse1" => &Testering.recurse1(&1),
...(5)>   "recurse2f" => &Testering.recurse2f(&1, 0),
...(5)>   "recurse2fn" => &Testering.recurse2fn(&1, 0),
...(5)>   "recurse2b" => &Testering.recurse2b(0, &1),
...(5)>   "recurse2bn" => &Testering.recurse2bn(0, &1),
...(5)>   "recurse5f" => &Testering.recurse5f(&1, 0, 1, 2, 3),
...(5)>   "recurse5b" => &Testering.recurse5b(0, 1, 2, 3, &1),
...(5)>   "recurse5m" => &Testering.recurse5m(0, 1, 2, 3, &1),
...(5)>   "recurse9f" => &Testering.recurse9f(&1, 0, 1, 2, 3, 4, 5, 6, 7),
...(5)>   "recurse9b" => &Testering.recurse9b(0, 1, 2, 3, 4, 5, 6, 7, &1),
...(5)>   "recurse9m" => &Testering.recurse9m(0, 1, 2, 3, 4, 5, 6, 7, &1),
...(5)> }
%{
  "recurse1" => &Testering.recurse1/1,
  "recurse2b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse2bn" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse2f" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse2fn" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse5b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse5f" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse5m" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse9b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse9f" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse9m" => #Function<6.128620087/1 in :erl_eval.expr/5>
}
iex(6)> 
nil
iex(7)> results = Benchee.run(actions, inputs: inputs, time: 5, warmup: 5, print: %{fast_warning: false}); :ok
Operating System: Linux
CPU Information: Blah
Number of Available Cores: 6
Available memory: 16.430148 GB
Elixir 1.7.4
Erlang 21.1.1
Benchmark suite executing with the following configuration:
warmup: 5.00 s
time: 5.00 s
parallel: 1
inputs: 10, 1000, 100000
Estimated total run time: 5.50 min



Benchmarking with input 10:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2bn...
Benchmarking recurse2f...
Benchmarking recurse2fn...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse5m...
Benchmarking recurse9b...
Benchmarking recurse9f...
Benchmarking recurse9m...

Benchmarking with input 1000:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2bn...
Benchmarking recurse2f...
Benchmarking recurse2fn...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse5m...
Benchmarking recurse9b...
Benchmarking recurse9f...
Benchmarking recurse9m...

Benchmarking with input 100000:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2bn...
Benchmarking recurse2f...
Benchmarking recurse2fn...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse5m...
Benchmarking recurse9b...
Benchmarking recurse9f...
Benchmarking recurse9m...

Benchmarking with input 10000000:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2bn...
Benchmarking recurse2f...
Benchmarking recurse2fn...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse5m...
Benchmarking recurse9b...
Benchmarking recurse9f...
Benchmarking recurse9m...

And results:


##### With input 10 #####
Name                 ips        average  deviation         median
recurse1      10576.33 K      0.0946 μs   ±153.38%      0.0900 μs
recurse2b       392.87 K        2.55 μs  ±1634.44%        2.00 μs
recurse2bn      390.07 K        2.56 μs  ±1648.19%        2.00 μs
recurse2f       389.76 K        2.57 μs  ±1658.86%        2.00 μs
recurse2fn      384.92 K        2.60 μs  ±1781.32%        2.00 μs
recurse5f       269.04 K        3.72 μs  ±1206.63%        3.00 μs
recurse5m       267.45 K        3.74 μs  ±1176.88%        3.00 μs
recurse5b       263.16 K        3.80 μs   ±925.19%        3.00 μs
recurse9f       204.08 K        4.90 μs   ±689.09%        4.00 μs
recurse9b       200.81 K        4.98 μs   ±848.10%        4.00 μs
recurse9m       194.60 K        5.14 μs   ±768.05%        4.00 μs

Comparison: 
recurse1      10576.33 K
recurse2b       392.87 K - 26.92x slower
recurse2bn      390.07 K - 27.11x slower
recurse2f       389.76 K - 27.14x slower
recurse2fn      384.92 K - 27.48x slower
recurse5f       269.04 K - 39.31x slower
recurse5m       267.45 K - 39.55x slower
recurse5b       263.16 K - 40.19x slower
recurse9f       204.08 K - 51.82x slower
recurse9b       200.81 K - 52.67x slower
recurse9m       194.60 K - 54.35x slower

##### With input 1000 #####
Name                 ips        average  deviation         median
recurse1        225.96 K        4.43 μs    ±20.29%        4.00 μs
recurse2bn      134.61 K        7.43 μs   ±416.65%        7.00 μs
recurse2fn      133.89 K        7.47 μs   ±437.14%        7.00 μs
recurse2f        94.55 K       10.58 μs   ±287.21%       10.00 μs
recurse5f        84.26 K       11.87 μs   ±296.58%       11.00 μs
recurse2b        79.13 K       12.64 μs   ±181.35%       12.00 μs
recurse9f        75.06 K       13.32 μs   ±223.36%       13.00 μs
recurse5m        73.19 K       13.66 μs   ±141.63%       13.00 μs
recurse5b        71.95 K       13.90 μs   ±169.30%       13.00 μs
recurse9b        66.46 K       15.05 μs   ±175.73%       14.00 μs
recurse9m        65.93 K       15.17 μs   ±199.94%       14.00 μs

Comparison: 
recurse1        225.96 K
recurse2bn      134.61 K - 1.68x slower
recurse2fn      133.89 K - 1.69x slower
recurse2f        94.55 K - 2.39x slower
recurse5f        84.26 K - 2.68x slower
recurse2b        79.13 K - 2.86x slower
recurse9f        75.06 K - 3.01x slower
recurse5m        73.19 K - 3.09x slower
recurse5b        71.95 K - 3.14x slower
recurse9b        66.46 K - 3.40x slower
recurse9m        65.93 K - 3.43x slower

##### With input 100000 #####
Name                 ips        average  deviation         median
recurse2fn        1.95 K      512.46 μs     ±9.66%      503.00 μs
recurse2bn        1.92 K      521.48 μs     ±9.22%      509.00 μs
recurse1          1.89 K      530.38 μs     ±6.23%      507.00 μs
recurse2f         1.12 K      891.94 μs     ±7.84%      873.00 μs
recurse9f         1.11 K      897.43 μs     ±5.22%      876.00 μs
recurse5f         1.09 K      915.82 μs     ±8.50%      887.00 μs
recurse9m         0.93 K     1079.86 μs     ±6.91%     1063.00 μs
recurse5b         0.92 K     1083.50 μs    ±11.10%     1063.00 μs
recurse9b         0.92 K     1092.23 μs     ±6.83%     1066.00 μs
recurse2b         0.90 K     1105.06 μs     ±5.81%     1076.00 μs
recurse5m         0.90 K     1109.84 μs    ±12.90%     1074.00 μs

Comparison: 
recurse2fn        1.95 K
recurse2bn        1.92 K - 1.02x slower
recurse1          1.89 K - 1.03x slower
recurse2f         1.12 K - 1.74x slower
recurse9f         1.11 K - 1.75x slower
recurse5f         1.09 K - 1.79x slower
recurse9m         0.93 K - 2.11x slower
recurse5b         0.92 K - 2.11x slower
recurse9b         0.92 K - 2.13x slower
recurse2b         0.90 K - 2.16x slower
recurse5m         0.90 K - 2.17x slower
##### With input 10000000 #####
Name                 ips        average  deviation         median
recurse2fn         17.23       58.03 ms     ±1.68%       57.72 ms
recurse1           16.85       59.35 ms     ±2.72%       58.84 ms
recurse2bn         15.02       66.58 ms     ±2.62%       65.96 ms
recurse2f          10.44       95.80 ms     ±2.48%       94.71 ms
recurse5f          10.37       96.42 ms     ±2.22%       95.46 ms
recurse9f           9.50      105.24 ms     ±1.88%      104.57 ms
recurse5m           8.72      114.63 ms     ±1.65%      114.20 ms
recurse9b           8.71      114.80 ms     ±2.10%      114.03 ms
recurse5b           8.70      114.94 ms     ±1.99%      114.11 ms
recurse9m           8.69      115.09 ms     ±1.49%      114.54 ms
recurse2b           8.62      115.96 ms     ±1.85%      115.10 ms

Comparison: 
recurse2fn         17.23
recurse1           16.85 - 1.02x slower
recurse2bn         15.02 - 1.15x slower
recurse2f          10.44 - 1.65x slower
recurse5f          10.37 - 1.66x slower
recurse9f           9.50 - 1.81x slower
recurse5m           8.72 - 1.98x slower
recurse9b           8.71 - 1.98x slower
recurse5b           8.70 - 1.98x slower
recurse9m           8.69 - 1.98x slower
recurse2b           8.62 - 2.00x slower

So interestingly it seems mutating the front arguments is faster now! That’s entirely backwards from what I remembered testing back in the old OTP-14 or so days! Quite interesting. :slight_smile:

So what about non-TCO:

iex(1)> defmodule Testering do
...(1)>   def recurse1([]=l), do: 0
...(1)>   def recurse1([_h|t]), do: 1+recurse1(t)
...(1)> 
...(1)>   def recurse2f([], acc), do: acc
...(1)>   def recurse2f([h|t], acc), do: 1+recurse2f(t, acc+h)
...(1)> 
...(1)>   def recurse2fn([], acc), do: acc
...(1)>   def recurse2fn([h|t], acc), do: 1+recurse2fn(t, acc)
...(1)> 
...(1)>   def recurse2b(acc, []), do: acc
...(1)>   def recurse2b(acc, [h|t]), do: 1+recurse2b(acc+h, t)
...(1)> 
...(1)>   def recurse2bn(acc, []), do: acc
...(1)>   def recurse2bn(acc, [h|t]), do: 1+recurse2bn(acc, t)
...(1)> 
...(1)>   def recurse5f([], a0, a1, a2, a3), do: a0
...(1)>   def recurse5f([h|t], a0, a1, a2, a3), do: 1+recurse5f(t, a0+h, a1, a2, a3)
...(1)> 
...(1)>   def recurse5b(a0, a1, a2, a3, []), do: a3
...(1)>   def recurse5b(a0, a1, a2, a3, [h|t]), do: 1+recurse5b(a0, a1, a2, a3+h, t)
...(1)> 
...(1)>   def recurse5m(a0, a1, a2, a3, []), do: a0
...(1)>   def recurse5m(a0, a1, a2, a3, [h|t]), do: 1+recurse5m(a0+h, a1, a2, a3, t)
...(1)> 
...(1)>   def recurse9f([], a0, a1, a2, a3, a4, a5, a6, a7), do: a0
...(1)>   def recurse9f([h|t], a0, a1, a2, a3, a4, a5, a6, a7), do: 1+recurse9f(t, a0+h, a1, a2, a3, a4, a5, a6, a7)
...(1)> 
...(1)>   def recurse9b(a0, a1, a2, a3, a4, a5, a6, a7, []), do: a7
...(1)>   def recurse9b(a0, a1, a2, a3, a4, a5, a6, a7, [h|t]), do: 1+recurse9b(a0, a1, a2, a3, a4, a5, a6, a7+h, t)
...(1)> 
...(1)>   def recurse9m(a0, a1, a2, a3, a4, a5, a6, a7, []), do: a0
...(1)>   def recurse9m(a0, a1, a2, a3, a4, a5, a6, a7, [h|t]), do: 1+recurse9m(a0+h, a1, a2, a3, a4, a5, a6, a7, t)
...(1)> end
warning: variable "l" is unused
  iex:2

warning: variable "h" is unused
  iex:9

warning: variable "h" is unused
  iex:15

warning: variable "a1" is unused
  iex:17

warning: variable "a2" is unused
  iex:17

warning: variable "a3" is unused
  iex:17

warning: variable "a0" is unused
  iex:20

warning: variable "a1" is unused
  iex:20

warning: variable "a2" is unused
  iex:20

warning: variable "a1" is unused
  iex:23

warning: variable "a2" is unused
  iex:23

warning: variable "a3" is unused
  iex:23

warning: variable "a1" is unused
  iex:26

warning: variable "a2" is unused
  iex:26

warning: variable "a3" is unused
  iex:26

warning: variable "a4" is unused
  iex:26

warning: variable "a5" is unused
  iex:26

warning: variable "a6" is unused
  iex:26

warning: variable "a7" is unused
  iex:26

warning: variable "a0" is unused
  iex:29

warning: variable "a1" is unused
  iex:29

warning: variable "a2" is unused
  iex:29

warning: variable "a3" is unused
  iex:29

warning: variable "a4" is unused
  iex:29

warning: variable "a5" is unused
  iex:29

warning: variable "a6" is unused
  iex:29

warning: variable "a1" is unused
  iex:32

warning: variable "a2" is unused
  iex:32

warning: variable "a3" is unused
  iex:32

warning: variable "a4" is unused
  iex:32

warning: variable "a5" is unused
  iex:32

warning: variable "a6" is unused
  iex:32

warning: variable "a7" is unused
  iex:32

{:module, Testering,
 <<70, 79, 82, 49, 0, 0, 11, 136, 66, 69, 65, 77, 65, 116, 85, 56, 0, 0, 0, 236,                                                      
   0, 0, 0, 24, 16, 69, 108, 105, 120, 105, 114, 46, 84, 101, 115, 116, 101,                                                          
   114, 105, 110, 103, 8, 95, 95, 105, 110, 102, ...>>, {:recurse9m, 9}}
iex(2)> 
nil
iex(3)> inputs = %{
...(3)>   "10" => :lists.seq(1, 10),
...(3)>   "1000" => :lists.seq(1, 1000),
...(3)>   "100000" => :lists.seq(1, 100000),
...(3)>   "10000000" => :lists.seq(1, 10000000),
...(3)> }
%{
  "10" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
  "1000" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
   20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
   39, 40, 41, 42, 43, 44, 45, 46, 47, 48, ...],                                                                                      
  "100000" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
   19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
   38, 39, 40, 41, 42, 43, 44, 45, 46, 47, ...],
  "10000000" => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
   19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
   38, 39, 40, 41, 42, 43, 44, 45, 46, ...]
}
iex(4)> 
nil
iex(5)> actions = %{
...(5)>   "recurse1" => &Testering.recurse1(&1),
...(5)>   "recurse2f" => &Testering.recurse2f(&1, 0),
...(5)>   "recurse2fn" => &Testering.recurse2fn(&1, 0),
...(5)>   "recurse2b" => &Testering.recurse2b(0, &1),
...(5)>   "recurse2bn" => &Testering.recurse2bn(0, &1),
...(5)>   "recurse5f" => &Testering.recurse5f(&1, 0, 1, 2, 3),
...(5)>   "recurse5b" => &Testering.recurse5b(0, 1, 2, 3, &1),
...(5)>   "recurse5m" => &Testering.recurse5m(0, 1, 2, 3, &1),
...(5)>   "recurse9f" => &Testering.recurse9f(&1, 0, 1, 2, 3, 4, 5, 6, 7),
...(5)>   "recurse9b" => &Testering.recurse9b(0, 1, 2, 3, 4, 5, 6, 7, &1),
...(5)>   "recurse9m" => &Testering.recurse9m(0, 1, 2, 3, 4, 5, 6, 7, &1),
...(5)> }
%{
  "recurse1" => &Testering.recurse1/1,
  "recurse2b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse2bn" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse2f" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse2fn" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse5b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse5f" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse5m" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse9b" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse9f" => #Function<6.128620087/1 in :erl_eval.expr/5>,
  "recurse9m" => #Function<6.128620087/1 in :erl_eval.expr/5>                                                                         
}
iex(6)> 
nil
iex(7)> results = Benchee.run(actions, inputs: inputs, time: 5, warmup: 5, print: %{fast_warning: false}); :ok
Operating System: Linux
CPU Information: Blah
Number of Available Cores: 6
Available memory: 16.430148 GB
Elixir 1.7.4
Erlang 21.1.1
Benchmark suite executing with the following configuration:
warmup: 5.00 s
time: 5.00 s
parallel: 1
inputs: 10, 1000, 100000, 10000000
Estimated total run time: 7.33 min



Benchmarking with input 10:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2bn...
Benchmarking recurse2f...
Benchmarking recurse2fn...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse5m...
Benchmarking recurse9b...
Benchmarking recurse9f...
Benchmarking recurse9m...

Benchmarking with input 1000:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2bn...
Benchmarking recurse2f...
Benchmarking recurse2fn...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse5m...
Benchmarking recurse9b...
Benchmarking recurse9f...
Benchmarking recurse9m...

Benchmarking with input 100000:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2bn...
Benchmarking recurse2f...
Benchmarking recurse2fn...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse5m...
Benchmarking recurse9b...
Benchmarking recurse9f...
Benchmarking recurse9m...

Benchmarking with input 10000000:
Benchmarking recurse1...
Benchmarking recurse2b...
Benchmarking recurse2bn...
Benchmarking recurse2f...
Benchmarking recurse2fn...
Benchmarking recurse5b...
Benchmarking recurse5f...
Benchmarking recurse5m...
Benchmarking recurse9b...
Benchmarking recurse9f...
Benchmarking recurse9m...

Results:

##### With input 10 #####
Name                 ips        average  deviation         median
recurse1       5884.23 K       0.170 μs   ±217.91%       0.160 μs
recurse2bn      395.73 K        2.53 μs  ±1552.52%        2.00 μs
recurse2f       395.55 K        2.53 μs  ±1662.25%        2.00 μs
recurse2fn      391.58 K        2.55 μs  ±1703.19%        2.00 μs
recurse2b       391.20 K        2.56 μs  ±1664.04%        2.00 μs
recurse5m       267.55 K        3.74 μs  ±1022.72%        3.00 μs
recurse5b       264.32 K        3.78 μs  ±1237.89%        3.00 μs
recurse5f       258.29 K        3.87 μs   ±892.53%        3.00 μs
recurse9f       204.89 K        4.88 μs   ±716.82%        4.00 μs
recurse9m       203.65 K        4.91 μs   ±714.30%        4.00 μs
recurse9b       203.62 K        4.91 μs   ±713.38%        4.00 μs

Comparison: 
recurse1       5884.23 K
recurse2bn      395.73 K - 14.87x slower
recurse2f       395.55 K - 14.88x slower
recurse2fn      391.58 K - 15.03x slower
recurse2b       391.20 K - 15.04x slower
recurse5m       267.55 K - 21.99x slower
recurse5b       264.32 K - 22.26x slower
recurse5f       258.29 K - 22.78x slower
recurse9f       204.89 K - 28.72x slower
recurse9m       203.65 K - 28.89x slower
recurse9b       203.62 K - 28.90x slower

##### With input 1000 #####
Name                 ips        average  deviation         median
recurse1         97.01 K       10.31 μs   ±118.39%       10.00 μs
recurse2fn       76.58 K       13.06 μs   ±182.34%       13.00 μs
recurse2bn       73.59 K       13.59 μs   ±218.43%       13.00 μs
recurse2f        58.18 K       17.19 μs   ±216.01%       17.00 μs
recurse5f        54.10 K       18.49 μs    ±57.76%       18.00 μs
recurse9f        52.07 K       19.21 μs    ±57.79%       19.00 μs
recurse2b        49.76 K       20.10 μs    ±52.32%       20.00 μs
recurse5b        48.40 K       20.66 μs    ±24.64%       20.00 μs
recurse5m        48.20 K       20.75 μs    ±69.18%       20.00 μs
recurse9b        44.33 K       22.56 μs    ±92.41%       22.00 μs
recurse9m        44.29 K       22.58 μs    ±60.78%       22.00 μs

Comparison: 
recurse1         97.01 K
recurse2fn       76.58 K - 1.27x slower
recurse2bn       73.59 K - 1.32x slower
recurse2f        58.18 K - 1.67x slower
recurse5f        54.10 K - 1.79x slower
recurse9f        52.07 K - 1.86x slower
recurse2b        49.76 K - 1.95x slower
recurse5b        48.40 K - 2.00x slower
recurse5m        48.20 K - 2.01x slower
recurse9b        44.33 K - 2.19x slower
recurse9m        44.29 K - 2.19x slower

##### With input 100000 #####
Name                 ips        average  deviation         median
recurse2fn        895.55        1.12 ms    ±15.37%        1.10 ms
recurse2bn        805.73        1.24 ms    ±27.33%        1.17 ms
recurse1          738.43        1.35 ms    ±44.65%        1.10 ms
recurse2f         647.77        1.54 ms     ±9.69%        1.51 ms
recurse5f         617.39        1.62 ms    ±11.58%        1.58 ms
recurse9f         580.50        1.72 ms    ±42.02%        1.55 ms
recurse9m         543.77        1.84 ms    ±13.20%        1.79 ms
recurse5b         541.94        1.85 ms    ±13.17%        1.81 ms
recurse5m         540.25        1.85 ms     ±7.65%        1.82 ms
recurse2b         475.47        2.10 ms    ±36.08%        1.81 ms
recurse9b         398.33        2.51 ms    ±60.55%        1.86 ms

Comparison: 
recurse2fn        895.55
recurse2bn        805.73 - 1.11x slower
recurse1          738.43 - 1.21x slower
recurse2f         647.77 - 1.38x slower
recurse5f         617.39 - 1.45x slower
recurse9f         580.50 - 1.54x slower
recurse9m         543.77 - 1.65x slower
recurse5b         541.94 - 1.65x slower
recurse5m         540.25 - 1.66x slower
recurse2b         475.47 - 1.88x slower
recurse9b         398.33 - 2.25x slower

##### With input 10000000 #####
Name                 ips        average  deviation         median
recurse1            7.14      139.99 ms    ±38.68%      130.54 ms
recurse2bn          6.82      146.61 ms    ±42.16%      135.12 ms
recurse2f           5.09      196.44 ms    ±36.61%      175.40 ms
recurse9f           5.04      198.54 ms    ±32.93%      181.58 ms
recurse2fn          5.01      199.69 ms    ±32.30%      188.76 ms
recurse2b           4.30      232.61 ms    ±35.59%      200.98 ms
recurse5f           3.95      253.09 ms    ±26.26%      233.55 ms
recurse9b           3.67      272.60 ms    ±27.00%      266.37 ms
recurse5m           3.58      279.27 ms    ±27.92%      287.48 ms
recurse9m           3.48      287.72 ms    ±30.91%      259.21 ms
recurse5b           3.42      292.59 ms    ±24.17%      298.01 ms

Comparison: 
recurse1            7.14 
recurse2bn          6.82 - 1.05x slower
recurse2f           5.09 - 1.40x slower
recurse9f           5.04 - 1.42x slower
recurse2fn          5.01 - 1.43x slower
recurse2b           4.30 - 1.66x slower
recurse5f           3.95 - 1.81x slower
recurse9b           3.67 - 1.95x slower
recurse5m           3.58 - 1.99x slower
recurse9m           3.48 - 2.06x slower
recurse5b           3.42 - 2.09x slower

Follows the same pattern, though I’m not sure why unless the BEAM is able to throw-away and reuse argument positions that it used to not be able to do, also interesting… Maybe it’s some-how optimizing the 1+... bit that ‘should’ be preventing TCO…