 # List traversal vs. recursive binary pattern-matching

I was working on the Luhn algorithm exercise on Exercism and took two approaches to the problem. Then I benchmarked them. I am curious to hear about which approach you would assume performed better.

As background (simplifying a bit), the algorithm involves:

• Starting with a string of presumably digits and spaces
• Strip the spaces
• Perform a calculation on each number (which calculation depends on position in the stripped string)
• Sum up the transformed numbers
• Take the remainder of the sum divided by 10

My first approach was to strip the spaces, convert the number to a charlist then a list of integers, and traverse the list of digits to do the transformation, then `Enum.sum/1` and divide.

My second approach was to strip the spaces, and recursively process the string by binary pattern-matching on the first character, performing the applicable calculation and accumulating a sum, then divide.

Which of these approaches would you expect to perform better in a simple `Benchee`-style test?

I am curious to hear what people think; if anyone is interested enough to respond with their reasoned take I’ll post the results (and the code so you can tell me what I did wrong either in the slower case or the benchmarking).

I believe that second one:

However your description is not enough clear for me …

• Is stripping spaces the first thing you do?
• How do you then fetch index?
• Do you use something like `String.split/2`?

In my opinion not only `index` should be incremented on recursive pattern-matching, but also the calculation could be done in exactly same step.

This is what I have written in few minutes …
``````defmodule Example do
def sample(string), do: string |> parse(0, [[]]) |> Enum.sum() |> Kernel.rem(10)

defp parse("", _index, [[] | acc]), do: acc

defp parse("", index, [last | tail]), do: [get_value(last, index) | tail]

defp parse(<<" ", rest::binary>>, 0, [[]]), do: parse(rest, 0, [[]])

defp parse(<<" ", rest::binary>>, index, [last | tail]) do
parse(rest, index + 1, [[], get_value(last, index) | tail])
end

defp parse(<<digit::utf8, rest::binary>>, index, [last | tail]) when digit in ?0..?9 do
parse(rest, index, [[digit | last] | tail])
end

defp get_value(list, index), do: list |> Enum.reverse() |> List.to_integer() |> calculate(index)

defp calculate(integer, index), do: integer * index
end
``````
Feel free to compare it with your current code.

In this code we have such 3 steps:

1. Parsing - just simple recursive iteration
At end of `string` and/or at any `whitespace` (except whitespaces at start) we need to reverse list (because of previously used `[head | tail]` optimization), convert it to integer and do some calculation based on the incremented (by any `whitespaces` except those at start) `index`.
2. After that we are calling `Enum.sum/1`
3. And finally calculatiing `remainder` of division by `10`
1 Like

Thanks for participating! I agree. There are some nuances to the algorithm but my second approach was reasonably close to what you offered. And here are the results (where `original_luhn` is the list version and `string_luhn` is the recursive pattern matching version:

``````Operating System: Linux
CPU Information: Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz
Number of Available Cores: 8
Available memory: 15.79 GB
Elixir 1.10.3
Erlang 22.3.4.1

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s

Benchmarking original_luhn...
Benchmarking string_luhn...

Name                    ips        average  deviation         median         99th %
original_luhn      440.61 K        2.27 μs   ±925.00%        2.10 μs        3.20 μs
string_luhn        325.14 K        3.08 μs  ±1930.38%        2.80 μs        3.70 μs

Comparison:
original_luhn      440.61 K
string_luhn        325.14 K - 1.36x slower +0.81 μs
``````

I found these results to be pretty surprising. I am wondering if the explanation is that I am unaware of something expensive in the string version, something optimized in the list version, or if I am benchmarking/interpreting the benchmark wrong.

Here is a link to the repo with both versions and a full explanation of the algorithm.

1 Like