I was trying to refactor some binary processing code when I stumbled upon the problems described in this very nice post. The Rule 3 in particular was challenging, because in my case the “packet” length is variable and we can’t know it beforehand.
After giving up on having a separate function to handle it, I tried to at least turn it into a macro, but then I was surprised by the compiler not being able to optimize my match once it leaves the case
statement.
Given this very contrived example:
defmodule Binaries do
def scan1(<<>>), do: :ok
def scan1(<<0::1, bin::bits>>), do: scan1(bin)
def scan1(<<1::1, bin::bits>>), do: scan1(bin)
def scan2(<<>>), do: :ok
def scan2(<<bin::bits>>) do
case bin do
<<0::1, rest::bits>> -> scan2(rest)
<<1::1, rest::bits>> -> scan2(rest)
end
end
def scan3(<<>>), do: :ok
def scan3(<<bin::bits>>) do
rest = case bin do
<<0::1, b::bits>> -> b
<<1::1, b::bits>> -> b
end
scan3(rest)
end
end
When we compile it with ERL_COMPILER_OPTIONS=bin_opt_info
turned on, we can see that the compiler can optimize all the binary matches except for the last one:
warning: OPTIMIZED: match context reused
binaries.exs:4
warning: OPTIMIZED: match context reused
binaries.exs:10
warning: BINARY CREATED: binary is used in call to scan3/1 which doesn't support context reuse
binaries.exs:20
Even though it is clearly a tail call and the match has a predictable size.
I know that making the code both pretty and fast is impossible sometimes. But anyway, is there anything else I could do to “hint” the compiler in this case?