Dynamic pattern matching of binaries in function heads?

Do binary patterns declared in function heads have to have static sizes? It appears to be the case, but I can’t find documentation one way or the other.

More concretely, it is possible to match a binary with a “#” as the third byte like this:

def foo(<<header::binary-size(2), "#", rest::binary>>), do: something()

But this attempt to make the size of the header dynamic results in an error:

def foo(<<header::binary-size(n), "#", rest::binary>>, n), do: something()

** (CompileError) iex:46: undefined variable "n"

Interestingly, flipping the order of the arguments to the function results in a slightly different error:

def foo(n, <<header::binary-size(n), "#", rest::binary>>), do: something()

** (CompileError) iex:46: undefined variable "n" in bitstring segment. If the size of the binary is a variable, the variable must be defined prior to its use in the binary/bitstring match itself, or outside the pattern match

That error message makes it seem like what I’m trying to do is not possible, but maybe I’m just missing something?

If the size itself precedes the binary data and can be decoded using bitsyntax, this can be done (assuming size is a one-byte unsigned integer):

def foo(<<n::integer, header::binary-size(n), "#", rest::binary>>), do: something()

Beyond that, there is not much else you can do in a function head, other than using a different encoding (bit size, endianness) for the size prefix, unfortunately.

E.g. in a shell:

iex> <<n::integer, data::binary-size(n), "#", rest::binary>> = <<5, "hello#rest">>
<<5, 104, 101, 108, 108, 111, 35, 114, 101, 115, 116>>

iex> n
5

iex> data
"hello"

iex> rest
"rest"
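
To illustrate the "other encodings" of the size prefix mentioned above, here is a rough sketch that assumes a 16-bit length prefix in either byte order. The module name `SizePrefixSketch`, the function names, and the return tuples are made up for illustration and are not from the thread.

```elixir
defmodule SizePrefixSketch do
  # Assumed layout: 16-bit big-endian unsigned length, then the header, then "#".
  def split_be16(<<n::unsigned-big-integer-size(16), header::binary-size(n), "#", rest::binary>>) do
    {:ok, n, header, rest}
  end

  # Anything that does not fit the assumed layout.
  def split_be16(_other), do: :error

  # Same layout, but with a little-endian length prefix.
  def split_le16(<<n::unsigned-little-integer-size(16), header::binary-size(n), "#", rest::binary>>) do
    {:ok, n, header, rest}
  end

  def split_le16(_other), do: :error
end
```

With this sketch, `SizePrefixSketch.split_be16(<<0, 5, "hello#rest">>)` returns `{:ok, 5, "hello", "rest"}`, and the little-endian variant accepts `<<5, 0, "hello#rest">>` in the same way.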

Thanks, that’s a clear explanation.

I assume this constraint is somehow related to the details of how function heads are compiled? Because inside a function body, where n is already bound, I can pass a variable to the size modifier:

def foo(str, n) do
  case str do
    <<header::binary-size(n), "#", rest::binary>> -> something()
  end
end

Exactly 🙂
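
A minimal runnable sketch of that in-body approach, for anyone who wants to try it. The module name `InBodyMatch`, the return tuples, and the catch-all clause are assumptions added for illustration. Since `n` is bound by the function head before the `case` pattern is evaluated, it can be used as the segment size, and the catch-all avoids a `CaseClauseError` when the input does not fit.

```elixir
defmodule InBodyMatch do
  # n is already bound when the case pattern runs, so it may be used
  # as the size of the header segment.
  def split(str, n) when is_binary(str) and is_integer(n) and n >= 0 do
    case str do
      <<header::binary-size(n), "#", rest::binary>> -> {:ok, header, rest}
      # Input too short, or the byte after the header is not "#".
      _other -> :error
    end
  end
end
```

For example, `InBodyMatch.split("ab#rest", 2)` returns `{:ok, "ab", "rest"}`, while `InBodyMatch.split("ab#rest", 3)` returns `:error`.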

So you can just prepend the value to the binary and then it would work?

But would that have a much higher cost than doing the same with lists? Is a new full binary allocated in memory?

“So you can just prepend the value to the binary and then it would work?”

I definitely would not recommend this, since it will require a new binary to be allocated, as you rightly suggest.

Even if the binaries involved are expected to be small, it seems hard to justify the performance hit for such a tiny spoon of syntactic sugar.
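
If the goal is just to keep the check in the function head without building a prefixed binary, one possible alternative is a guard. This is only a sketch: it relies on `binary_part/3` being allowed in guards (which, as far as I know, it is), and the module name `GuardMatch` and the return values are hypothetical.

```elixir
defmodule GuardMatch do
  # The guard checks that the byte at offset n is "#". If n is negative or
  # out of range, binary_part/3 fails inside the guard, so the clause is
  # skipped instead of raising.
  def foo(bin, n) when is_binary(bin) and is_integer(n) and binary_part(bin, n, 1) == "#" do
    # n is bound here, so it can be used as a size in the body.
    <<header::binary-size(n), "#", rest::binary>> = bin
    {:ok, header, rest}
  end

  # Fallback for input that does not have "#" at offset n.
  def foo(_bin, _n), do: :error
end
```

Here `GuardMatch.foo("ab#rest", 2)` returns `{:ok, "ab", "rest"}` and `GuardMatch.foo("ab#rest", 1)` returns `:error`.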

You could achieve what you want with a helper function that strips bytes from the front of the binary until it reaches the character you care about, counting how many it removed. Then apply the conditional logic to that count.


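def foo(string) do
  # parse_until/3 returns the number of bytes before the first "#",
  # so 3 here means "#" is the fourth byte.
  if parse_until(string, "#", 0) == 3 do
    something()
  else
    raise "hell"
  end
end

# The target byte is at the front: return how many bytes were skipped.
def parse_until(<<char::binary-size(1), _rest::binary>>, char, previous_character_count) do
  previous_character_count
end

# Any other byte: drop it and keep counting.
def parse_until(<<_head::binary-size(1), rest::binary>>, char, previous_character_count) do
  parse_until(rest, char, previous_character_count + 1)
end

# The target never appeared in the binary.
def parse_until(<<>>, _char, _previous_character_count), do: nil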

This works because, when the same variable name appears more than once in a function head, Elixir requires every occurrence to match the same value and only enters that clause if they do.
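
A tiny sketch of that repeated-variable behaviour on its own, in case it helps. The module and function names are hypothetical, and the catch-all clauses are added so that a failed match returns `false` rather than raising.

```elixir
defmodule SameVarSketch do
  # Both arguments use the name x, so this clause only matches
  # when the two values match each other.
  def same?(x, x), do: true
  def same?(_, _), do: false

  # The same idea with a binary: the first byte (as a 1-byte binary)
  # must match the second argument.
  def starts_with?(<<char::binary-size(1), _rest::binary>>, char), do: true
  def starts_with?(_binary, _char), do: false
end
```

So `SameVarSketch.same?(1, 1)` is `true`, `SameVarSketch.same?(1, 2)` is `false`, and `SameVarSketch.starts_with?("#abc", "#")` is `true`.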