Binary concatenation vs binary building syntax

I was looking at Elixir’s implementation of OptionParser, and noticed that it adds a character to the end of a string like this: <<buffer::binary, h>> (see here).

Is there an advantage to doing that over buffer <> <<h>>, in terms of performance or something else, or is that just a matter of syntactic preference?

<>/2 just expands to <<…>>.
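You can verify this yourself by expanding the macro one step (a sketch; the variable names are just illustrative):

```elixir
# Expand the Kernel.<>/2 macro once and print the resulting AST.
ast = quote(do: buffer <> <<h>>)
expanded = Macro.expand_once(ast, __ENV__)
IO.puts(Macro.to_string(expanded))
```

The printed form is a single `<<...>>` construction, i.e. `<>` is sugar for the bitstring syntax.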

Hmm, but would x <> y <> z expand to <<x::binary, y::binary, z::binary>> or <<<<x::binary, y::binary>>, z::binary>>?

Or <<x::binary, <<y::binary, z::binary>>>>?

I’m not sure whether there is a performance difference between the first two (maybe a marginal one), but I think the third would be problematic.

This one. extract_concatenations/2 extracts the individual applications into a list, and unquote_splicing/1 then unrolls that list into a single flat AST.
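The flattening is easy to check (a sketch; `x`, `y`, `z` are placeholder variables):

```elixir
# A single expansion of the outer <> already yields one flat bitstring,
# because extract_concatenations/2 walks the nested <> calls itself.
ast = quote(do: x <> y <> z)
IO.puts(Macro.to_string(Macro.expand_once(ast, __ENV__)))
```

The output is one `<<...>>` with three segments rather than nested constructions.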

The only difference is that the compiler has to do a bit more work. I tend to prefer the more explicit form, since I believe there is a lot more benefit in teaching the reader about binary pattern matching than in hiding it behind a compiler abstraction.

Yeah, I switched to the bitstring syntax in XKS out of paranoia because I wasn’t sure what <> actually compiled to. I guess it didn’t matter in this case, but in general it’s easier not to have to think about it.

FWIW, all the ways to write “combine three binaries into one binary” produce identical assembly:

defmodule CombineBinaries do
  @compile :S

  def all_concat(a, b, c) do
    a <> b <> c
  end

  def nested_binary(a, b, c) do
    << <<a::binary, b::binary>>::binary, c::binary>>
  end

  def nested_binary_flipped(a, b, c) do
    <<a::binary, <<b::binary, c::binary>>::binary >>
  end

  def triple_binary(a, b, c) do
    <<a::binary, b::binary, c::binary>>
  end
end

All of the resulting functions look like this:

{function, all_concat, 3, 11}.
  {label,10}.
    {line,[{location,"foo.ex",4}]}.
    {func_info,{atom,'Elixir.CombineBinaries'},{atom,all_concat},3}.
  {label,11}.
    {line,[{location,"foo.ex",5}]}.
    {bs_create_bin,{f,0},
                   0,3,8,
                   {x,0},
                   {list,[{atom,append},
                          1,8,nil,
                          {x,0},
                          {atom,all},
                          {atom,binary},
                          2,8,nil,
                          {x,1},
                          {atom,all},
                          {atom,binary},
                          3,8,nil,
                          {x,2},
                          {atom,all}]}}.
    return.

bs_create_bin was introduced in OTP25 as a new way to construct binaries from segments.
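For anyone wanting to reproduce this (a sketch; the filename `foo.ex` is an assumption): the `@compile :S` attribute passes the `:S` option through to the Erlang compiler, which emits BEAM assembly as a text file instead of a `.beam` file.

```shell
# Compile the module; with @compile :S the Erlang compiler writes
# BEAM assembly to Elixir.CombineBinaries.S in the current directory.
elixirc foo.ex
cat Elixir.CombineBinaries.S
```

The emitted file is plain text; the instruction names correspond to the BEAM opcodes defined in OTP's `genop.tab`.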

Good to know, I did not expect the compiler to optimize to this degree tbh.

How far will it go? E.g.

a = <<y::binary, z::binary>>
b = <<x::binary, a::binary>>

Is it smart enough to rewrite this or will the resulting asm construct a via copying first?

Also, are there docs or tips for how I can learn to produce and read the assembly myself?

It is not hard to optimise the above, as Kernel.<>/2 is just a macro. So it is not the compiler optimising (at least not the Elixir compiler); it is just macro expansion.
As for your example: AFAIK it is up to the Erlang compiler to do such optimisations.

Oh yes, I understood that, but the example in @al2o3cr's post (nested_binary_flipped) is a compiler optimization, no? The bitstring syntax, not the concat operator.