Building binary is slow

Hi y’all,

I’m in the process of building a binary encoder/decoder. While benchmarking it, I had to generate binaries to feed to it. I noticed that it’s rather slow to do so.

{microseconds, _} = :timer.tc(fn ->
  bufWorst = for _ <- 1..768000, do: <<224>>
end)

That took about 145_000 microseconds on my MacBook Pro with a 2.9 GHz Quad-Core Intel Core i7 and 16GB RAM.

This kinda surprised me, as I was under the impression that Erlang could handle binaries very efficiently.

By contrast, creating the same buffer in Node.js took just 2_437_972 nanoseconds, or about 2_438 microseconds:

const start = process.hrtime.bigint();
const bufWorst = new Buffer(768000);
const end = process.hrtime.bigint();
console.log(`Constructing bufWorst took ${end - start} nanoseconds`);

Am I doing something wrong?

String.duplicate(<<224>>, 768000) takes ~4500 microseconds on my computer. I believe it’s the best-optimized option for this use case.

The version you have is doing a lot more: it’s a full-blown comprehension that has to iterate through a range, collect the results, and then finally stitch those results into a binary. String.duplicate calls :binary.copy, which is a BIF.
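To see the difference side by side, here is a minimal sketch that times both approaches. The module name BufferSketch and its function names are made up for illustration, and the comprehension uses `into: <<>>` so that both functions return the same binary. Timings will vary by machine.

```elixir
# Hypothetical module, for illustration only.
defmodule BufferSketch do
  @size 768_000

  # Comprehension that iterates a range and collects into a binary.
  def comprehension, do: for(_ <- 1..@size, into: <<>>, do: <<224>>)

  # BIF-backed duplication; String.duplicate/2 calls :binary.copy/2.
  def copy, do: :binary.copy(<<224>>, @size)

  # Returns the elapsed microseconds for a zero-arity function.
  def time(fun) do
    {microseconds, _result} = :timer.tc(fun)
    microseconds
  end
end

IO.puts("comprehension: #{BufferSketch.time(&BufferSketch.comprehension/0)} microseconds")
IO.puts("binary.copy:   #{BufferSketch.time(&BufferSketch.copy/0)} microseconds")
```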

Notably though, constants like this would be more idiomatically generated at compile time and then simply referenced at runtime:

@buffer String.duplicate(<<224>>, 768000)

This avoids runtime overhead entirely.
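Module attributes only exist inside a module, so in practice that looks roughly like the sketch below. The module name WorstCase and the buffer/0 function are assumptions for illustration, not anything from your code.

```elixir
# Hypothetical module, for illustration only.
defmodule WorstCase do
  # Evaluated once, while the module is being compiled.
  @buffer String.duplicate(<<224>>, 768_000)

  # At runtime this simply returns the precomputed binary.
  def buffer, do: @buffer
end
```

Calling WorstCase.buffer() then hands back the constant without rebuilding it on each call.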


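Will it make a difference if you use recursion?

  def concat(n), do: concat(n, [])

  # Base case: flatten the accumulated iolist into a single binary.
  def concat(0, acc), do: :erlang.iolist_to_binary(acc)

  # Prepend one more byte to the iolist and recurse.
  def concat(n, acc), do: concat(n-1, [<<224>> | acc])

  {microseconds, _} = :timer.tc(fn -> concat(768000) end)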

Wow, that is a huge difference! Not quite as fast as Node.js, but it is a big improvement. I swear, I have been looking at the manual page for :binary the whole time while working on the encoder, and yet I missed :binary.copy/2 completely. Thanks!

My encoder is still very slow compared to the Node.js version, but I can see that there are ways to improve binary handling.

It did improve on my version, from 145_000 microseconds to 55_333, but it definitely can’t beat the 4_500 from the BIF.

Definitely try to make use of pattern matching wherever possible; it is very well optimized.
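As a rough idea of what that can look like in a decoder, here is a small sketch that pulls one frame off the front of a binary using only a pattern in the function head. The frame layout (a 1-byte tag, a 16-bit big-endian length, then the payload) and the FrameSketch module are invented for illustration and are not your encoder's format.

```elixir
# Hypothetical module and frame layout, for illustration only.
defmodule FrameSketch do
  # Matches the tag, reads the length, and uses it to size the payload,
  # all in one binary pattern. Whatever follows is returned as `rest`.
  def decode(<<tag, len::16, payload::binary-size(len), rest::binary>>) do
    {:ok, {tag, payload}, rest}
  end

  # Anything too short or otherwise malformed falls through to here.
  def decode(_other), do: :error
end
```

For example, FrameSketch.decode(<<1, 3::16, "abc", "xyz">>) matches the first clause, while a binary whose payload is shorter than the declared length falls through to :error.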