The dangers of binaries: :binary.part vs String.slice, a 100x speedup

erlang

#1

So I've been working on Morphling's diff algorithm to create a DOM differential to send over the wire, wasting as little CPU/memory as possible, and wanted to talk about a pitfall when dealing with binaries.

This is how you shoot yourself in the foot and then turn around and say BEAM/Erlang/Elixir is very slow and bad.

100x slower code, with a slowdown that grows with input size

    fn_chunk = fn(bin,parts)->
        size = byte_size(bin)
        segment_size = div(size, parts)+1
        {chunks,_} = Enum.reduce_while(0..(parts*2), {[], bin}, fn(idx,{a,bin})->
            piece = String.slice(bin, 0, segment_size)
            case piece do
                "" -> {:halt, {a,bin}}
                _ ->
                    a = a ++ [%{old_pos: idx*segment_size, size: segment_size, binary: piece}]
                    {:cont, {a, String.slice(bin, segment_size, size)}}
            end
        end)
        chunks
    end

Fast code

    fn_chunk = fn(bin, parts) ->
        size = byte_size(bin)
        segment_size = div(size, parts) + 1
        {chunks, _} = Enum.reduce_while(0..(parts * 2), {[], 0}, fn(_, {a, idx}) ->
            to_take = min(byte_size(bin) - idx, segment_size)
            piece = :binary.part(bin, idx, to_take)
            case piece do
                "" -> {:halt, {a, idx}}
                _ ->
                    a = a ++ [%{old_pos: idx, size: to_take, binary: piece}]
                    {:cont, {a, idx + to_take}}
            end
        end)
        chunks
    end

What this piece does is split a binary into a given number of roughly equal parts. On a 20 kB binary, the slow code above takes 100 ms; on a 5 kB binary it takes 43 ms.

The fast code takes 0-1 ms on a 20 kB binary.

What is the difference?


#2

String.slice/3 works on Unicode graphemes, whereas :binary.part/3 just looks at raw bytes. The documentation mentions this and suggests Kernel.binary_part/3 if you don't care about graphemes: https://hexdocs.pm/elixir/String.html#slice/3
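A quick way to see the difference is a string containing a multi-byte grapheme; slicing by grapheme and by byte give different answers:

```elixir
# "é" is one grapheme but two bytes in UTF-8 (0xC3 0xA9),
# so "été" is 3 graphemes and 5 bytes.
s = "été"

String.slice(s, 0, 2)   # counts graphemes → "ét"
:binary.part(s, 0, 2)   # counts raw bytes → <<0xC3, 0xA9>>, i.e. "é"
binary_part(s, 0, 2)    # Kernel.binary_part/3, same raw-byte behavior
```

Walking graphemes means String.slice/3 has to decode the binary from the start on every call, which is where the linear-per-call (and overall quadratic) cost in the slow version comes from.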


#3

What if you just pattern match? << part :: binary-size(n), rest :: binary >> = bin, where n is the number of bytes in each part.
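A minimal sketch of that approach (the Chunker module and chunk/2 names are just for illustration). The match syntax produces sub-binaries, so it avoids copying just like :binary.part/3:

```elixir
defmodule Chunker do
  # Split a binary into pieces of n bytes each; the last piece
  # may be shorter.
  def chunk(bin, n) when is_binary(bin) and n > 0,
    do: do_chunk(bin, n, [])

  defp do_chunk(<<>>, _n, acc), do: Enum.reverse(acc)

  # Fewer than n bytes left: emit the remainder and stop.
  defp do_chunk(bin, n, acc) when byte_size(bin) <= n,
    do: Enum.reverse([bin | acc])

  defp do_chunk(bin, n, acc) do
    <<part::binary-size(n), rest::binary>> = bin
    do_chunk(rest, n, [part | acc])
  end
end

Chunker.chunk("abcdefgh", 3)  # → ["abc", "def", "gh"]
```

Prepending to the accumulator and reversing at the end also avoids the quadratic `a ++ [...]` append from the original snippets.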


#4

There are a lot of options to consider. What I was getting at is that it's a little confusing to know which operations make copies and which refer to the original binary on the shared binary heap. This was a good performance refresher for me.


#5

I would assume the second version is using a sub-binary: instead of allocating a new binary, a reference into the original is created. I agree, from the documentation it's not clear which function will allocate a new binary vs share the original one.

Be careful though: the original binary will not be garbage collected until all the sub-binaries referencing it are garbage collected. This has a tendency to cause memory leaks. Let's say you take the header part of a large binary and keep it in GenServer state; the garbage collection of the large binary will be delayed. :binary.copy/1 can be used to create an independent copy. https://hexdocs.pm/nimble_csv/NimbleCSV.html#module-binary-references
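A minimal sketch of that leak scenario and the fix (the names and the 1 MB size are illustrative):

```elixir
# Stand-in for a large payload: a 1 MB binary built with :binary.copy/2.
payload = :binary.copy(<<0>>, 1_000_000)

# Pattern matching yields a sub-binary: `header` is a reference into
# `payload`, so the full 1 MB cannot be collected while `header` is
# alive (e.g. sitting in GenServer state).
<<header::binary-size(16), _rest::binary>> = payload

# :binary.copy/1 allocates an independent 16-byte binary that no
# longer pins `payload`, letting the large binary be collected.
independent_header = :binary.copy(header)
byte_size(independent_header)  # => 16
```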