Ingest a bitstring from least significant bit

Hi all,

I’ve finally achieved ‘Basic’ status on the Elixir Forum, so I have an appropriately basic question :sweat_smile:

I’m ingesting a mask where each power of 2 maps to a consecutive index: a value of 1 flags the first index, 2 → 2nd, 4 → 3rd, etc. I’m using the default big endianness of the VM, and I’m finding the following:

iex(87)> << 0::1, 1::7 >> = << 1 >>
<<1>>

… whereas I desire something like:

iex(87)> << 1::1, 0::7 >> = some_transform(<< 1 >>)
<<1>>

I’ve mucked around with the big and little modifiers on either side of the match operator, but I assume the order of ingest when the bitstring is split is predetermined: most-significant bit (lowest index) to least-significant bit (highest index), low byte to high byte, with the input byte always in the same bit order. That all makes sense. I see that :binary has encode and decode functions, but once again they only affect byte order, as they should.
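
For example, :binary.decode_unsigned/2 only changes which byte is treated as the most significant one; the bit order within each byte is untouched:

iex> :binary.decode_unsigned(<<1, 0>>, :big)
256
iex> :binary.decode_unsigned(<<1, 0>>, :little)
1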

I’ve resorted to doing:

(for << bit::1 <- << 1 >> >>, into: [], do: bit)
  |> Enum.reverse()

… to reverse the bits so that I can ingest them the way I expect. Though I suspect I’ve missed some obviously easier way, or some Erlang function that might be endian-agnostic and maybe even optimize this down to a trivial assembly operation. Am I hoping for too much?

Any pointers would be much appreciated :slight_smile: Thanks!

defmodule M do
  # Extract the last (least significant) bit of a bitstring.
  def lastbit(bitstring) do
    size = bit_size(bitstring) - 1
    <<_head::size(size), bit::1>> = bitstring
    bit
  end

  # Apply f to each bit from last to first; the returned list is in that
  # same (reversed) order.
  def reverse_string_map(bitstring, f, so_far \\ [])
  def reverse_string_map(<<>>, _, so_far), do: Enum.reverse(so_far)
  def reverse_string_map(bitstring, f, so_far) do
    headsize = bit_size(bitstring) - 1
    # Split off the final bit and recurse on the remaining head.
    <<head::size(headsize), bit::1>> = bitstring
    reverse_string_map(<<head::size(headsize)>>, f, [f.(bit) | so_far])
  end
end
iex> M.reverse_string_map(<<0b01001::5>>, &IO.inspect/1)                   
1
0
0
1
0
[1, 0, 0, 1, 0]

Unfortunately, I have no idea where the full list of bitstring segment modifiers is documented. Note that generally you can’t have a variable size at the head of a bitstring match, but in this case we can calculate it and then apply it explicitly. You can’t just drop a variable directly into the size specifier (laaaame), though; you have to explicitly invoke the size() decorator.
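
For example, with a bound variable as the size:

iex> n = 4
4
iex> <<head::size(n), _rest::bits>> = <<0b1011::4, 1::1>>
<<23::size(5)>>
iex> head
11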


This is an interesting solution, though I think it’s more complex and perhaps has worse performance characteristics than simply:

(for << bit::1 <- << 1 >> >>, into: [], do: bit)
  |> Enum.reverse()

That version also builds a list of bits and reverses it, but perhaps the compiler has an easier time optimizing it. Ultimately it would be nice to avoid using a list at all. I imagine reversing bits is something many processors can do in a single instruction, or at worst a few, without touching memory beyond the CPU registers. But having a quick look around the BIFs in Erlang, it doesn’t seem that low-level operations like this are in scope for Erlang… which is sad, but no big deal I suppose.
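
For whole-byte binaries, one software fallback would be a per-byte reverse lookup table combined with reversing the byte order. A rough sketch (not benchmarked; ByteRev is just a made-up name):

defmodule ByteRev do
  import Bitwise

  # Precompute the bit-reversed value of every possible byte at compile time.
  @table Map.new(0..255, fn b ->
           {b, Enum.reduce(0..7, 0, fn i, acc -> acc ||| (b >>> i &&& 1) <<< (7 - i) end)}
         end)

  # Reverse the bit order of a whole-byte binary:
  # reverse the byte order, then the bits within each byte.
  def reverse(bin) when is_binary(bin) do
    bin
    |> :binary.bin_to_list()
    |> Enum.reverse()
    |> Enum.map(&Map.fetch!(@table, &1))
    |> :binary.list_to_bin()
  end
end

iex> ByteRev.reverse(<<2, 1>>)
<<128, 64>>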

Having a bit more of a think about your approach inspired me to try something else:

import Bitwise

# Walk the bits MSB-first and OR each one into an integer accumulator,
# writing from bit 0 upward, so the result is the reversed value as an integer.
def reverse_bitstring(value) when is_bitstring(value) do
  for <<bit::1 <- value>>, reduce: {0, 0} do
    {bits, write_idx} -> {(bit <<< write_idx) ||| bits, write_idx + 1}
  end
  |> elem(0)
end

It basically does what I want (note that it returns the reversed value as an integer rather than a bitstring), and assuming the VM reuses CPU registers, it might not even need to touch memory.
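
For example, with the function wrapped in a module M:

iex> M.reverse_bitstring(<<1>>)
128
iex> M.reverse_bitstring(<<0b01001::5>>)
18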

Endianness is not about bit order, but about byte order.

iex(1)> <<number::big-integer-size(16)>> = <<0, 0::6, 1::1, 0::1>>
<<0, 2>>
iex(2)> number
2
iex(3)> <<number::little-integer-size(16)>> = <<0, 0::6, 1::1, 0::1>>
<<0, 2>>
iex(4)> number
512

In a bitstring the least significant bit is the rightmost and there is nothing you can do about it.


Endianness is not about bit order, but about byte order.

True, but I also believe that the lower-order bits are stored in the higher bits of memory, so to speak (and vice versa on little-endian architectures). But as you point out, it usually doesn’t matter semantically, since values are usually transferred at the byte level over a network, from a file, etc. Right?

I think when bit-packing (sub-byte-level) data, it makes sense for bitstrings to fill from the big end, since a field spills into the next byte; that way the default big-endian behaviour stays semantically coherent:

iex(164)> for << bit::1 <- << 1::7, 1::9 >> >>, do: bit
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
iex(165)> << 1::7, 1::9 >>
<<2, 1>>

But as far as I know, there’s no way to bit-pack this little-endian in Erlang, which would result in <<129, 0>>.
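
You could roll it yourself, though, by accumulating fields from the least significant bit upward and emitting the accumulator little-endian. A sketch (LsbPack is just an illustrative name):

defmodule LsbPack do
  import Bitwise

  # Pack {value, width} fields starting at the least significant bit,
  # then emit the accumulated integer little-endian, padded to whole bytes.
  def pack(fields) do
    {acc, nbits} =
      Enum.reduce(fields, {0, 0}, fn {value, width}, {acc, nbits} ->
        {acc ||| (value <<< nbits), nbits + width}
      end)

    padded = div(nbits + 7, 8) * 8
    <<acc::little-integer-size(padded)>>
  end
end

iex> LsbPack.pack([{1, 7}, {1, 9}])
<<129, 0>>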

In a bitstring the least significant bit is the rightmost and there is nothing you can do about it.

True, but what I was looking for was a way to reverse the bits, and I think the solution we’ve arrived at is fine. The only thing better would be some library or BIF that compiles down to optimized assembly… but that’s probably overkill for what I need anyway.

I believe the for comprehension is just sugar for Enum.reduce, which will have more overhead than a tail-call-optimized recursive solution. You could check with Benchee. Also, I realized that creating a new bitstring on every step is not the best idea; I don’t know whether it’s kept as a reference internally. This is probably optimal, or near-optimal:

defmodule M do
  def reverse_string_map(s, f), do: reverse_string_map(s, f, [], bit_size(s) - 1)

  def reverse_string_map(<<>>, _, _, _), do: []
  # prefix == 0: we've reached the first bit; apply f and flip the
  # accumulator back into application (i.e. reversed-bit) order.
  def reverse_string_map(<<bit::1, _::bits>>, f, so_far, 0), do: Enum.reverse([f.(bit) | so_far])
  # Match the bit at offset `prefix` in place, without rebuilding the bitstring.
  def reverse_string_map(s, f, so_far, prefix) do
    <<_::size(prefix), bit::1, _::bits>> = s
    reverse_string_map(s, f, [f.(bit) | so_far], prefix - 1)
  end
end
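
Running the same call as before gives the same output:

iex> M.reverse_string_map(<<0b01001::5>>, &IO.inspect/1)
1
0
0
1
0
[1, 0, 0, 1, 0]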

Also note that with your solution, if your lambdas are effectful, the effects land in forward order, not reverse order. This version performs the side effects in the correct (reversed) order.

I believe the for comprehension is just sugar for Enum.reduce, which will have more overhead than a tail-call-optimized recursive solution. You could check with Benchee

You might be right about that. I’m busy for now, but I’ll do some benchmarks later in the week and post them up :slight_smile:

So I ran some tests. I had to modify your code to match my function signature: you were running a function for each bit, whereas mine simply reverses the bitstring. Here are the results:

Random 16-bit bitstrings

Name                                                   ips        average  deviation         median         99th %
comprehension and reverse                          23.44 K       42.67 μs   ±183.57%          40 μs          99 μs
function style 2                                   16.58 K       60.32 μs    ±55.19%          56 μs         138 μs
comprehension w/ reduce + bitwise operations       11.84 K       84.47 μs   ±126.03%          76 μs         180 μs
function style                                      6.54 K      153.01 μs    ±45.05%         135 μs         427 μs

Comparison:
comprehension and reverse                          23.44 K
function style 2                                   16.58 K - 1.41x slower +17.65 μs
comprehension w/ reduce + bitwise operations       11.84 K - 1.98x slower +41.80 μs
function style                                      6.54 K - 3.59x slower +110.34 μs

Random 8-bit bitstrings

Name                                                   ips        average  deviation         median         99th %
comprehension and reverse                          47.36 K       21.12 μs    ±82.42%          19 μs          48 μs
function style 2                                   32.93 K       30.37 μs    ±38.18%          28 μs          64 μs
comprehension w/ reduce + bitwise operations       23.59 K       42.40 μs    ±48.22%          38 μs          91 μs
function style                                     14.11 K       70.85 μs    ±41.83%          64 μs         154 μs

Comparison:
comprehension and reverse                          47.36 K
function style 2                                   32.93 K - 1.44x slower +9.25 μs
comprehension w/ reduce + bitwise operations       23.59 K - 2.01x slower +21.28 μs
function style                                     14.11 K - 3.36x slower +49.73 μs

It appears the simplest approach is the fastest, with your second implementation coming in second. I’m surprised that my second implementation is worse than my first; it’s probably that Erlang doesn’t permit reusing registers to do ‘in-place’ updates to values and instead puts everything on the stack, or perhaps there’s some contention between registers. So it looks like the KISS principle wins on the Erlang VM.

You can find the .exs for the benchmark here:

Edit: and please let me know if I’ve made any mistakes with the benchmark.
