Separating two integers from the upper/lower bits of a single little-endian 32-bit value

I have a bitstring:

<<3, 0, 0, 0>>

that I’m trying to split into two values like so:

<<chunk::little-unsigned-size(31), is_first_chunk::little-unsigned-size(1)>> = <<3, 0, 0, 0>>

following this specification:

chunk/isFirstChunk (upper 31bits/lowest bit)

“chunk” and “isFirstChunk” are combined into an unsigned 32bit value. Therefore it will be encoded as

uint32_t chunkX

and extracted as

chunk = chunkX >> 1

isFirstChunk = chunkX & 0x1

I’m expecting both values to be 1, but I actually get 3 and 0 respectively.

Am I wrong?

The result is correct.

<<3, 0, 0, 0>> is a bitstring. As four bytes it is 0x03 0x00 0x00 0x00, and read as a big-endian 32-bit integer (the default) it becomes 50331648.

50331648 & 0x01 is zero, because the lowest bit of the integer, which sits in the last of the four bytes, is not set.

0x03 0x00 0x00 0x01 (50331649) would give you 1.

iex(11)> <<chunk::little-unsigned-size(31), is_first_chunk::little-unsigned-size(1)>> = <<3, 0, 0, 1>>
<<3, 0, 0, 1>>
iex(12)> is_first_chunk
1

iex(17)> <<chunkX::32>> = <<3, 0, 0, 1>>
<<3, 0, 0, 1>>
iex(18)> require Bitwise
Bitwise
iex(19)> Bitwise.&&&(chunkX, 0x01)      
1
iex(20)> chunkX
50331649

iex(21)> <<chunkX::32>> = <<3, 0, 0, 0>>
<<3, 0, 0, 0>>
iex(22)> Bitwise.&&&(chunkX, 0x01)      
0
iex(23)> chunkX                         
50331648

iex(24)> Bitwise.>>>(chunkX, 1)   
25165824 <-- chunk

EDIT: you can convert the integer back to a bitstring with

iex(25)> :binary.encode_unsigned(50331648)
<<3, 0, 0, 0>>
iex(26)> :binary.encode_unsigned(50331649)
<<3, 0, 0, 1>>
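
To see why the original 31/1 pattern returns 3 and 0, it may help to look at which bit each segment actually reads. As far as I understand it, segments in a bitstring pattern consume bits in stream order, so the trailing 1-bit segment always reads the lowest bit of the fourth byte, whatever endianness modifier is on the segment before it. A minimal sketch (the variable names are made up for illustration):

```elixir
# Segments are matched left to right over the bit stream, so the
# final 1-bit segment is the lowest bit of the *fourth* byte.
<<_::31, last_bit::1>> = <<3, 0, 0, 0>>
# last_bit is 0

# In a little-endian uint32 the least significant byte comes first,
# so the integer's lowest bit is the lowest bit of the *first* byte.
<<_::7, lowest_bit_of_first_byte::1, _::24>> = <<3, 0, 0, 0>>
# lowest_bit_of_first_byte is 1
```

So the flag the spec describes sits in the first byte on the wire, not at the end of the bit stream.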

I was just advised that for little endian values:

In binary, you have 00000011 00000000.....
flip the bytes to <<0, 0, 0, 3 >>
now you have ...00000 00000011
so, chunk 1, and isFirst 1

Does this make sense?

If so, how would you encode/decode arbitrary values of chunk, assuming is_first_chunk is always 1 or 0?

Edit: Removing the last bit of 0b00000011:

0b0000001
0b0000010
0b0000011
...

and so on?

Correct.
Elixir defaults to big-endian format when manually crafting binaries.

Be aware that most networking protocols are big endian, while almost every common CPU architecture is little endian.

So if you’re decoding network streams, they will most probably be big endian (most significant byte first).

You can specify endianness like this:

iex(30)> :binary.decode_unsigned(<<3, 0, 0, 0>>, :little)
3
iex(31)> <<chunk::32-little>> = <<3, 0, 0, 0>>
<<3, 0, 0, 0>>
iex(32)> chunk
3
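
Combining the little-endian decode with the shift and mask from the spec should give the values expected in the question. A minimal sketch, assuming the four bytes are exactly the `chunkX` field and using `import Bitwise` for the operators:

```elixir
import Bitwise

# Read all four bytes as one little-endian unsigned 32-bit integer.
<<chunk_x::little-32>> = <<3, 0, 0, 0>>

# Apply the formulas from the spec.
chunk = chunk_x >>> 1
is_first_chunk = chunk_x &&& 0x1
# chunk_x is 3, so chunk and is_first_chunk are both 1
```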

Encoding:

iex(1)> <<3::32-little>>
<<3, 0, 0, 0>>
iex(5)> <<0b0000001::32-little>>
<<1, 0, 0, 0>>
iex(6)> <<0b0000010::32-little>>
<<2, 0, 0, 0>>
iex(7)> <<0b0000011::32-little>>
<<3, 0, 0, 0>>
iex(8)> <<0b0000100::32-little>>
<<4, 0, 0, 0>>
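
For arbitrary values of chunk, one way to build the combined field is to reverse the spec's extraction: shift `chunk` left by one bit, put the flag in the lowest bit, and emit the result as a little-endian 32-bit segment. A rough sketch (the input values are hypothetical, and it assumes `chunk` fits in 31 bits and the flag is 0 or 1):

```elixir
import Bitwise

# Hypothetical inputs for illustration.
chunk = 5
is_first_chunk = 1

# chunk goes into the upper 31 bits, the flag into the lowest bit.
chunk_x = (chunk <<< 1) ||| is_first_chunk

# Emit it as a fixed-width little-endian uint32.
encoded = <<chunk_x::little-32>>
# encoded is <<11, 0, 0, 0>>
```

Using a sized segment keeps the result at four bytes even when the high bytes are zero.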

Yup, I’m working on VelocyStream (ArangoDB).

With your insight, it became clear. To encode for an outgoing chunk:

# Pack big-endian, then re-emit the same 32-bit value as little-endian bytes.
<<value::32>> = <<chunk::31, is_first_chunk::1>>
chunk_x = <<value::little-32>>

and decode from an incoming chunk:

# Read the little-endian value, then split its big-endian form into 31 + 1 bits.
<<value::little-32>> = chunk_x
<<chunk::31, is_first_chunk::1>> = <<value::32>>

I really appreciate it, thank you.
