Where did the name "binaries" come from? And how does this relate to Base2?

Whoops – my bad copy and paste.

Isn’t this to_charlist/1? Or is that working on graphemes?
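A small sketch of the difference, assuming a made-up example word whose accent is written as a separate combining character: `String.to_charlist/1` works on code points, while `String.graphemes/1` groups code points into visible characters.

```elixir
# "e" followed by a combining acute accent (U+0301), so one visible
# character is made of two code points. The word is only an illustration.
word = "cafe\u0301"

# to_charlist/1 splits on code points: the accent becomes its own integer.
charlist = String.to_charlist(word)

# codepoints/1 makes the same split, but returns one-code-point strings.
codepoints = String.codepoints(word)

# graphemes/1 keeps the "e" and its accent together as one element.
graphemes = String.graphemes(word)

IO.inspect(charlist, charlists: :as_lists)
IO.inspect(length(codepoints))
IO.inspect(length(graphemes))
```

So `to_charlist/1` returns five integers here, while `graphemes/1` returns four strings.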

I don’t know. I’m kinda weirded out that if I want to have a variable that stores a list of integers, it seems that it will be automagically converted to a readable string if I happen to be storing integers that correspond to code points:

iex> my_list = [-1, 99, 97, 116]
[-1, 99, 97, 116]
iex> my_list = [99, 97, 116]
'cat' # <-- wuuuut?!?

I mean… what if I wanted to store the scores of basketball games or something.

As [99] and 'c' are just different representations of the same value, who cares? Similar to how <<99>> and "c" are just the same…

A list of integers is printed as text in single quotes when every integer is a printable ASCII code point. That is only how the list is presented; it is still a list of integers. Likewise, a binary, which is a sequence of bytes, is printed as a string when it is valid, printable UTF-8.

FAQ: why-is-my-list-of-integers-printed-as-a-string
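To illustrate that only the presentation changes, here is a minimal sketch. The variable names are made up for the example; `List.ascii_printable?/1` is, as far as I know, the check that decides whether a list gets shown as text.

```elixir
scores = [99, 97, 116]
with_negative = [-1, 99, 97, 116]

# Every element is a printable ASCII code point, so inspect shows it as text.
all_printable = List.ascii_printable?(scores)

# -1 is not a code point at all, so this list is always shown as integers.
negative_printable = List.ascii_printable?(with_negative)

# However it is printed, the value is still a list of integers.
still_a_list = is_list(scores)
total = Enum.sum(scores)
same_value = scores == ~c"cat"

IO.inspect({all_printable, negative_printable, still_a_list, total, same_value})
```

The list still sums and compares like any other list of integers.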

who cares?

Many people, myself included, find this conflation of representations and the dismissal of distinctions very, very confusing. There is a reason we don’t code in assembly, for example, or that we don’t plötzlich mitten im Satz deutsch sprechen (suddenly start speaking German mid-sentence).

Yeah, that native Erlang functionality smells to me. I would much prefer to opt-in to that kind of surprising behavior. It’s bad dev-UX IMO.

But it’s a representation that is only visible to you as a developer while debugging. It is a product of the Inspect protocol, whose output is not meant to be end-user facing.

When you prepare output for the end-user you should prepare your output by other means.

Until then it does not make any difference whether you see 'cat' or ~c"cat" or [99, 97, 116] in iex, as all three are the same value.

iex(1)> 'cat' = ~c"cat" = [99, 97, 116]
'cat'
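
If the text form gets in the way, for example with the basketball scores mentioned earlier, `inspect/2` accepts a `charlists: :as_lists` option. A minimal sketch, with made-up variable names:

```elixir
scores = [99, 97, 116]

# Default inspect output: 'cat' on older Elixir versions, ~c"cat" on newer ones.
default_view = inspect(scores)

# Ask Inspect to always show the integers.
list_view = inspect(scores, charlists: :as_lists)

# For end-user output, build the text yourself instead of relying on Inspect.
user_facing = Enum.join(scores, ", ")

IO.puts(default_view)
IO.puts(list_view)
IO.puts(user_facing)
```

The FAQ linked above describes making this the default for an iex session with `IEx.configure(inspect: [charlists: :as_lists])`.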

It’s wrong in some cases but the reverse would be just as wrong in other cases:

iex> 'cat'
[99, 97, 116]

A little bit, yes, but now imagine a situation where a GenServer crashes and instead of text you get a hell of a lot of lists with meaningless numbers in them.

Thank you all for your input – this has been very educational. I have submitted a PR to the https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html page that I hope will better explain the concepts at play here.

Do you have a link to the PR as well?

José already merged it, so it’s live.

I prefer to read the diffs. Otherwise I probably won’t even see what’s changed, as I last went through that page around the 1.0 release…

The diffs are going to be tough on that one because I essentially rewrote the entire page.

the commit is https://github.com/elixir-lang/elixir-lang.github.com/commit/f7c1cce6732ac7b7f2a4ae3e8ff5f8c8c8f91cd2

@fireproofsocks a lot of us are pretty proficient at reading/skimming diffs, aka commits/PRs :wink:

great work on the docs!

Great work, thanks for taking the time to contribute!

In my mind this is similar to binary vs. textual files. A binary file is any sequence of bytes (not bits), whereas a textual file is a special sequence of bytes. Similarly, in Erlang/Elixir, a binary is any sequence of bytes, whereas a string is a special sequence of bytes. I’m quite certain that this terminology was not invented by Erlang. Therefore, I’m not completely convinced by this passage. Perhaps we should summon @rvirding for an explanation :slight_smile:
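
The "any bytes vs. special bytes" distinction can be checked directly. A small sketch with made-up variable names, where the second value is a byte sequence that is not valid UTF-8:

```elixir
text = "cat"
not_text = <<255, 254>>

# Both values are binaries, i.e. sequences of whole bytes.
both_binaries = is_binary(text) and is_binary(not_text)

# Only the first one is valid UTF-8, which is what makes it a string.
text_valid = String.valid?(text)
not_text_valid = String.valid?(not_text)

# Inspect shows the invalid one as raw bytes and the valid one in quotes.
IO.inspect(not_text)
IO.inspect(<<99, 97, 116>>)
```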

The Erlang term for this is bitstring:

A bitstring is a sequence of zero or more bits, where the number of bits does not need to be divisible by 8. If the number of bits is divisible by 8, the bitstring is also a binary.
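
The quoted definition can be checked with the built-in guards. A minimal sketch, with variable names chosen only for illustration:

```elixir
# Three bits: not a whole number of bytes.
three_bits = <<5::size(3)>>

# Two whole bytes, i.e. sixteen bits.
two_bytes = <<1, 2>>

# Every binary is a bitstring, but not every bitstring is a binary.
IO.inspect({is_bitstring(three_bits), is_binary(three_bits), bit_size(three_bits)})
IO.inspect({is_bitstring(two_bytes), is_binary(two_bytes), bit_size(two_bytes)})
IO.inspect(byte_size(two_bytes))
```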

I’d be happy to make corrections if clarifications can be given!

If only we adopted Gallic sensibility and abolished bytes in favor of octets!! Where I work we’re constantly having to divide by 8 to go from Gbps to GB/s.