List decoding

Hey!
What does it mean?

> [31, 32, 33] -- [31, 32]
'!'
> IEx.Info.info '!'
[
  {"Data type", "List"},
  {"Description",
   "This is a list of integers that is printed as a sequence of characters\ndelimited by single quotes because all the integers in it represent valid\nASCII characters. Conventionally, such lists of integers are referred to\nas \"charlists\" (more precisely, a charlist is a list of Unicode codepoints,\nand ASCII is a subset of Unicode).\n"},
  {"Raw representation", "[33]"},
  {"Reference modules", "List"}
]
> '!' == [33]
true

I need a normal list

That is a normal list. IEx just “helpfully” prints it as if it were an Erlang string (charlist).

In Erlang, strings are represented as charlists (character lists), which are lists of integers where each integer is a Unicode codepoint. So the list [33] is interpreted as a charlist with one codepoint, 33, which is the exclamation mark.
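To convince yourself that only the printed form differs, here is a small sketch you could paste into a script or IEx. The variable name result is just for illustration; everything else is standard library.

```elixir
# The subtraction from the question leaves a one-element list.
result = [31, 32, 33] -- [31, 32]

# It compares equal to the plain integer list and behaves like any list.
IO.inspect(result == [33])          # true
IO.inspect(is_list(result))         # true
IO.inspect(hd(result))              # 33

# 33 is the codepoint of "!", which is why the list is shown as a charlist.
IO.inspect(?!)                      # 33
IO.inspect(List.to_string(result))  # "!"
```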

When using IO.inspect (or plain inspect), you can give an option to print it as a list instead:

iex(1)> IO.inspect([33], charlists: :as_lists)
[33]
'!'
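The same option also works with plain inspect, which returns the text instead of printing it. A minimal sketch (the variable names are made up for the example); note that the default rendering depends on your Elixir version, so I only show the output I am sure about:

```elixir
list = [31, 32, 33] -- [31, 32]

# Default rendering: '!' on older Elixir versions, ~c"!" on newer ones.
default = inspect(list)

# Forcing list rendering always gives the integers.
as_list = inspect(list, charlists: :as_lists)

IO.puts(default)
IO.puts(as_list)  # [33]
```

Either way the value itself is untouched; the option only changes how it is displayed.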

IEx’s i helper gives roughly the same information as I did:

iex(2)> i '!'
Term
  '!'
Data type
  List
Description
  This is a list of integers that is printed as a sequence of characters
  delimited by single quotes because all the integers in it represent valid
  ASCII characters. Conventionally, such lists of integers are referred to
  as "charlists" (more precisely, a charlist is a list of Unicode codepoints,
  and ASCII is a subset of Unicode).
Raw representation
  [33]
Reference modules
  List
Implemented protocols
  Collectable, Enumerable, IEx.Info, Inspect, List.Chars, String.Chars

Elixir actually does exactly the same thing with binaries/strings. Strings are just binaries holding UTF-8 encoded code points. When a binary is printed, its contents are checked: if they are valid UTF-8 made up of printable characters, it is printed as a string, otherwise as a binary. In the last example below, the trailing 0 byte is not printable, so the raw bytes are shown.

iex(1)> "abc"
"abc"
iex(2)> <<97,98,99>>
"abc"
iex(3)> <<97,98,99>> == "abc"
true
iex(4)> "™€é"
"™€é"
iex(5)> <<"™€é"::binary,0>>
<<226, 132, 162, 226, 130, 172, 195, 169, 0>>

The reason for this in both Erlang and Elixir is that there is no dedicated string datatype, and no character datatype either for that matter, so they have to be faked with lists of integers or binaries.
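If you want to check the printing rule for binaries yourself, something like the following sketch should do. It assumes, as far as I know, that the check behind the display is the same one String.printable? performs; the variable names are only for illustration.

```elixir
plain = <<97, 98, 99>>
padded = <<"™€é"::binary, 0>>

# Both are binaries; a "string" is not a separate type.
IO.inspect(is_binary(plain))           # true
IO.inspect(plain == "abc")             # true

# The padded one is still valid UTF-8, but the 0 byte is not printable,
# so it gets shown as raw bytes.
IO.inspect(String.valid?(padded))      # true
IO.inspect(String.printable?(padded))  # false
IO.inspect(String.printable?(plain))   # true

# As with charlists, the rendering can be forced either way.
IO.puts(inspect(plain, binaries: :as_binaries))  # <<97, 98, 99>>
```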
