Why does String.to_integer/1 give strange results with some values?

Hi folks, I’ve been running into some very strange behavior that has me knocking my head against the wall.

If I run the following code:

["42", "12", "15"]
|> Enum.map(&(String.to_integer(&1)))

I get what you’d expect: the list [42, 12, 15]

But if I run this code:

["41", "92", "73", "84", "69"] 
|> Enum.map(&(String.to_integer(&1)))

I get this weird string: ~c")\\ITE"

In fact, if I run just this:

["41"] 
|> Enum.map(&(String.to_integer(&1)))

I get ~c")"

What on earth is going on?!

I figured “maybe there’s some Unicode strangeness going on with LiveBook” so I tried it in iex, with the same results.

congrats on reaching this milestone - and completing a major step/rite of passage on your Elixir journey - it’s a charlist. See “Binaries, strings, and charlists” in the Elixir v1.17.0-dev docs.

we have all been there - and gone through the same wtf etc.

Try this to force lists of integers to display as lists of integers (reference: Inspect.Opts — Elixir v1.12.3):

["41", "92", "73", "84", "69"] 
|> Enum.map(&(String.to_integer(&1)))
|> IO.inspect(charlists: :as_lists)

That this is not the default is a little crazy to me.

It is a tribute to the Erlang community :slight_smile:

Yeah, you are basically seeing the runtime trying to be a bit too helpful – if a list of integers looks like a list of printable characters, then the REPL will instead show them as a printable string. The data is still there and is still the same.
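To make "the data is still the same" concrete, here is a minimal sketch using the list from the original question. The variable name `ints` is only for illustration. The charlist literal and the integer list compare equal, and `inspect/2` with `charlists: :as_lists` shows the raw integers:

```elixir
ints = Enum.map(["41", "92", "73", "84", "69"], &String.to_integer/1)

# Still a plain list of integers, so list and integer functions work as usual.
IO.inspect(is_list(ints), label: "is_list?")
IO.inspect(Enum.sum(ints), label: "sum")

# A charlist literal is the same data written differently.
IO.inspect(ints == ~c")\\ITE", label: "equal to charlist?")

# Ask inspect/2 for the raw integers explicitly.
IO.puts(inspect(ints, charlists: :as_lists))
```

Only the printed representation changes between the two forms; nothing is converted.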

If you want to get rid of that behavior in your project, just create an .iex.exs file at the root of your project and put this in there:

IEx.configure(inspect: [charlists: :as_lists])

Restart your iex session and, et voila:

iex(1)> ["41"] |> Enum.map(&String.to_integer/1)
[41]

The problem I run into is that it does not apply to any IO.inspect calls within the code. You have to remember to do IO.inspect(number_list_that_isnt_supposed_to_be_chars, charlists: :as_lists) every time you might be faced with this weirdness.
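One way to avoid retyping the option is a small wrapper. This is only a sketch: `DebugHelpers` and `inspect_ints/2` are hypothetical names, not part of Elixir, and the function simply forwards to `IO.inspect/2` with `charlists: :as_lists` forced on.

```elixir
defmodule DebugHelpers do
  # Hypothetical helper, not part of Elixir: behaves like IO.inspect/2,
  # but integer lists are always printed as lists, never as charlists.
  def inspect_ints(value, opts \\ []) do
    IO.inspect(value, Keyword.put(opts, :charlists, :as_lists))
  end
end

["41", "92", "73", "84", "69"]
|> Enum.map(&String.to_integer/1)
|> DebugHelpers.inspect_ints(label: "parsed")
```

Like `IO.inspect/2`, it returns the value unchanged, so it can sit in the middle of a pipeline. You still have to remember to call it instead of `IO.inspect`, so it reduces the typing rather than removing the problem.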

I think it’s time to make charlists: :as_lists the default option, and deprecate the charlists: :infer option.

I think that would just surface how often we actually run into charlists when interacting with Erlang libraries. E.g. hardly any Elixir app doesn’t use gen_tcp somewhere.

From my personal experience I’ve been hitting the case of needing to see the underlying list of integers mostly in tests or puzzles, where there is time and opportunity to deal with the printed formatting. Not optimal, but fine.

Interacting with actual charlists on the other hand comes up when you’re debugging some production issue and trying to dig deeper into some failure, where you don’t want to first set up useful formatting rules just to understand what information the system is trying to provide.

Hehe
I was right here

At least now there is the chance that someone can look up the documentation for the ~c sigil. Just because it didn’t happen in this case it does not mean it’s not an improvement.

holy shhhh how did I miss that! Would that work with assertion errors in ExUnit? I cannot try right now but that would be great. Especially in December.

That makes a ton of sense, but does not convince me that the current default is optimal. Feels like optimizing for the people who would know how to get the result they want while making it confusing for the people who have no clue. From my perspective as someone not doing enterprise development I’d rather optimize for reducing the learning curve and then have a configuration option for the more advanced users/developers.

You can run into this as soon as you deploy your first phoenix app and have it slightly misconfigured - especially if ssl is somehow involved. This is not some niche issue.

I didn’t mean to imply it was niche, but even your example is a few steps beyond the beginner level. Someone configuring a phoenix app should probably be comfortable with other configuration settings where a neophyte should not be expected to know this unique quirk of the language out of the box.

They’re not expected to know that. There’s a reason charlists default to a sigil in recent Elixir versions - to signal that charlists are something to inform yourself about. Similar to how you’d look up what ~w does for example.

My argument is not to pretend this is not a problem, but it is about which default is more problematic. In my experience I’ve run into charlists far more often in errors surfaced from Erlang than I’ve run into a list which happens to hold smallish integer values but wasn’t text. Erlang errors can already be cryptic enough; printing them as lists of numbers would make them far worse. And it’s not just experienced people running into Erlang errors.

I understand. We have had different experiences, which is why we seem to be valuing the default setting differently. The regularity with which newcomers ask the question, along with the apparent fact that this is a quirk unique to Erlang among all programming languages everywhere AFAICT, makes it feel like an unnecessary stumbling block to newcomers. You disagree on the “unnecessary” part of that statement. That’s fine. But nothing you have said negates my experience.

Yeah. I’ve seen my fair share of all of those questions. I however don’t think they’ll go away by changing the default. They’d just change to “why the hell do I get those garbage error messages”.

IEx tools are more geared toward ease of development than learning, which makes sense to me. It’s happened at least once where someone’s confusion was caused because they didn’t realize they weren’t getting the whole struct when using IO.inspect or dbg. Those tools, being the ones you first discover, are optimized for debugging over learning which is what I think the charlist default is probably going for as well (based on @LostKobrakai’s thoughts).

I’m indifferent to the default changing (not that it’s even on the table) since I never use charlists and I don’t mind the same question being asked over and over. It’s not a bad intro to the forum and we’d otherwise be missing out on amusing answers like @outlog’s :slightly_smiling_face:

Check this out:

Interactive Elixir (1.14.4) - press Ctrl+C to exit (type h() ENTER for help)

iex(1)> [97,98,99]
'abc'

iex(2)> [97,98,99,0]
[97, 98, 99, 0]

iex(3)> 'abc' === [97,98,99]
true

iex(4)> Enum.map('abc', fn n -> n*2 end)
[194, 196, 198]
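The session above follows from a printability check. As far as I know, the default `charlists: :infer` behavior relies on something along the lines of `List.ascii_printable?/1`, so treat that link as an assumption rather than a statement about the internals. A quick sketch with the same three lists:

```elixir
# All values are printable ASCII, so the list qualifies for charlist display.
IO.inspect(List.ascii_printable?([97, 98, 99]), label: "[97, 98, 99]")

# 0 is not a printable character, so this list is shown as integers.
IO.inspect(List.ascii_printable?([97, 98, 99, 0]), label: "[97, 98, 99, 0]")

# Doubling pushes every value above the ASCII range.
IO.inspect(List.ascii_printable?([194, 196, 198]), label: "[194, 196, 198]")
```

One non-printable element is enough to make the whole list print as integers, which is why appending 0 changes the output in the iex session.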