They are not corrupted. Calling to_string on a char_list returns the string that the char_list represents, so that is what those values are. Because strings and char_lists can be converted into each other, it is impossible to know what a value originally was before the to_string call. to_string is inherently a lossy conversion: it discards the original type information and turns everything into a plain ‘binary’.
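As a small illustration of why the conversion is lossy, the sketch below feeds a list of integers and a string through to_string and compares the results. The variable names are only for this example.

```elixir
# Two different inputs: a charlist (a list of integers) and a string (a binary).
charlist = [65, 75, 65]
string = "AKA"

from_charlist = to_string(charlist)
from_string = to_string(string)

# Both results are the same binary, so the original type cannot be recovered
# from the converted value alone.
IO.inspect(from_charlist == from_string)
IO.inspect(is_binary(from_charlist) and is_binary(from_string))
```

Once both values are stored as binaries, nothing in the stored data says which one started out as a list.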
If your normal strings are constrained to, say, the 32-127 range, then you can test whether any of the character values fall outside that range: if any do, treat the value as a char_list, otherwise leave it as a binary.
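A rough sketch of that range check could look like the following. StoredValue and guess/1 are hypothetical names, not part of any library, and the sketch assumes that the stored value is valid UTF-8 and that genuine strings never contain characters outside 32..127. It is only a heuristic: a list such as [65, 75, 65] still looks like a normal string afterwards.

```elixir
defmodule StoredValue do
  # Hypothetical helper for illustration only.
  # Assumption: real strings only contain codepoints in 32..127,
  # the example range used above.
  def guess(value) when is_binary(value) do
    # Raises if the binary is not valid UTF-8; this sketch assumes it is.
    codepoints = String.to_charlist(value)

    if Enum.all?(codepoints, &(&1 in 32..127)) do
      {:string, value}
    else
      # At least one value is outside the range, so this was probably
      # a list of integers before to_string was applied.
      {:integer_list, codepoints}
    end
  end
end

IO.inspect(StoredValue.guess("AKA"))
IO.inspect(StoredValue.guess(to_string([1, 2, 200])), charlists: :as_lists)
```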
Remember, a list whose elements are all integers is a char_list: it can be treated as a string in many places, and it is treated as one when it is converted to a binary. If someone wants to store arbitrary data in a string/binary field, they should encode it somehow, for example with :erlang.term_to_binary/1 (decoded again with :erlang.binary_to_term/2), or with Jason for JSON encoding. Just converting anything to a string makes it inaccurately reversible at best, and irreversible for the great majority of data types.
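For example, a round trip through the Erlang external term format keeps the type intact. This is only a sketch: the Base64 step is an assumption on my part, for the case where the column only accepts text, and is not something the encoding itself requires.

```elixir
# A value whose exact type we want to keep: a list of integers.
original = [65, 75, 65]

# Encode to the external term format, then Base64 so that the result
# is plain text (assumption: the database field stores text).
encoded = original |> :erlang.term_to_binary() |> Base.encode64()

# Decode again. For untrusted input, :erlang.binary_to_term(binary, [:safe])
# is the more careful variant.
decoded = encoded |> Base.decode64!() |> :erlang.binary_to_term()

IO.inspect(decoded == original)
IO.inspect(is_list(decoded))

# A string stays a string after the same round trip.
string_roundtrip = "AKA" |> :erlang.term_to_binary() |> :erlang.binary_to_term()
IO.inspect(is_binary(string_roundtrip))
```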
"AKA" is string but list after
to_charlist is applied, but I don't want to convert this to an array as it's not supposed to be converted. I only want to filter and update the values in the database that were corrupted, e.g.
They aren’t being converted to an array but to a character list, which is just a list made up exclusively of integers. You can see more information in the REPL:
iex(1)> i 'AKA'
Description
  This is a list of integers that is printed as a sequence of characters
  delimited by single quotes because all the integers in it represent valid
  ASCII characters. Conventionally, such lists of integers are referred to
  as "charlists" (more precisely, a charlist is a list of Unicode codepoints,
  and ASCII is a subset of Unicode).
Raw representation
  [65, 75, 65]
Implemented protocols
  IEx.Info, List.Chars, Inspect, Collectable, String.Chars, Enumerable
Note the raw representation, [65, 75, 65]: a charlist is a list of integers, just as a binary string is a sequence of 8-bit integers (bytes):
iex(3)> i "AKA"
Description
  This is a string: a UTF-8 encoded binary. It's printed surrounded by
  "double quotes" because all UTF-8 encoded codepoints in it are printable.
Raw representation
  <<65, 75, 65>>
Implemented protocols
  IEx.Info, List.Chars, Inspect, Collectable, String.Chars
Note the raw representation here as well, <<65, 75, 65>>. The numbers are the same; one value is encoded as a list, the other as a binary.
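The same comparison can be made directly in code. The sketch below also adds one non-ASCII character, which is my own addition, to show that the numbers only coincide for ASCII: a charlist holds Unicode codepoints, while a binary holds UTF-8 bytes.

```elixir
string = "AKA"
charlist = String.to_charlist(string)

# For ASCII text the codepoints and the bytes are the same numbers.
IO.inspect(charlist == [65, 75, 65])
IO.inspect(string == <<65, 75, 65>>)
IO.inspect(:binary.bin_to_list(string) == charlist)

# Outside ASCII they differ: one codepoint, but two UTF-8 bytes.
accented = "\u00E9"
IO.inspect(String.to_charlist(accented), charlists: :as_lists)
IO.inspect(:binary.bin_to_list(accented), charlists: :as_lists)
```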