This is not properly encoded UTF-8. If you know the input encoding, you can use a conversion library to convert from that encoding to UTF-8.
edit
Also, please remember that anything >= 128 is not ASCII; ASCII has only 7 bits. Some 8-bit encodings are sometimes referred to as 8-bit ASCII or extended ASCII, but that's not their “true” name.
This depends entirely on the encoding. To print in IEx, the string has to be UTF-8 encoded, which may mean converting from Latin-1 or whatever the source encoding is.
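As a rough illustration of such a conversion, here is a minimal sketch using Erlang's `:unicode` module, which ships with OTP. It assumes the input is Latin-1 (ISO-8859-1); that is only an assumption for the example, since nothing in this thread says what the real source encoding is. The sample bytes are made up.

```elixir
# Assumption: these bytes are Latin-1. In that encoding, 252 is "ü" and 223 is "ß".
latin1 = <<71, 114, 252, 223, 101>>

# As-is, this is not valid UTF-8, so IEx would show it as a raw binary.
false = String.valid?(latin1)

# Re-encode from Latin-1 to UTF-8.
utf8 = :unicode.characters_to_binary(latin1, :latin1, :utf8)

true = String.valid?(utf8)
IO.puts(utf8)
# prints "Grüße"
```

If the source is actually some other encoding, the call still succeeds but gives the wrong characters, so this only helps once you know what you are converting from.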
So how do you deal with an UNKNOWN encoding up front?
I’m reading from JBASE and I have no idea what encoding they use. Displaying in IEx is not really a requirement, but it's just a safety check; I’m basically going from JBASE to XML.
The thing is, I’m losing characters if I filter like so:
# Keeps printable 7-bit ASCII plus bytes 252..254; every other byte is dropped.
def ascii(v) do
  if String.printable?(v) do
    v
  else
    for <<c <- v>>, c in 32..126 || c in [252, 253, 254], into: "", do: <<c>>
  end
end
You don’t. You can try guessing of course, but ultimately if you don’t know the encoding then you have no idea what characters are represented by values 128-255.
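If you do decide to guess, one possible approach (a sketch only, not a real detection scheme) is to keep the input when it is already valid UTF-8 and otherwise treat it as Latin-1. The module and function names below are hypothetical, and the Latin-1 fallback is purely an assumption; it never fails, which also means it will never tell you when the guess is wrong.

```elixir
defmodule EncodingGuess do
  # Hypothetical helper, not from the thread. Returns a UTF-8 string tagged
  # with how it was obtained, so the caller can see when a guess was made.
  def to_utf8(bin) when is_binary(bin) do
    if String.valid?(bin) do
      # Already valid UTF-8 (this includes plain 7-bit ASCII).
      {:utf8, bin}
    else
      # Guess: treat every byte as a Latin-1 code point. This always succeeds,
      # but the result is only right if the source really was Latin-1.
      {:assumed_latin1, :unicode.characters_to_binary(bin, :latin1, :utf8)}
    end
  end
end
```

For example, `EncodingGuess.to_utf8(<<97, 254, 98>>)` returns `{:assumed_latin1, "aþb"}`. Unlike the byte filter, nothing is dropped, but note that bytes 252 to 254 come out as ü, ý and þ under this guess, which may or may not be what they mean in your data.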