IO problems - (IO.StreamError) error during streaming: unknown POSIX error

Hello

I am trying to work with stdio and am seeing strange (to me) behavior when reading bytes. To reproduce it, I wrote a simple one-line bash script:

elixir -e ':binary.list_to_bin( for n <- 0..255, into: [], do: n) |> IO.binwrite()' | elixir -e 'IO.binstream(:stdio, 1) |> Enum.each(&(&1))'

and the error:

** (IO.StreamError) error during streaming: unknown POSIX error
    (elixir 1.14.5) lib/io.ex:752: IO.each_binstream/2
    (elixir 1.14.5) lib/stream.ex:1625: Stream.do_resource/5
    (elixir 1.14.5) lib/enum.ex:4307: Enum.each/2
    nofile:1: (file)
    (stdlib 4.3) erl_eval.erl:748: :erl_eval.do_apply/7

It actually works without Enum.each, but I need the data.
Did I miss something, or is there a better way to read from stdio?

Elixir 1.14.5-otp-25
Erlang 25.3
OS Ubuntu 22.4

There are two problems with this code:

  • the code that writes doesn’t actually produce the bytes you expect

  • the code that reads is trying to read binary data from a device set to UTF8 encoding


On the first issue, the documentation for IO.binwrite says:

Important: do not use this function on IO devices in Unicode mode as it will write the wrong data. In particular, the standard IO device is set to Unicode by default, so writing to stdio with this function will likely result in the wrong data being sent down the wire.

The first half of that pipeline actually prints 384 bytes, because each byte in the upper half of the range (128..255) is translated to the two-byte UTF-8 encoding of the corresponding Unicode character (so 128 becomes <<194, 128>>).

Calling :io.setopts(:standard_io, [encoding: :latin1]) before binwrite avoids this behavior and writes exactly the 256 bytes expected.
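The 384-byte figure can be checked without touching stdio at all. The sketch below assumes that a Unicode-mode device treats each incoming byte as a Latin-1 code point and re-encodes it as UTF-8; it models that step with :unicode.characters_to_binary/3. The module name EncodingDemo and its functions are made up for illustration.

```elixir
defmodule EncodingDemo do
  # The 256 raw bytes the writer intended to send: <<0, 1, ..., 255>>.
  def raw_bytes, do: :binary.list_to_bin(Enum.to_list(0..255))

  # Assumed model of the Unicode-mode device: read each byte as a
  # Latin-1 code point and re-encode it as UTF-8.
  def as_utf8(bytes), do: :unicode.characters_to_binary(bytes, :latin1, :utf8)
end

raw = EncodingDemo.raw_bytes()
sent = EncodingDemo.as_utf8(raw)

# Bytes 0..127 stay one byte each; bytes 128..255 become two bytes each.
IO.inspect(byte_size(raw), label: "intended")
# intended: 256
IO.inspect(byte_size(sent), label: "actually written")
# actually written: 384

# Show the re-encoding of a single high byte as raw bytes.
IO.inspect(EncodingDemo.as_utf8(<<128>>), label: "byte 128 becomes", binaries: :as_binaries)
```

The arithmetic is 128 * 1 + 128 * 2 = 384, which matches the length observed on the wire.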


The second issue is that IO.binread also behaves unexpectedly with a Unicode device. Reading a single byte from it will fail if bytes farther along in the stream are invalid Unicode. For example:

echo $'\x01\x02' | elixir -e ':file.read(:standard_io, 1) |> IO.inspect; :file.read(:standard_io, 1) |> IO.inspect'

# prints
{:ok, <<1>>}
{:ok, <<2>>}


# but this doesn't work even though the first two bytes are both OK
echo $'\x01\x02\xFF' | elixir -e ':file.read(:standard_io, 1) |> IO.inspect; :file.read(:standard_io, 1) |> IO.inspect'

# prints
{:error, :collect_chars}
:eof

This {:error, :collect_chars} is swallowed by higher levels of the IO machinery and produces the “unknown POSIX error” explanation you’re seeing.

As in the first case, setting the encoding correctly produces the correct result:

echo $'\x01\x02\xFF' | elixir -e ':io.setopts(:standard_io, [encoding: :latin1]); :file.read(:standard_io, 1) |> IO.inspect; :file.read(:standard_io, 1) |> IO.inspect'

# prints
{:ok, <<1>>}
{:ok, <<2>>}

(note: this is the same collect_chars error shape as in this ticket that @josevalim just filed, but I don’t believe it’s directly related)


Putting both together does what you want:

elixir -e ':io.setopts(:standard_io, [encoding: :latin1]); :binary.list_to_bin( for n <- 0..255, into: [], do: n) |> IO.binwrite()' | elixir -e ':io.setopts(:standard_io, [encoding: :latin1]); IO.binstream(:stdio, 1) |> Enum.each(&IO.inspect/1)'

# prints
<<0>>
...
<<255>>

It was a silly mistake on my part: I thought that working with bytes would let me ignore the IO device's encoding. Thank you for such a detailed explanation. It was very helpful and fast.