Apparent regression reading binary data from stdio in Erlang 26

Upon upgrading to Erlang 26, I noticed that the protoc-gen-elixir plugin to the protobuf compiler no longer works. The plugin starts off by reading :stdio as a binary stream, and decoding the contents:

    # See https://groups.google.com/forum/#!topic/elixir-lang-talk/T5enez_BBTI.
    :io.setopts(:standard_io, encoding: :latin1)

    # Read the standard input that protoc feeds us.
    bin = binread_all!(:stdio)

    request = Protobuf.Decoder.decode(bin, Google.Protobuf.Compiler.CodeGeneratorRequest)

The binread_all!() function calls IO.binread(), and this succeeds sometimes, fails with {:error, :collect_chars} at other times, and reads a truncated file in yet other situations, depending on the exact input args to protoc. This all works with Erlang 25 / Elixir 1.14, and with the Python protoc plugin.
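For anyone following along, here is a rough sketch of what a helper like binread_all!() might look like. This is not the plugin's actual source: the module name and the decision to treat :eof as an empty binary are my assumptions. It only shows where a return value such as {:error, :collect_chars} from IO.binread/2 would surface.

```elixir
# Hypothetical sketch, NOT the plugin's real code.
defmodule StdinSketch do
  # Reads everything from `device` as raw bytes, raising on any error tuple.
  def binread_all!(device) do
    case IO.binread(device, :eof) do
      # Normal case: the whole input came back as a binary.
      data when is_binary(data) -> data
      # Empty input: IO.binread/2 returns :eof (assumption: treat it as "").
      :eof -> ""
      # On the affected setup this is where {:error, :collect_chars} would land.
      {:error, reason} -> raise "could not read #{inspect(device)}: #{inspect(reason)}"
    end
  end
end
```

The truncated-read case is harder to catch, since IO.binread/2 still returns a binary there and nothing in this helper can tell that bytes are missing.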

Minimal Test Case

With Elixir 1.14, Erlang 25:

    $ echo -n $'\xDF\xBE' | elixir -e ':io.setopts(:standard_io, [{:encoding, :latin1}]); IO.binread(:standard_io, :eof) |> IO.inspect()'
    "\x{7FE}"

With Elixir 1.15 / Erlang 26.0.2:

    $ echo -n $'\xDF\xBE' | elixir -e ':io.setopts(:standard_io, [{:encoding, :latin1}]); IO.binread(:standard_io, :eof) |> IO.inspect()'
    {:error, :collect_chars}

I see similar results using :file.read() instead of IO.binread(). Any idea if this is a known issue?

This issue sounds related:

Okay, so someone did implement José’s proposed fix of adding a kernel argument to set the stdin encoding: erlang/otp Pull Request #7384 by garazdawi, "Fix problems when reading from stdin and "-noshell" was passed to erl (or used through an escript)".

But how would I specify kernel arguments in an escript?

So far I’ve been able to use this as a workaround:

    export ERL_AFLAGS='-kernel standard_io_encoding latin1'

It’s a little awkward since it has to be specified at runtime, not escript build time. I think I can do the same thing with emu_args, but haven’t tried it yet.
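In case it helps, this is roughly what the emu_args variant might look like in mix.exs. It is an untested sketch: the project, app, and module names (MyPlugin, :my_plugin, MyPlugin.CLI) are placeholders, and I'm assuming the kernel flag is honoured when passed this way, just as it is through ERL_AFLAGS.

```elixir
# mix.exs (untested sketch; MyPlugin, :my_plugin and MyPlugin.CLI are placeholder names)
defmodule MyPlugin.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_plugin,
      version: "0.1.0",
      escript: escript()
    ]
  end

  defp escript do
    [
      main_module: MyPlugin.CLI,
      # Assumption: embeds the same flag used in ERL_AFLAGS above into the
      # escript, so it is fixed at build time rather than set at runtime.
      emu_args: "-kernel standard_io_encoding latin1"
    ]
  end
end
```

If that works, the escript would carry the setting itself and callers would not need to export anything before running protoc.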