IO.puts randomly throwing errors

poops · May 2, 2018, 2:48pm

I have a cron that uses IO.puts to ouput random debug strings. It has never thrown an error when I manually run it, but randomly throws errors when run un-attended (to be clear, not EVERY time)

From HoneyBadger:

Erlang error: :terminated
unknown ? put_chars(:standard_io, :unicode, ["Tracking 42 orders...", 10])

The elixir code that’s causing this is

prefix = if real?, do: "", else: "[DRY RUN] "
IO.puts("#{prefix}Tracking #{Enum.count(orders)} orders...")

orders is a list returned from Repo.all.

Output of elixir -v:

Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:10] [hipe] [kernel-poll:false]

Elixir 1.6.1 (compiled with OTP 19)

Any idea what’s going on here? Thanks for any help!

idi527 · May 2, 2018, 5:35pm

Seems like there is no put_chars/3, only put_chars/1 and put_chars/2. Or maybe there is but it’s not public. Anyway, when called with a debugger, I only see put_chars/2.

iex(22)> :dbg.tracer()
{:error, :already_started}
iex(23)> :dbg.p(:all, :c)
{:ok, [{:matched, :nonode@nohost, 171}]}
iex(24)> :dbg.tpl(:io, :put_chars, :x)
{:ok, [{:matched, :nonode@nohost, 2}, {:saved, :x}]}

iex(25)> IO.puts("#{prefix}Tracking 42 orders...")
[DRY RUN] Tracking 42 orders...
(<0.335.0>) call io:put_chars(standard_io,[<<"[DRY RUN] Tracking 42 orders...">>,10])
(<0.335.0>) returned from io:put_chars/2 -> ok

otp/lib/stdlib/src/io.erl at master · erlang/otp · GitHub ← the source code for the :io module.

NobbZ · May 2, 2018, 5:41pm

Looks like a race condition. The process associated with the atom :standard_io is gone when that particular piece of text is written.

poops · May 2, 2018, 10:28pm

You’re right, there is no put_chars/3 from what I can tell, and when I run w/ the debugger I get the same call to put_chars/2 as you do. But if I search for :io.put_chars(:standard_io, :unicode, I end up seeing results for other code snippets that end up calling it.

I’m wondering if it’s better to call :io.put_chars(:standard_io, ["#{prefix}Tracking #{Enum.count(orders)} orders...", 10]) instead of using IO.puts

Also, one other thing that I forgot to mention is that I’m running this through an RPC call, on a release built with distillery.

OvermindDL1 · May 2, 2018, 10:40pm

Hmm an rpc call forwards the stdin/stdout, however it returns when the call returns, so if it can return before the function actually finishes work (messages for example, and yes putting to IO does involve sending a message) then you get a race condition, so I can see why it would do that assuming I don’t have something grossly wrong here. ^.^;

rvirding · May 3, 2018, 8:36pm

A simple way of check which exports a module, both erlang and elixir, has is to use the :c module. :c.m/1 lists info about a specific module while :c.m/0 lists all the modules currently loaded and from which file they were loaded. For example

iex(1)> :c.m(:io)
Module: io
MD5: feeaeb59c215e500415547b75e637c59
Compiled: No compile time info available
Object file: /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/io.beam
Compiler options:  [debug_info,
                    {i,"/Users/vagrant/package-buildscripts/scripts/esl-erlang/mac/otp/lib/stdlib/src/../include"},
                    {i,"/Users/vagrant/package-buildscripts/scripts/esl-erlang/mac/otp/lib/stdlib/src/../../kernel/include"}]
Exports: 
columns/0                     parse_erl_form/2
columns/1                     parse_erl_form/3
format/1                      parse_erl_form/4
format/2                      printable_range/0
format/3                      put_chars/2
fread/2                       put_chars/1
fread/3                       read/1
fwrite/1                      read/2
...
module_info/0                 scan_erl_exprs/3
module_info/1                 scan_erl_exprs/4
nl/0                          scan_erl_form/1
nl/1                          scan_erl_form/2
parse_erl_exprs/1             scan_erl_form/3
parse_erl_exprs/2             scan_erl_form/4
parse_erl_exprs/4             setopts/1
parse_erl_exprs/3             setopts/2
parse_erl_form/1              write/1
                              write/2
:ok
iex(2)> :c.m(IO)
Module: 'Elixir.IO'
MD5: dc4dec8b90d6a33704f2de1e0ca9de41
Compiled: No compile time info available
Object file: /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.IO.beam
Compiler options:  []
Exports: 
'__info__'/1                  inspect/2
binread/1                     inspect/3
binread/2                     iodata_length/1
binstream/2                   iodata_to_binary/1
binwrite/1                    module_info/0
binwrite/2                    module_info/1
chardata_to_string/1          puts/1
each_binstream/2              puts/2
each_stream/2                 read/1
getn/1                        read/2
getn/2                        stream/2
getn/3                        warn/2
gets/1                        warn/1
gets/2                        write/1
inspect/1                     write/2
:ok
iex(3)> :c.m()
Module                File
Elixir.Access         /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.Access.beam
Elixir.Agent          /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.Agent.beam
Elixir.Agent.Server   /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.Agent.Server.beam
Elixir.Application    /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.Application.beam
Elixir.Code           /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.Code.beam
Elixir.Enum           /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.Enum.beam
Elixir.File           /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.File.beam
Elixir.GenServer      /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.GenServer.beam
Elixir.IEx            /usr/local/Cellar/elixir/1.5.2/bin/../lib/iex/ebin/Elixir.IEx.beam
...
Elixir.Supervisor     /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.Supervisor.beam
Elixir.Supervisor.De  /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.Supervisor.Default.beam
Elixir.System         /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/Elixir.System.beam
application           /usr/local/lib/erlang/lib/kernel-5.4/ebin/application.beam 
application_controll  /usr/local/lib/erlang/lib/kernel-5.4/ebin/application_controller.beam
application_master    /usr/local/lib/erlang/lib/kernel-5.4/ebin/application_master.beam
binary                /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/binary.beam
c                     /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/c.beam
code                  /usr/local/lib/erlang/lib/kernel-5.4/ebin/code.beam
code_server           /usr/local/lib/erlang/lib/kernel-5.4/ebin/code_server.beam 
edlin                 /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/edlin.beam
elixir                /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/elixir.beam
elixir_aliases        /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/elixir_aliases.beam
elixir_code_server    /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/elixir_code_server.beam
elixir_config         /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/elixir_config.beam
elixir_dispatch       /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/elixir_dispatch.beam
elixir_env            /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/elixir_env.beam
elixir_erl            /usr/local/Cellar/elixir/1.5.2/bin/../lib/elixir/ebin/elixir_erl.beam
...
epp                   /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/epp.beam
erl_anno              /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/erl_anno.beam
erl_distribution      /usr/local/lib/erlang/lib/kernel-5.4/ebin/erl_distribution.beam
erl_eval              /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/erl_eval.beam
erl_internal          /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/erl_internal.beam
erl_lint              /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/erl_lint.beam
erl_parse             /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/erl_parse.beam 
erl_prim_loader       preloaded
erl_scan              /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/erl_scan.beam
erl_signal_handler    /usr/local/lib/erlang/lib/kernel-5.4/ebin/erl_signal_handler.beam
erl_tracer            preloaded
erlang                preloaded
error_handler         /usr/local/lib/erlang/lib/kernel-5.4/ebin/error_handler.beam
error_logger          /usr/local/lib/erlang/lib/kernel-5.4/ebin/error_logger.beam
error_logger_tty_h    /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/error_logger_tty_h.beam
erts_code_purger      preloaded
erts_dirty_process_c  preloaded
erts_internal         preloaded
erts_literal_area_co  preloaded
ets                   /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/ets.beam
eval_bits             /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/eval_bits.beam 
file                  /usr/local/lib/erlang/lib/kernel-5.4/ebin/file.beam
file_io_server        /usr/local/lib/erlang/lib/kernel-5.4/ebin/file_io_server.beam
file_server           /usr/local/lib/erlang/lib/kernel-5.4/ebin/file_server.beam 
filename              /usr/local/lib/erlang/lib/stdlib-3.4.2/ebin/filename.beam
...
:ok

There a a lot of modules loaded when elixir is running. The :c module contains all the Erlang shell built-in commands. Some more I can recommend are :c.i/0, :c.regs/0 and :c.help/0.

poops · May 3, 2018, 10:34pm

Well that IO.puts call happens right before calling a third-party API 42 times (1 for each order). So I think it’s unlikely that the script is finishing before IO.puts is run. Unless I’m misunderstanding what you’re saying.

Doesn’t it seem stranger that it’s calling an erlang function that doesn’t exist?

poops · May 3, 2018, 10:41pm

Thanks, this is super helpful! Although I still can’t figure out why IO.puts/1 is calling put_chars/3 which doesn’t seem to exist

OvermindDL1 · May 3, 2018, 10:59pm

Message ordering is only ensured between any 2 processes, once you start talking to more those others can interleave or halt or all kinds of other things.

I’m actually thinking it is apply-calling something based on a now-missing process, that would explain this missing function call.

NobbZ · May 4, 2018, 8:05am

put_chars/1/2 are in the end just wrappers around o_request/3, which in case of an error manipulates the stacktrace and injects a function call that never has happened.

Take a look at especiall line 66, where it drops the appearance of o_request/3 in the current stacktrace and then a line later injects something else.

github.com

erlang/otp/blob/f1e687312daa91591a2cee5038a2d9434e7db209/lib/stdlib/src/io.erl#L62-L70


      
          o_request(Io, Request, Func) ->
              case request(Io, Request) of
          	{error, Reason} ->
          	    [_Name | Args] = tuple_to_list(to_tuple(Request)),
          	    {'EXIT',{get_stacktrace,[_Current|Mfas]}} = (catch erlang:error(get_stacktrace)),
          	    erlang:raise(error, conv_reason(Func, Reason), [{io, Func, [Io | Args]}|Mfas]);
          	Other ->
          	    Other
              end.

poops · May 4, 2018, 3:08pm

I see, thanks for clearing that up! So in this case, I don’t want IO.puts/1 to raise if :standard_io is gone. I’m fine with it being ignored and moving on. Does this seem reasonable to put in a custom IO module?

  def puts(device \\ :standard_io, str) do
    :io.request(device, {:put_chars, :unicode, [str, ?\n]})
  end

NobbZ · May 4, 2018, 3:38pm

Well, io:request/2 is private, so that won’t work…

Anyway, it would just circumvent the mangling of the stack trace. But the registry would still complain about not beeing able to reach the process behind :standard_io.

Perhaps you can catch that one and surpress the error message that way.

I do think though, it were much cleaner not to try to read and write to stdin/stdout in an environment that has none…

OvermindDL1 · May 4, 2018, 3:55pm

It’s exceedingly hacky, but try adding a Process.sleep(4000) or something as the very last call in that rpc command, might delay it enough?

poops · May 4, 2018, 7:53pm

I see io:request/2 exported in that link you sent otp/lib/stdlib/src/io.erl at f1e687312daa91591a2cee5038a2d9434e7db209 · erlang/otp · GitHub. I was actually just trying to circumvent having io:o_request/3 raise, although I guess circumventing the stack trace mangling is a bonus?

I don’t disagree, but I was thinking it’s easier to just let it fail rather than checking if that process is still alive before writing to it.

NobbZ · May 5, 2018, 7:55am

Yes, exported, but still private. It is not documented in the io-manual, so I do consider it as not existing.

Trying to use :io.request/2 is even more hacky than the proposed Process.sleep/1, as it relies on an implementation detail and might break with any random update of Erlang.

poops · May 5, 2018, 9:51am

Ah I see, thanks for the insight. I think I’ll just wrap IO.puts/2 in a try/rescue. I have a number of crons this is happening with, so I’d rather just ignore the error than prolonging their run time with Process.sleep/1

Thanks to everyone for the help here! I’ve learned some new tricks from all of you!