Sorry if my question is silly; I am relatively new to Elixir.
I made a small module to handle the execution of external programs. This is my code:
defmodule Abn.Shell do
  @default_timeout 2000
  @default_retry 2
  @default_error_log_file "/tmp/abn_errors/"

  use Abn.Lib
  alias Abn.Log

  ###########################################################################3
  ## Module API

  def run(command, timeout \\ @default_timeout, retry \\ @default_retry) do
    random_filename = if (@log_errors), do: :erlang.unique_integer([:positive]), else: ""

    script = """
    #!/usr/bin/env setsid /bin/bash
    #{command}
    """

    try do
      port = Port.open({:spawn, script}, [:binary])
      monitor = Port.monitor(port)
      {:os_pid, ospid} = Port.info(port, :os_pid)
      output = get_output({port, monitor}, timeout, ospid)
      kill(ospid)
      Port.demonitor(monitor, [:flush])
      send(port, :close) # just in case...

      if (output == :timeout and retry > 1) do
        run(command, timeout, retry - 1)
      else
        output
      end
    rescue
      e ->
        Log.log(:error, "[SHELL]: #{inspect e}")
        :error
    end
  end

  ###########################################################################3
  ## Private Tools

  defp get_output({ port, monitor }, timeout, ospid, output \\ "") do
    receive do
      {:DOWN, ^monitor, :port, ^port, _} ->
        output

      {^port, {:data, data}} ->
        get_output({ port, monitor }, timeout, ospid, output <> data)

      msg ->
        Log.log(:warning, "[SHELL]: Port #{inspect port}. Breaking loop, unknown 'get_output' message (#{inspect msg})")
        output
    after timeout ->
      Log.log(:warning, "[SHELL]: Process exceed timeout, killing process...")
      :timeout
    end
  end

  defp kill(ospid) do
    System.shell("kill -9 #{ospid}")
  end
end
This module allows me to run external processes with a timeout (and n retries) for those cases in which the process takes too long.
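For readers who want to try the timeout-and-retry idea without the `Abn.Lib` and `Abn.Log` dependencies, here is a minimal standalone sketch. It is not the module above: `ShellSketch` is a hypothetical name, and it swaps in `{:spawn_executable, "/bin/sh"}` with the `:exit_status` option instead of `{:spawn, script}` plus a port monitor. As in the original, the timeout is an idle timeout that restarts whenever the port delivers data.

```elixir
defmodule ShellSketch do
  # Hypothetical module name; an illustration only, not part of Abn.
  @default_timeout 2_000
  @default_retry 2

  # Runs `command` through /bin/sh and returns {:ok, output, exit_status},
  # or :timeout once every attempt has gone `timeout` ms without a message.
  def run(command, timeout \\ @default_timeout, retry \\ @default_retry) do
    port =
      Port.open(
        {:spawn_executable, "/bin/sh"},
        [:binary, :exit_status, args: ["-c", command]]
      )

    case collect(port, timeout, "") do
      :timeout when retry > 1 -> run(command, timeout, retry - 1)
      result -> result
    end
  end

  # Accumulates port data until the exit status arrives or the port goes quiet.
  defp collect(port, timeout, output) do
    receive do
      {^port, {:data, data}} ->
        collect(port, timeout, output <> data)

      {^port, {:exit_status, status}} ->
        {:ok, output, status}
    after
      timeout ->
        stop(port)
        :timeout
    end
  end

  # Closing a port does not kill the OS process, so signal it explicitly.
  defp stop(port) do
    # Port.info/2 returns nil when the port has already closed.
    case Port.info(port, :os_pid) do
      {:os_pid, os_pid} ->
        try do
          Port.close(port)
        rescue
          # The port closed between the info call and the close call.
          ArgumentError -> :ok
        end

        System.shell("kill -KILL #{os_pid}", stderr_to_stdout: true)

      nil ->
        :ok
    end

    flush(port)
  end

  # Drops messages from the closed port that are still in the mailbox.
  defp flush(port) do
    receive do
      {^port, _} -> flush(port)
    after
      0 -> :ok
    end
  end
end
```

Calling `ShellSketch.run("echo hello")` should return the output together with exit status 0, while `ShellSketch.run("sleep 5", 100, 2)` should give up with `:timeout` after two attempts. Only the shell process itself is signalled here; any children it forked are left alone.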
The issue is that, intermittently, I get the error ‘erl_child_setup: failed with error 32 on line 282’, which completely kills the process that invoked Shell.run; the supervisor of that process doesn’t even get an EXIT notification. I have tried to reproduce the error but have not been able to.
The module is used in a data collector that runs about 30 processes, all of which use the Shell module simultaneously to run external programs.
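One way to approximate that load when trying to reproduce the failure is a small stress harness along these lines. This is only a sketch: `ShellStress` and `run_many/3` are hypothetical names, and the defaults (30 workers, 10 rounds, a 10 second per-task timeout) are assumptions rather than values taken from the collector.

```elixir
defmodule ShellStress do
  # Hypothetical helper; not part of Abn.
  # Calls the zero-arity `fun` workers * rounds times, with at most
  # `workers` calls in flight, and counts how often each result occurs.
  def run_many(fun, workers \\ 30, rounds \\ 10) when is_function(fun, 0) do
    1..(workers * rounds)
    |> Task.async_stream(fn _ -> fun.() end,
      max_concurrency: workers,
      ordered: false,
      timeout: 10_000,
      # A task that exceeds the timeout is killed and reported as {:exit, :timeout}.
      on_timeout: :kill_task
    )
    |> Enum.frequencies()
  end
end
```

For example, `ShellStress.run_many(fn -> Abn.Shell.run("sleep 1") end)` would drive the real module, and `ShellStress.run_many(fn -> System.cmd("/bin/sh", ["-c", "true"]) end)` gives a baseline that does not involve it. The tasks are linked to the caller, so a crash inside `fun` will also take down the calling process unless it traps exits.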
Versions I am using:
Erlang/OTP 27 [erts-15.1]
IEx 1.18.0-dev (a4adaa8) (compiled with Erlang/OTP 27)
Any idea where to start looking for the problem?