I have a poolboy pool of workers. When a worker GenServer starts, it spawns a new Python server through erlport. When the worker is asked to do a job, it takes a URL, downloads the file, extracts some data from it, and passes that data to Python for parsing. I have ulimit -n set to 65536 and the following in vm.args:
## Node name
-name <%= Application.get_env(:morty, :node_name) %>
## Node cookie, used for distribution
-setcookie <%= Application.get_env(:morty, :cookie) %>
## Enable kernel poll
+K true
-env ERL_MAX_PORTS 65536
I also checked with :erlang.system_info(:port_limit) that it does in fact return 65536.
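For reference, this is roughly how the limits can be inspected from a remote console on the running node. It is only a sketch: the FdDiagnostics module name is hypothetical (it is not part of the app), and the /proc lookup assumes a Linux host. The second function is there because the descriptor limit of the running release may differ from what ulimit -n prints in an interactive shell.

```elixir
defmodule FdDiagnostics do
  # Hypothetical helper module, not part of the application in question.

  # Ports currently open in this VM versus the configured ceiling.
  def port_usage do
    %{
      count: :erlang.system_info(:port_count),
      limit: :erlang.system_info(:port_limit)
    }
  end

  # The "Max open files" line the kernel reports for the running BEAM process.
  # Assumes Linux with /proc mounted; returns nil if the file cannot be read
  # or the line is not found.
  def os_fd_limit do
    case File.read("/proc/#{:os.getpid()}/limits") do
      {:ok, contents} ->
        contents
        |> String.split("\n")
        |> Enum.find(&String.starts_with?(&1, "Max open files"))

      {:error, _reason} ->
        nil
    end
  end
end

IO.inspect(FdDiagnostics.port_usage(), label: "ports")
IO.inspect(FdDiagnostics.os_fd_limit(), label: "fd limit")
```

Comparing the port count against the port limit while the pool is starting up shows whether the VM port table is anywhere near full, or whether emfile is coming from the OS descriptor limit instead.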
Despite that, I constantly get errors like the ones below:
10:50:18.875 [error] Task #PID<0.6277.0> started from :"xxx_server" terminating
** (stop) exited in: :gen_server.call(:worker, {:checkout, #Reference<0.3831647178.621281284.180162>, true}, :infinity)
** (EXIT) an exception was raised:
** (MatchError) no match of right hand side value: {:error, {:bad_return_value, {:error, {:emfile, [{:erlang, :open_port, [{:spawn, '/home/app/.virtualenvs/morty_python/bin/python -V'}, [{:line, 80}, :stderr_to_stdout, :hide]], []}, {:erlport_options, :get_version, 1, [file: '/app/deps/erlport/src/erlport_options.erl', line: 227]}, {:python_options, :check_python_version, 1, [file: '/app/deps/erlport/src/python_options.erl', line: 177]}, {:python_options, :find_python, 1, [file: '/app/deps/erlport/src/python_options.erl', line: 161]}, {:python_options, :get_python, 1, [file: '/app/deps/erlport/src/python_options.erl', line: 146]}, {:python_options, :parse, 2, [file: '/app/deps/erlport/src/python_options.erl', line: 88]}, {:python, :start, 3, [file: '/app/deps/erlport/src/python.erl', line: 168]}, {:gen_server, :init_it, 2, [file: 'gen_server.erl', line: 374]}]}}}}
(poolboy) src/poolboy.erl:275: :poolboy.new_worker/1
(poolboy) src/poolboy.erl:280: :poolboy.new_worker/2
(poolboy) src/poolboy.erl:192: :poolboy.handle_call/3
(stdlib) gen_server.erl:661: :gen_server.try_handle_call/4
(stdlib) gen_server.erl:690: :gen_server.handle_msg/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
:erlang.send(:worker, {:"$gen_cast", {:cancel_waiting, #Reference<0.3831647178.621281284.180162>}})
(stdlib) gen_server.erl:445: :gen_server.do_send/2
(stdlib) gen_server.erl:243: :gen_server.do_cast/2
(poolboy) src/poolboy.erl:58: :poolboy.checkout/3
(poolboy) src/poolboy.erl:74: :poolboy.transaction/3
(elixir) lib/task/supervised.ex:89: Task.Supervised.do_apply/2
(elixir) lib/task/supervised.ex:38: Task.Supervised.reply/5
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
10:50:19.768 [error] Task #PID<0.6361.0> started from :"yyy_server" terminating
** (stop) exited in: :gen_server.call(:worker, {:checkout, #Reference<0.3831647178.621281281.204659>, true}, :infinity)
** (EXIT) an exception was raised:
** (MatchError) no match of right hand side value: {:error, {:bad_return_value, {:error, {:bad_return_value, {:error, {:open_port_error, :emfile}}}}}}
(poolboy) src/poolboy.erl:275: :poolboy.new_worker/1
(poolboy) src/poolboy.erl:296: :poolboy.prepopulate/3
(poolboy) src/poolboy.erl:145: :poolboy.init/3
(stdlib) gen_server.erl:374: :gen_server.init_it/2
(stdlib) gen_server.erl:342: :gen_server.init_it/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
(stdlib) gen_server.erl:223: :gen_server.call/3
(poolboy) src/poolboy.erl:55: :poolboy.checkout/3
(poolboy) src/poolboy.erl:74: :poolboy.transaction/3
(elixir) lib/task/supervised.ex:89: Task.Supervised.do_apply/2
(elixir) lib/task/supervised.ex:38: Task.Supervised.reply/5
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
I also see various Ecto disconnection errors.
This happens when the poolboy config is set to this:
[{:name, {:local, :worker}}, {:worker_module, Morty.CrawlWorker}, {:size, 64}, {:max_overflow, 8}]
But if I reduce size to just 32, everything works fine and there are no more emfile errors.
Where should I look for the real cause of the problem?