How to best debug binary mix releases

_toni · April 19, 2020, 10:46am

Hello,

I’m trying to push mix release-generated binaries to my VPS server.

The same binaries run just fine with MIX_ENV=prod on my local machine.

Unfortunately, once I push the tar.gz to the server and run it, it silently fails on that machine.

Notes:

The same behavior (the server stopping silently) also happens when I build the release on the server.
My local machine and the server run the same OS and architecture.

$ ./bin/myapp start
10:26:18.803 [info] Running MyappWeb.Endpoint with cowboy 2.7.0 at :::4000 (http)
10:26:18.807 [info] Access MyappWeb.Endpoint at http://localhost

I must be missing something rather basic, but all ENV variables seem to be there, and the only non-default Phoenix dependency is the Bamboo mailer.
I tried a release of a vanilla Phoenix repo and it works just fine on that same VPS, pointing to the same DB.

How can one debug the binary? Or maybe some of the info in erl_crash.dump can help?
I tried setting the following, without luck:

#config/prod.exs
config :logger, level: :debug

Thx!

quatermain · April 19, 2020, 11:25am

Hi, maybe the problem is with the release itself. It is strongly recommended to build a release on the same system it will run on, so building it on your Mac/Windows/whatever while your server runs a different system can cause problems. I suggest using Docker on your local machine to build the release, so that mix release runs on the same system as your server.

I can’t find a good article with a tutorial on generating the tar in Docker, but it’s not very hard: do the same as you do now, but inside Docker with a shared volume. Something like this

_toni · April 19, 2020, 1:05pm

@quatermain great point, thx!

I forgot to mention two things:

My local system’s OS and version match the server’s.
The same behavior (the server silently stops) occurs when I build the release via mix release on the server machine.

NobbZ · April 19, 2020, 1:27pm

When you say it silently stops, what is its exit code? Are you sure that errors aren’t being written to a file through your logging configuration?

_toni · April 19, 2020, 1:56pm

It literally looks like this on the command line. The only file written is erl_crash.dump, but I can’t seem to dig any useful info out of it.

$ ./bin/myapp start
10:26:18.803 [info] Running MyappWeb.Endpoint with cowboy 2.7.0 at :::4000 (http)
10:26:18.807 [info] Access MyappWeb.Endpoint at http://localhost
$

How would you read the exit code? And what other log files could potentially be written?

NobbZ · April 19, 2020, 3:12pm

Where log files might be written is configured by whatever means you configure your application.

The exit code of the last command run can be read via echo $?.
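
As a minimal sketch of that (run_and_report is a hypothetical helper, and sh -c 'exit 3' is only a stand-in for the real release command), the status has to be captured right after the command, because the next command overwrites $?:

```shell
#!/bin/sh
# Run a command and print its exit status.
# run_and_report is a hypothetical helper name used for illustration.
run_and_report() {
    status=0
    "$@" || status=$?    # capture the status before anything else runs
    echo "exit code: $status"
}

# Stand-in for the real command, e.g. ./bin/myapp start
run_and_report sh -c 'exit 3'    # prints "exit code: 3"
```

A non-zero value means the process ended with an error, even if nothing was printed.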

quatermain · April 19, 2020, 5:07pm

Are all the required dependencies installed?

Does it work when you run the release binaries on your local machine?

Try to install AppSignal. Maybe it will start and report some issue. (Small chance but…)

_toni · April 19, 2020, 5:15pm

Ok @NobbZ and @quatermain thank you for your help and ideas.

It seems it’s been solved.
The following is my best diagnosis, as far as I can tell.

The VPS was starting the binary via systemd, and the .service file looked like this:

#/etc/systemd/system/myapp.service
[Unit]
Description=Runner for myapp
After=network.target

[Service]
User=username
Group=username
Environment=LANG=en_US.UTF-8
Environment=MIX_ENV=prod
WorkingDirectory=/home/deploy/myapp/current
ExecStart=/home/deploy/myapp/current/bin/myapp daemon
ExecStop=/home/deploy/myapp/current/bin/myapp stop
ExecReload=/home/deploy/myapp/current/bin/myapp restart
SyslogIdentifier=myapp
RemainAfterExit=no
Restart=always
RestartSec=5
StartLimitBurst=3
StartLimitInterval=10

[Install]
WantedBy=multi-user.target

Unfortunately, that service config lacked some ENV variables, so the app failed to start successfully.
This caused the job to keep restarting all the time, so whenever I tried to start another binary manually, it would somehow(?) detect the underlying running instance and fail silently.
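
In hindsight, a small pre-start check along these lines might have surfaced the missing variables sooner. This is only a sketch: check_required_env is a hypothetical helper, and the variable names you pass in are whatever your app actually reads.

```shell
#!/bin/sh
# Fail loudly if any of the named environment variables is unset or empty.
# check_required_env is a hypothetical helper; pass your own variable names.
check_required_env() {
    missing=""
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing environment variables:$missing" >&2
        return 1
    fi
    echo "all required environment variables are set"
}

# PATH is used here only because it is set on virtually every system.
check_required_env PATH || echo "refusing to start"
```

Run from the same environment systemd gives the service, it reports exactly which names are absent instead of letting the app die on boot.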

As soon as I fixed myapp.service and it fired up correctly, the server was back up and running.

I reproduced the error like this:

1. Start myapp.service correctly
2. Manually start the binary with ./bin/myapp start while the service is running
–> This produces the previously reported “silent” failure

Note: whenever I try step 2, it sometimes fails with an informative message (which I hadn’t encountered before), but other times it just seems to start OK and then fails silently.

Protocol 'inet_tcp': the name myapp@hostmachine seems to be in use by another Erlang node
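
A guard like the following could avoid starting a second node by hand. It is only a sketch: start_unless_running is a hypothetical helper, and it assumes the release script's pid command exits non-zero when no node is answering.

```shell
#!/bin/sh
# Start the release only if no running node answers already.
# start_unless_running is a hypothetical helper. It assumes that
# "<release> pid" exits 0 and prints the OS pid when a node is up,
# and exits non-zero otherwise.
start_unless_running() {
    release="$1"
    if pid=$("$release" pid 2>/dev/null); then
        echo "already running (OS pid $pid); not starting a second node"
        return 1
    fi
    "$release" start
}

# Usage (path taken from the thread):
#   start_unless_running ./bin/myapp
```

Running epmd -names also lists the node names currently registered on the host, which is another quick way to spot a name clash.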

One useful command that helped me debug somewhat was $ journalctl -u myapp.service -f, which displays the service log in real time.

And whenever it silently fails, the log looks like this:

Apr 19 17:09:24 hostmachine myapp[5043]: --rpc-eval : RPC failed with reason :nodedown

Still, I’d love some more tooling for debugging releases in the future, if you know of any.

Hope the diagnostics are clear to you, thx!

_toni · April 19, 2020, 5:20pm

Thanks for the ideas @quatermain! I believe AppSignal, as you say, probably has at most the same insight into the app as the barebones VPS I’m currently using, but it might be worth a try.

Also, I like Docker and co., but I’d like to remain minimalistic until I really feel I need Docker. It feels more comfy to work directly on the VPS for now.

Dependencies and the local run were working fine, and I believe I now understand why, based on my reply above.

quatermain · April 19, 2020, 6:53pm

@_toni you are welcome. I replied to your comment from email, so I didn’t see the other answers.

I suggested AppSignal because it starts with your application, so it could rescue your error and notify you about it. But it’s not a silver bullet. I suggested Docker only for building the release in the same environment, not as the runtime environment. I do use it as the runtime environment for maybe 10-15 Elixir apps, but I’m not suggesting that at all.

And I think we should have asked how you start the application, so that we could have given you better advice.

Good luck