Elixir apps as systemd services - info & wiki

yurko · November 15, 2016, 4:09pm

Here are a few pieces of (common) Linux knowledge that we use for reasonably small one-server apps. We use Ubuntu, but this should work for any Debian derivative and shouldn’t need much tweaking for other distros.

Example systemd service

[Unit]
Description=My app daemon

[Service]
Type=simple
User=username
Group=groupname
Restart=on-failure
Environment=MIX_ENV=prod "PORT=4000"
Environment=LANG=en_US.UTF-8

WorkingDirectory=/var/apps/myapp
ExecStart=/usr/local/bin/mix phoenix.server

[Install]
WantedBy=multi-user.target

The file must be put in /lib/systemd/system/myapp.service. Also note the use of absolute paths and the extra verbosity for UTF-8 support. Control it using systemctl, e.g. sudo systemctl status myapp.service, sudo systemctl restart myapp.service etc.

After setting it up, the service must be enabled by running systemctl enable myapp.service. This only needs to be done once, so that the system creates symlinks to the service file. Without this step everything will work fine, but the service will not be started on boot.
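The install steps above can be sketched as one small script. This is only a sketch under assumptions: the run helper and the DRY_RUN variable are hypothetical names introduced here, and myapp.service is the placeholder unit from the example. By default it only prints the commands; set DRY_RUN=0 to execute them (root and systemd required).

```shell
#!/bin/sh
# Hypothetical helper: prints a command when DRY_RUN=1 (the default here),
# otherwise executes it. Not part of systemd.
DRY_RUN="${DRY_RUN:-1}"
UNIT="${UNIT:-myapp.service}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Make systemd re-read unit files after the file was added or edited.
run sudo systemctl daemon-reload
# Create the symlinks so the service is started on boot (needed only once).
run sudo systemctl enable "$UNIT"
# Start it now and show the result.
run sudo systemctl start "$UNIT"
run sudo systemctl status --no-pager "$UNIT"
```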

For a much more feature-rich description see systemd.service.

1.9 and up: using releases

Since 1.9 there is no reason not to use releases, and the transition from the service above is pretty easy. In my experience the simple service type works fine (and since it’s the recommended type unless something special is needed, I use it).

An extra step is required when deploying: building a release. The simplest possible case would be to do it on the same server, with something like MIX_ENV=prod PORT=80 mix release release-name --overwrite. See "mix release" in the Mix v1.13.3 docs for info about the release command and release configs.

The resulting systemd service is not much different:

[Unit]
Description=My app daemon
 
[Service]
Type=simple
User=username
Group=groupname
Restart=on-failure
Environment=MIX_ENV=prod
Environment=LANG=en_US.UTF-8
Environment=PORT=80
 
WorkingDirectory=/var/apps/myapp

ExecStart=/var/apps/myapp/_build/prod/rel/release-name/bin/release-name start
ExecStop=/var/apps/myapp/_build/prod/rel/release-name/bin/release-name stop

[Install]
WantedBy=multi-user.target

Side note: to be able to use port 80 as in the above example without running as root, the CAP_NET_BIND_SERVICE Linux capability can be set on the packaged runtime. Here is a command that achieves it (the version number may change, so after upgrading Erlang this command must be run again with the new path):

sudo setcap 'cap_net_bind_service=+ep' /var/apps/myapp/_build/prod/rel/release-name/erts-10.4.3/bin/beam.smp

More info about Linux capabilities: capabilities(7): overview of capabilities - Linux man page
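Since the erts-<version> directory changes with every Erlang upgrade, the path can be looked up instead of typed by hand. A minimal sketch, assuming the release bundles ERTS in an erts-* directory as in the path above; find_beam is a hypothetical helper name, and if several erts-* directories exist it simply returns the first match.

```shell
#!/bin/sh
# Prints the path of beam.smp inside a release directory, whatever the
# erts-<version> folder is called. Returns 1 if none is found.
find_beam() {
  rel_dir="$1"
  for candidate in "$rel_dir"/erts-*/bin/beam.smp; do
    if [ -f "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  echo "no beam.smp found under $rel_dir" >&2
  return 1
}

# Usage (needs root, so it is left commented out):
# beam="$(find_beam /var/apps/myapp/_build/prod/rel/release-name)" &&
#   sudo setcap 'cap_net_bind_service=+ep' "$beam" &&
#   getcap "$beam"
```

The getcap call at the end just prints the capabilities now set on the file, which is a quick way to confirm that the command took effect.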

UPD: Instead of setting capabilities at the executable level, it is also possible to do that at the service level; this way it does not depend on the runtime version. It is done with the following line in the service file: CapabilityBoundingSet=CAP_NET_BIND_SERVICE. There is a quirk though (as is often the case with systemd): this would not work unless ambient capabilities are set as well, so the whole following part is needed:

# Add capability to be able to bind on port 80, 
# doing it here means we don't care about runtime version and location
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
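If editing the main unit file is not convenient, the same two lines could also live in a systemd drop-in file. This is a sketch under assumptions: write_caps_dropin and the file name bind-port.conf are hypothetical, and for a real unit the target directory would be /etc/systemd/system/myapp.service.d (root required), followed by systemctl daemon-reload and a restart. The demo below writes into a throwaway directory only.

```shell
#!/bin/sh
# Writes a drop-in with the two capability lines into the given directory.
write_caps_dropin() {
  dropin_dir="$1"
  mkdir -p "$dropin_dir" || return 1
  cat > "$dropin_dir/bind-port.conf" <<'EOF'
[Service]
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
EOF
}

# Demonstration in a temporary directory, so nothing on the system changes.
demo_dir="$(mktemp -d)"
write_caps_dropin "$demo_dir/myapp.service.d"
cat "$demo_dir/myapp.service.d/bind-port.conf"
```

After a real install, systemctl show myapp.service -p AmbientCapabilities shows what systemd actually picked up.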

View logs

Default Phoenix behavior (logging to standard output) plays well with systemd, so you can use journalctl to manage logs. A few examples:

journalctl -u myapp.service --since today
journalctl -u myapp.service --since 09:00 --until "1 hour ago"
journalctl -u myapp.service --since "2016-11-10 12:00" --until "2016-11-10 13:00"

See journalctl for more.
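For day-to-day use, the journalctl calls can be wrapped in small functions. A sketch only: the function names are hypothetical, while the flags (-u, -f, -n, -o cat, --since, --no-pager) are standard journalctl options.

```shell
#!/bin/sh
# Follow new log lines as they arrive (like tail -f).
app_logs_follow() {
  journalctl -u "$1" -f
}

# Show the last N lines (default 100) without the pager.
app_logs_tail() {
  journalctl -u "$1" -n "${2:-100}" --no-pager
}

# Print only the message text since today, without journal metadata.
app_logs_plain() {
  journalctl -u "$1" -o cat --since today --no-pager
}

# Usage:
# app_logs_tail myapp.service 50
```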

Restart on deploy

To allow for automatic restart of the service, e.g. as part of automatic deployment, adjust the sudoers file by running sudo visudo and add something like this:

username ALL = NOPASSWD: /bin/systemctl restart myapp.service

so that the user with the name username can restart the service without entering a password.
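On the deployment side, the restart can then be called non-interactively. A minimal sketch, assuming the sudoers entry above is in place; restart_app is a hypothetical helper name. The command has to match the sudoers entry exactly, including the full path to systemctl.

```shell
#!/bin/sh
# Restart the service without a password prompt. `sudo -n` fails right away
# instead of asking for a password when the sudoers rule does not match.
restart_app() {
  sudo -n /bin/systemctl restart "$1"
}

# Usage in a deploy script run as `username`:
# restart_app myapp.service || echo "restart not permitted or failed" >&2
#
# After editing sudoers, `sudo visudo -c` checks the file for syntax errors.
```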

“Old school” logs

Write data to a file (or use an off-the-shelf solution like GitHub - onkel-dirtus/logger_file_backend) and manage logs using Linux logrotate. Here is an example config:

/var/apps/myapp/log/*.log {
daily
missingok
rotate 18
compress
delaycompress
notifempty
create 660 username groupname    
dateext
dateformat -%Y-%m-%d-%s    
su username groupname
}

This must be put in /etc/logrotate.d/myapp. Dry run / debug: sudo logrotate -d /etc/logrotate.d/myapp. Force rotation (for testing): sudo logrotate --force /etc/logrotate.d/myapp. More on logrotate: http://www.linuxcommand.org/man_pages/logrotate8.html

Example upstart service

Relevant if using Ubuntu before 16.04 (not sure how it goes between LTS’), this example uses exrm releases:

description "My app daemon"

setuid username
setgid groupname

start on runlevel [2345]
stop on runlevel [016]

expect stop
respawn

env MIX_ENV=prod
export MIX_ENV

env PORT=4000
export PORT

env HOME=/var/apps/myapp
export HOME


pre-start exec /bin/sh /var/apps/myapp/bin/myapp start

post-stop exec /bin/sh /var/apps/myapp/bin/myapp stop

Example deployment script

To run automatically after deployment for rolling update of a phoenix project with no DB (hence no ecto migrate) in staging environment (hence tests):

cd /var/apps/myapp
MIX_ENV=prod mix deps.get
brunch build --production
MIX_ENV=prod mix phoenix.digest
MIX_ENV=prod mix compile
sudo service myapp restart    
mix test

This can be triggered by a commit into a specific branch or from a web UI; if anything goes wrong, the whole thing exits with a non-zero status and lets you know about the problem. It makes a nice, simple alternative to CI.
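Whether a failed step actually stops the script depends on how it is run: a plain sh script carries on after a failing command unless told otherwise. One way to get the stop-on-first-error behaviour is sketched below; run_steps is a hypothetical helper, not something from the original setup.

```shell
#!/bin/sh
# Hypothetical helper: runs each argument as a command line and stops at the
# first one that fails, returning that command's exit status.
run_steps() {
  for step in "$@"; do
    echo "==> $step"
    sh -c "$step" || {
      step_rc=$?
      echo "step failed ($step_rc): $step" >&2
      return "$step_rc"
    }
  done
}

# The steps from the script above, left commented out because they need the
# project checkout and sudo rights:
# cd /var/apps/myapp && run_steps \
#   "MIX_ENV=prod mix deps.get" \
#   "brunch build --production" \
#   "MIX_ENV=prod mix phoenix.digest" \
#   "MIX_ENV=prod mix compile" \
#   "sudo service myapp restart" \
#   "mix test"
```

A simpler alternative with much the same effect is to put set -e at the top of the original script.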

lessless · June 6, 2017, 10:19am

Type=simple might not be a correct type https://www.lucas-nussbaum.net/blog/?p=877

yurko · June 6, 2017, 11:06am

Do you get any issues because of the type? I don’t think I do: a startup error (as far as I remember) triggers a visible error, and starting and restarting the service works fine, except that Erlang sees it as a crash because of the child killing business.

lessless · June 6, 2017, 3:26pm

I had an issue with node restart on an application crash and overcame it with RestartSec=5 but Type=forking is a cleaner solution.

yurko · June 6, 2017, 3:41pm

Do you have a well tested service with forking type? If not I’ll give it a try next time I have to set up an environment.

lessless · June 6, 2017, 3:56pm

We switched to it a couple of weeks ago and so far the flight is normal.

yurko · June 6, 2017, 4:32pm

care to post the config?

BrightEyesDavid · July 6, 2017, 11:54pm

I’m using releases, systemd and a service type of ‘forking’. I haven’t tested restarting on crash yet. (What would be a good way of doing that?)

My unit file:

[Unit]
Description=My App
After=network.target
Requires=network.target

[Service]
Type=forking
WorkingDirectory=/home/appuser/app
User=appuser
Group=appuser
Restart=on-failure
RestartSec=5
EnvironmentFile=/home/appuser/env_vars
ExecStart=/bin/bash -c 'PATH=/home/appuser/.asdf/shims:$PATH exec /home/appuser/app/bin/my_app start'
ExecStop=/bin/bash -c 'PATH=/home/appuser/.asdf/shims:$PATH exec /home/appuser/app/bin/my_app stop'

[Install]
WantedBy=multi-user.target

(I’ve installed erlang on the server via asdf so that my releases don’t have to include_erts, and found that I needed to add its shims to PATH.)

Is this “child killing business” why systemd reports a failed state on stop? Is there any way round that?

Relevant commit and discussion at Distillery:

github.com/bitwalker/distillery

Update Use With systemd.md (#259)

committed 04:35PM - 26 Apr 17 UTC

fertel

+1 -1

Tested with release. Setting Type=forking causes app to restart correctly on failure. Per the docs: If set to forking, it is expected that the process configured with ExecStart= will call fork() as part of its start-up. The parent process is expected to exit when start-up is complete and all communication channels are set up. The child continues to run as the main daemon process. This is the behavior of traditional UNIX daemons. If this setting is used, it is recommended to also use the PIDFile= option, so that systemd can identify the main process of the daemon. systemd will proceed with starting follow-up units as soon as the parent process exits.

github.com/bitwalker/distillery

change systemd.md remove RemainAfterExit=yes replace with Type=forking

bitwalker:master ← fertel:patch-1

opened 04:01PM - 26 Apr 17 UTC

fertel

+1 -1

yurko · July 7, 2017, 9:34am

you can try killing the process so that it gets restarted
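A rough way to script that check is sketched here. Assumptions: crash_test is a hypothetical helper, the unit has Restart= configured, and the wait time is longer than RestartSec. It uses systemctl show -p MainPID --value to read the main PID before and after the kill.

```shell
#!/bin/sh
# Kills the main process of a unit with SIGKILL to simulate a crash, then
# reports whether systemd started a new main process.
crash_test() {
  unit="$1"
  wait_secs="${2:-10}"
  old_pid="$(systemctl show -p MainPID --value "$unit")"
  if [ -z "$old_pid" ] || [ "$old_pid" = "0" ]; then
    echo "$unit has no main process" >&2
    return 1
  fi
  sudo kill -9 "$old_pid"
  sleep "$wait_secs"
  new_pid="$(systemctl show -p MainPID --value "$unit")"
  if [ -n "$new_pid" ] && [ "$new_pid" != "0" ] && [ "$new_pid" != "$old_pid" ]; then
    echo "restarted: $old_pid -> $new_pid"
  else
    echo "not restarted"
    return 1
  fi
}

# Usage:
# crash_test myapp.service 10
```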

I think it is. There might be a cleaner way to stop the VM; from your config it seems that neither changing the type to forking nor the nice ExecStop works. I didn’t get to that yet, so please let me know if you find settings that kill the VM softly.

The comment from bitwalker in the linked issue seems to make sense for both releases and mix services: “I suspect that maybe it’s difficult for systemd to understand which pid is the actual pid it should care about - guessing the main pid when starting is probably pretty easy, but stopping is maybe getting tripped up because even though the daemon is stopped, the shell process is still executing the post_stop hooks afterward.”

yurko · July 5, 2019, 1:52pm

At some point I came to this issue, and I guess the type “exec” allows for nicer shutdowns when using mix.

Here is the log from mix: [info] SIGTERM received - shutting down

Now with 1.9 and built in releases they might be the way to go about deployments though.

jihantoro · August 9, 2019, 10:45pm

Any update?

How can I easily run a Phoenix app via systemd?

I was getting a permission problem, and something was missing on the second run. That was 1 month ago, by the way, and now I just run my Phoenix app with screen and MIX_ENV=prod mix phx.server.

yurko · August 10, 2019, 8:24am

Works fine with type simple and releases, seems easy enough. See “1.9 and up: using releases” in the above article.

jihantoro · August 11, 2019, 11:44pm

WORKING STEPS TO DEPLOY PHOENIX WITH SYSTEMD

1. Make sure to read: Can't run phoenix with elixir 1.9.0 release
2. Do mix release, for example:
SECRET_KEY_BASE='MBwDNPWb9nw5K/Cm/QJ62dgKeU7OM76hH7hVz9HMo7f2fasurhNqGNaMsFh3Ll' DATABASE_URL='ecto://user:password@host/database' PORT=4002 MIX_ENV=prod mix release
3. Change this systemd config file:

[Unit]
Description=My Phoenix App
After=network.target

[Install]
WantedBy=multi-user.target

[Service]
Environment="HOME=/var/app/my-phoenix-app"
ExecStart=/var/app/my-phoenix-app/_build/prod/rel/my-phoenix-app/bin/my-phoenix-app start
ExecStop=/var/app/my-phoenix-app/_build/prod/rel/my-phoenix-app/bin/my-phoenix-app stop
SyslogIdentifier=simple
Restart=always

# 'StartLimitInterval' must be greater than 'RestartSec * StartLimitBurst' otherwise the service will be restarted indefinitely.
# https://serverfault.com/a/800631
RestartSec=5
StartLimitBurst=3
StartLimitInterval=10
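Taking the rule in the comment above at face value, the posted values can be checked with a bit of shell arithmetic. This is only a sketch: start_limit_reachable is a hypothetical helper name, it assumes whole seconds, and it assumes a service that fails right after each start.

```shell
#!/bin/sh
# Returns 0 (true) if the start limit can be reached according to the rule
# quoted in the comment: StartLimitInterval > RestartSec * StartLimitBurst.
start_limit_reachable() {
  restart_sec="$1"
  burst="$2"
  interval="$3"
  [ "$interval" -gt $((restart_sec * burst)) ]
}

# Values from the unit file above: RestartSec=5, StartLimitBurst=3,
# StartLimitInterval=10.
if start_limit_reachable 5 3 10; then
  echo "limit reachable: systemd gives up after repeated failures"
else
  echo "limit not reachable: the service is restarted indefinitely"
fi
```

With RestartSec=5 and StartLimitBurst=3 the product is 15, which is more than the interval of 10, so by that rule the limit is never hit and the service keeps being restarted. An interval above 15 would be needed for systemd to give up.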

kif · July 24, 2022, 8:42am

@yurko thank you for making this post! This is great info. I’m wondering if you get stack trace logs when an exception happens in your app? I’m having trouble getting those errors with journalctl -u <app>. Linking my original post here.

yurko · July 25, 2022, 8:19am

@kif that’s an interesting issue. I have checked a production app that runs this way and can confirm that I have a stack trace; the error is one that is raised manually with raise and not one that happened “on its own”.

We use exception tracking via AppSignal so I must have missed the issue with consistency between real exceptions and what we get in logs.

I also searched for a more “natural” error and have not found one, though I did find ActionClauseError of Phoenix which was reported properly in the monitoring (stack trace included), but in the log I have only seen the “Sent 400 in 6ms”, so that might be your case: Phoenix handles the error in some fixed way which makes it normal case and not an exception which loses the error info for the resulting output.

I would assume you have the same situation, you can test it by adding a manual raise call and see what you get.

In any case I am pretty sure the issue has nothing to do with OS but rather with application environment and configuration.

kif · July 26, 2022, 8:28pm

Thank you for the reply!

You’re right, it has nothing to do with systemd. I posted an update to the issue.

What you’re saying is interesting. “Phoenix handles the error in some fixed way which makes it normal case and not an exception”
Why do you think this is implemented this way? Isn’t that crucial to get the root cause of failure in production log with nice stack trace and not just “Sent 400 in 6ms”?

yurko · July 27, 2022, 8:35am

Isn’t that crucial to get the root cause of failure in production log with nice stack trace and not just “Sent 400 in 6ms”?

Well, as I mentioned - I see that info in the monitoring system, so it depends on your error handling - it’s not lost, just unused basically. Even if you don’t handle errors explicitly, you still do using defaults and if these defaults don’t work for you you can change them.

It’s about what your app considers important / exceptional and what is normal (for your app). Say a request to a non existing route or anything similar that is converted to a 4xx falls into the “not exceptional” category so you get no stack trace and if you raise an exception manually you will see it (probably, I did - see above).

If you want to know exactly what is happening and why, you can look at the source and related issues. I was not able to pinpoint the current state and how we got there quickly enough, but here are a few issues to get you started:

Error handling in Phoenix · Issue #482 · phoenixframework/phoenix · GitHub that’s more about initial reasoning
Don't log an ErlangError if we're handling the error · Issue #340 · phoenixframework/phoenix · GitHub this one is closer to your issue
The change seems to remove the logic which kind of did the “view hack”, just not in the view and not in a hacky way Do not log errors if handled by user. Close #340 · phoenixframework/phoenix@701628b · GitHub

PS since it’s not related to OS in any way, we got a bit off topic here.

kif · July 27, 2022, 10:04am

What monitoring system are you using? I’m basically using the same systemd setup that you outlined in this topic.

I looked at your links. There are good bits of information in there. It would be great if I had more experience with Elixir to comprehend the source code. I started to learn Elixir and Phoenix 2 months ago.

Thank you for the clarification on my questions. Yes, we went a bit off topic.

yurko · July 27, 2022, 10:44am

What monitoring system are you using?

We use AppSignal (Application Monitoring for Elixir applications | AppSignal APM), and there I also see stack traces for errors that are not in my log, at least in the case of the mentioned ActionClauseError.

blackham · February 11, 2023, 8:42pm

I used mix in systemd to launch phx apps for a couple years. But it’s time to replace the simple service with a forked daemon. Here is where I’ll save my notes so when I have to do it again in 3 months and I google “elixir systemd service” I’ll find them again. Oh, and I guess if it helps someone else out, oh well. That’s the price I pay to be lazy.

Server: Ubuntu 20.04.5 LTS
Version: elixir 1.14.3 (compiled with Erlang/OTP 25) - with asdf
username: core (replace as you need)
project name: court (or court_api, replace as you need)

Build the release version in /opt/court_api

MIX_ENV=prod mix compile
MIX_ENV=prod mix assets.deploy
mix phx.gen.release
mix release

I like to load my entire environment from a file rather than set each value in systemd’s config. These settings are all basic stuff, except that PHX_SERVER=true is needed to tell the elixir daemon to launch the phoenix server. Also, since I’m using asdf, I include its shims and bin folder in the PATH.

vi /etc/environment-court

PATH="/home/core/.asdf/shims:/home/core/.asdf/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
SECRET_KEY_BASE=RlfalO4gogetyourownkeyzdNsLE
MIX_ENV=prod
DATABASE_URL=ecto://mydbuser:mypassword@192.168.1.257/court
PORT=5428
PHX_SERVER=true
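Since a missing variable only shows up when the service fails to boot, it may be worth checking the file before restarting. A small sketch; missing_env_keys is a hypothetical helper, and the list of required keys is just the one used in this post.

```shell
#!/bin/sh
# Prints every required key that has no KEY=value line in the given file.
# Returns 1 if anything is missing, 0 otherwise.
missing_env_keys() {
  env_file="$1"
  shift
  missing=0
  for key in "$@"; do
    if ! grep -q "^${key}=" "$env_file"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
# missing_env_keys /etc/environment-court SECRET_KEY_BASE DATABASE_URL PORT PHX_SERVER
```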

Now set up the systemd service. Note I’m running it as daemon and not start. This forks it in the background or something like that. The takeaway is that it is now running in the background. (If I were in Bash, I would get my prompt back.)

vi /lib/systemd/system/court.service

[Unit]
Description=Court
After=network.target
Requires=network.target

[Service]
Type=forking
WorkingDirectory=/opt/court_api
User=core
Group=core
Restart=always
RestartSec=5
EnvironmentFile=/etc/environment-court
ExecStart=/opt/court_api/_build/prod/rel/court_api/bin/court_api daemon
ExecStop=/opt/court_api/_build/prod/rel/court_api/bin/court_api stop

[Install]
WantedBy=multi-user.target

Now reload systemd and start the new court service. I use restart because, let’s be honest, I’ll typo something or want to add RuntimeMaxSec.

systemctl daemon-reload
systemctl restart court

NOTES - Logs
Bad news is now that phx is running in the background, stdout isn’t caught by systemd. TODO: Look for or add a phx setting that will route stdout to syslog. Until that day, find your logs here:

tail -f /opt/court_api/_build/prod/rel/court_api/tmp/log/erlang.log.1

Not sure if elixir will rotate that or not. If not then something like this (not tested)

vi /etc/logrotate.d/elixir-court

/opt/court_api/_build/prod/rel/court_api/tmp/log/*log* {
  daily
  missingok
  rotate 14
  compress
  delaycompress
  notifempty
  create 640 core core
  sharedscripts  
}

NOTES - console
Losing journalctl logs sucks the big one, but this makes up for it! The ability to jump into the console and break things directly!

/opt/court_api/_build/prod/rel/court_api/bin/court_api remote

I hope I find these notes useful in 3 months when I have to do this again. (I hope someone replies back with the news that elixir/phx now detects if it has been launched in daemon mode and redirects stdout to syslogd)