Edeliver won't start project at production

Hi all, I’m new to Elixir. I’m trying to use distillery (~> 0.10) + edeliver (~> 1.4.0) for a practice throwaway Phoenix app but it won’t start at production.

Manually running “bin/<project_name> start” at production works beautifully.

But most of the time “mix edeliver start production” does not. Occasionally I get a blank response line, but every so often, seemingly at random, it returns “ok” and the app starts.

I found this error message in var/log/erlang.log.1

could not start kernel pid (application_controller) (error in config file "/home/holydoctrine/blah/blah/var/sys.config" (none): configuration file must contain ONE list ended by <dot>)

Any idea as to what is going on? Here is my config - it is barebones and vanilla.

/home/holydoctrine/blah/blah/var/sys.config:

%% Generated - edit/create /home/holydoctrine/blah/blah/sys.config instead.
[{sasl,[{errlog_type,error}]},
 {logger,
     [{console,
          [{format,<<"$time $metadata[$level] $message\n">>},
           {metadata,[request_id]}]},
      {level,info}]},
 {blah,
     [{'Elixir.Blah.Endpoint',
          [{url,[{host,<<"localhost">>}]},
           {render_errors,
               [{view,'Elixir.Blah.ErrorView'},
                {accepts,[<<"html">>,<<"json">>]}]},
           {pubsub,
               [{name,'Elixir.Blah.PubSub'},
                {adapter,'Elixir.Phoenix.PubSub.PG2'}]},
           {http,[{port,4000}]},
           {cache_static_manifest,<<"priv/static/manifest.json">>},
           {server,true},
           {root,<<".">>},
           {version,<<"0.0.1+20161209-11-037d63e">>},
           {secret_key_base,
               <<"S12G92uMyvs9gCXA437fGCejbXQ4nt53yuW2RvVQsIQPgNBci8TrmS4VZS/xZAGY">>}]}]}].

.deliver/config:

#!/usr/bin/env bash

APP="blah"

BUILD_HOST="localhost"
BUILD_USER="vagrant"
BUILD_AT="/tmp/edeliver/blah/builds"

PRODUCTION_HOSTS="<domain_name>"
PRODUCTION_USER="holydoctrine"
DELIVER_TO="/home/holydoctrine/blah"

AUTO_VERSION="build-date+git-commit-count-branch+git-revision"

pre_erlang_clean_compile() {
  status "Running phoenix.digest" # log output prepended with "----->"
  __sync_remote " # runs the commands on the build host
    [ -f ~/.profile ] && source ~/.profile # load profile (optional)
    set -e # fail if any command fails (recommended)
    cd '$BUILD_AT' # enter the build directory on the build host (required)
    # prepare something
    mkdir -p priv/static # required by the phoenix.digest task
    # run your custom task
    npm install
    ./node_modules/brunch/bin/brunch b -p
    APP='$APP' MIX_ENV='$TARGET_MIX_ENV' $MIX_CMD phoenix.digest $SILENCE
  "
}

.rel/config.exs:

use Mix.Releases.Config,
    default_release: :default,
    default_environment: :dev

environment :dev do
  set dev_mode: true
  set include_erts: false
  set cookie: :"7-zZM,KCNUs0NgZ?!oZE1|E`]lZ[<^(B*t4{AvKE4w[H3gm:8D[+l7mT(S.qM5tc"
end

environment :prod do
  set include_erts: true
  set include_src: false
  set cookie: :"7-zZM,KCNUs0NgZ?!oZE1|E`]lZ[<^(B*t4{AvKE4w[H3gm:8D[+l7mT(S.qM5tc"
end

release :blah do
  set version: current_version(:blah)
end

I’ve also found that it takes a few attempts to start staging via edeliver after deploying a new release. I haven’t tried using bin start yet.

Which distribution are you running? Ubuntu 16.10 here.

I don’t know if this is relevant, but this rabbitmq issue shows that the error message about the sys.config file can be seen due to issues other than the file content syntax being incorrect: https://bugs.launchpad.net/ubuntu/+source/rabbitmq-server/+bug/653405

I’m using Ubuntu 14.04 LTS 64bit guest vm inside VirtualBox running on macOS host.

It’s just so strange that it works flawlessly with something like:

ssh username@host 'cd ~/app_name/bin && ./app_name start'

I could edit releases//sys.config down to nothing but “[].” and edeliver would still complain that the "configuration file must contain ONE list ended by <dot>".

I can’t think of anything else that could be the source of the problem. I temporarily turned off the production firewall because it was throttling ssh connections. No good.

Another area we can look at is using --debug like:

mix edeliver start production --debug

but my bash skills are lacking for reading that output.

I noticed it repeats the following a lot:

  • kill -0 15344
  • kill -0 15343
  • sleep 0.5

which seem to be pids of dev machine processes:

bash deps/edeliver/bin/edeliver start production --debug --runs-as-mix-task
sh -c deps/edeliver/bin/edeliver start production --debug --runs-as-mix-task

I was able to increase the sleep value but that didn’t help.

Any idea what kill -0 is?

Sending signal 0 to a given PID just checks whether a process with that PID is running and whether you have permission to send a signal to it.
http://stackoverflow.com/a/11012755/3373872
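
To make that concrete, here is a minimal sketch of the same check outside edeliver. The helper name is_alive and the throwaway sleep process are made up for illustration; only kill -0 itself comes from the debug output above.

```shell
#!/usr/bin/env bash
# is_alive PID: succeeds if a process with that PID exists and we are
# allowed to signal it. Signal 0 delivers nothing to the process; only
# the existence and permission checks are performed.
# (is_alive is a hypothetical helper name, not something edeliver defines.)
is_alive() {
  kill -0 "$1" 2>/dev/null
}

sleep 30 &                         # throwaway background process to probe
pid=$!

if is_alive "$pid"; then echo "running"; fi

kill "$pid"                        # now actually terminate it
wait "$pid" 2>/dev/null || true    # reap it so the PID is really gone

if ! is_alive "$pid"; then echo "gone"; fi
```

So the repeated kill -0 / sleep 0.5 lines look like a polling loop that waits for those two local processes to exit, rather than anything being sent to the production server.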

Thank you sztosz.

Same problem here:

=erl_crash_dump:0.3
Sat Dec 17 08:38:20 2016
Slogan: could not start kernel pid (application_controller) (error in config file "/opt/my_app/var/sys.config" (none): configuration file must contain ONE list ended by <dot>)

Anyone found a solution, yet?

Same issue here. Behavior is as @BrightEyesDavid described: the server will eventually start successfully after multiple attempts.

Ubuntu 16.04, edeliver and distillery. I’ve experienced the issue consistently with both a Phoenix application and a non-Phoenix application.

Any ideas, @bitwalker? Please let me know if I should provide any other information.

We’re seeing the same thing.
At first our theory was that it was related to REPLACE_OS_VARS, but now we seem to be able to reproduce it without that set.
We can’t find a reliable reproduction case; in fact, it usually works fine. So if any of you can reproduce it more reliably, I suggest opening an issue on the Distillery GitHub repo. That is more likely to get on @bitwalker's radar, and it will also be easier to find for other folks having the same issue.

It seems to be a timing issue around when the shell script in the releases directory gets run. There is a script:
releases/0.0.1/app_name.sh
That script recreates sys.config under the var directory: it writes a header line to the file first and then appends the rest of sys.config. If the application is booting at the moment the file is truncated, it reads an empty or partial file and you get a crash dump. The line that truncates the file is:
echo "%% Generated - edit/create $RELEASE_CONFIG_DIR/sys.config instead."
Here is the relevant part of the script:

# Set SYS_CONFIG_PATH, the path to the sys.config file to use
# Use $RELEASE_CONFIG_DIR/sys.config if exists, otherwise releases/VSN/sys.config
if [ -z "$SYS_CONFIG_PATH" ]; then
    if [ -f "$RELEASE_CONFIG_DIR/sys.config" ]; then
        SYS_CONFIG_PATH="$RELEASE_CONFIG_DIR/sys.config"
    else
        SYS_CONFIG_PATH="$REL_DIR/sys.config"
    fi
fi
if [ "$SYS_CONFIG_PATH" != "$RELEASE_MUTABLE_DIR/sys.config" ]; then
    echo "%% Generated - edit/create $RELEASE_CONFIG_DIR/sys.config instead." \
        >  "$RELEASE_MUTABLE_DIR/sys.config"
    cat  "$SYS_CONFIG_PATH"                              \
        >> "$RELEASE_MUTABLE_DIR/sys.config"
    SYS_CONFIG_PATH=$RELEASE_MUTABLE_DIR/sys.config
fi

I haven’t found a workaround yet, except to start from the command line.
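
One direction that might close the window described above, untested against Distillery itself: build the generated file under a temporary name and rename it into place, so a reader sees either the old complete file or the new complete file. The function name write_sys_config and the demo paths are hypothetical; this is a sketch of the general technique, not a patch to the release script.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Distillery: write the generated sys.config
# to a temporary file next to the destination, then rename it into place.
# A rename within one filesystem replaces the target in a single step, so
# the destination is never observed empty or half-written.
write_sys_config() {
  local src=$1     # the sys.config to copy from
  local dest=$2    # the generated copy the VM will read
  local tmp
  tmp=$(mktemp "$dest.XXXXXX") || return 1
  {
    echo "%% Generated - edit/create $src instead."
    cat "$src"
  } > "$tmp" || { rm -f "$tmp"; return 1; }
  mv -f "$tmp" "$dest"
}

# Demo in a scratch directory (paths are made up for illustration).
workdir=$(mktemp -d)
mkdir -p "$workdir/var"
printf '[].\n' > "$workdir/sys.config"
write_sys_config "$workdir/sys.config" "$workdir/var/sys.config"
cat "$workdir/var/sys.config"
```

Note that mktemp creates the file readable only by its owner, so a real change would probably need to restore the intended permissions before the rename.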


I found that this issue happens on low-spec machines.
Giving the node some boot-time tolerance should fix it.
https://github.com/boldpoker/edeliver/pull/199
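
As a rough illustration of what "boot-time tolerance" can mean in general (this is not taken from the linked pull request), a start script can poll for readiness for a bounded time instead of checking once. The helper wait_until and the marker file below are hypothetical.

```shell
#!/usr/bin/env bash
# wait_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES
# times, sleeping DELAY seconds after each failed attempt.
# (Hypothetical helper for illustration only.)
wait_until() {
  local tries=$1 delay=$2
  shift 2
  local i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Demo: a background job stands in for a slow-booting node by creating a
# marker file after one second; the caller tolerates up to about 4 seconds.
dir=$(mktemp -d)
( sleep 1; touch "$dir/ready" ) &
if wait_until 20 0.2 test -f "$dir/ready"; then
  echo "started"
else
  echo "gave up"
fi
```

Against a real release the polled command would be whatever readiness check the release script offers, for example something like wait_until 30 0.5 bin/blah ping, assuming the script has a ping command.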


@otaq, sounds good - thanks! The machine I had the issue with isn’t particularly low spec, but perhaps this will solve it anyway. I’ve since set up a 1GB Linode (their smallest package) with Ubuntu 16.04 - as opposed to 14.04, where I saw the issue - and I’m not seeing the issue there so far.
