Elixir 1.9 release slow boot time

odarriba · July 22, 2019, 11:50am

Hello,

We are working in a pretty big Elixir application (at the moment of writing, 1 minute of compilation in an i5 processor and ~930 modules) and we have recently moved from Distillery to new Mix releases.

The issue we are having is that we are experiencing a 300% extra boot time in the application, which is causing some problems on our deployments (but we can fix that, the deployment is not the issue).

We have made a comparision of booting times with a release made by Distillery and a release made by new Mix release command, and we are seeing:

Distillery: ~2 seconds until Phoenix is up, inluding checking for pending migrations
Mix: ~9 seconds until Phoenix is up, without any migration-related process.

both measurements are on the same machine (which is the same I have made the build, so no cross-platform issue)

I have tries removing all runtime configurations, which make irt a bit better (1 or 2 seconds less) but not even near Distillery times.

Also, I noticed that during Mix release boot there is only one BEAM process taking 150% of CPU for that ~9 seconds, when with Distillery one dont get to that usage (or it goes so faster that top doesn’t measure it).

Our mix.exs release config:

  def project do
    [
      ...
      releases: [
        my_release: [
          include_executables_for: [:unix],
          path: "rel/build/my_release"
        ]
      ],
      default_release: :my_release,
      ...
    ]
  end

vm.args.eex is like the default one and the env.sh.eex just set the distribution to name and sets the node name according to the pod IP when deployed on kubernetes (that was also made on Distillery releases).

Note that we are doing all this benchmarks on local, not even deploying to Kubernetes (which quotas configuratiuons can also affect).

Any idea what is happening?

Fl4m3Ph03n1x · July 23, 2019, 8:19am

I don’t exactly know what is happening, but I do remember that the latest OTP / Erlang releases compiled around 30% faster (can’t remember if the came in version 21 or 22).

Which OTP/Erlang version are you compiling against? have you tried the newest ones?

odarriba · July 23, 2019, 8:23am

Yep, we are using Erlang 22 with Elixir 1.9.0 (we also tried 1.9.1)

In development we are using:

elixir 1.9.0-otp-22
erlang 22.0.5

and in production we build using official elixir image.

However, the compilation times are the expected ones (more or less, it takes time but it’s due to cross dependencies between modules), the issue is when we start the release itself, that produces a CPU spike and a quite long time until the first request is served, which doesn’t happen with Distillery.

tcoopman · September 10, 2019, 10:03am

I have the idea that I’m running into the same issue with one of my applications.

I’m using elixir 1.9.1-otp-21.
At the startup of the compiled release (with bin/my_app start) I see a spike in CPU usage and it takes at least 5 seconds before any application code is executed. Running env MIX_ENV=prod mix phx.server starts immediately.

Has anyone else seen this?

jola · September 10, 2019, 12:31pm

Releases should start slower than mix phx.server because they preload modules. Mix will load modules lazily, so it starts instantly but you get a slower application until all modules have been loaded instead. Because releases preload modules they start slower but are then “warm”. Releases also start twice, once to load runtime configuration and the second time for real.

It’s hard to know in your case if mix releases are slower than distillery, unless you have a comparison?

tcoopman · September 10, 2019, 1:17pm

Ok, so I’ve created a bunch of test cases and measured the time to entry in Application.start:

## Distillery (elixir 1.8.2-otp-21)
- date && _build/prod/rel/app/bin/app foreground: < 3 seconds
- date && _build/prod/rel/app/bin/app console: < 3 seconds

## Distillery (elixir 1.9.1-otp-21)
- date && _build/prod/rel/app/bin/app foreground: < 3 seconds
- date && _build/prod/rel/app/bin/app console: < 3 seconds

## Elixir releases (elixir 1.9.1-otp-21)
- date && _build/prod/rel/app/bin/app start: 10 seconds
- date && _build/prod/rel/app/bin/app start_iex: 10 seconds

## env MIX_ENV=prod mix phx.server --preload-modules (elixir 1.9.1-otp-21)
- date && env MIX_ENV=prod mix phx.server --preload-modules: < 3 seconds

## Experiment: Elixir releases without runtime configuration (elixir 1.9.1-otp-21)
- date && _build/prod/rel/app/bin/app start: < 3 seconds

As you can see, the Elixir releases with runtime configuration are way slower than distillery (which has the same runtime configuration). By accident I tried releases without runtime configuration and noticed this was much faster (in the same range as distellery - my feeling says it’s still a bit slower, but hard to measure/benchmark correctly so don’t trust me on this)
(Just as confirmation, my runtime configuration does nothing special, only reading environment variables and parsing a 4 values to Integer. It’s the same configuration that I use in distillery as well)

I’ve also tried: env MIX_ENV=prod mix phx.server --preload-modules which should preload and the startup speed is in the same ballpark as distillery, releases without runtime configuration.

So my conclusion is that there is something wrong with elixir releases with runtime configuration.
Maybe that’s easier to confirm for someone else?

jola · September 10, 2019, 1:20pm

Best thing would be to set up a repo that reproduces it. The core team would probably be happy to look at it.

I hope you find an answer

tcoopman · September 10, 2019, 1:21pm

Yeah, I would love to do that, but unfortunately the code is private.
I’m posting here, because it seems that I’m not the first one running into this, so hopefully someone else noticed this too and can post their code.

odarriba · September 10, 2019, 1:55pm

This happens to me also, it’s a private project with a high number of modules, and cannot publish it open source.

However I see the same issue you are having, but in a different scale (because of the number of modules, which is totally understandable).

As @jola said, it is expected that Releases take longer than mix phx.server, the real issue is that mix releases are taking 3x time the start time of a distillery release in some scenarios (we have 4 elixir projects in prod, and only this one is suffering this issue).

In our particular case, we tried removing some aspects like external libraries (AppSignal for example) in case it was causing some trouble starting the agent… but no luck. We end up rolling back to distillery until this gets solved or we have more time to investigate more to fill a proper issue in the core.

if someone thinks about anything we can look at, I will be pleased to try things.

tcoopman · September 10, 2019, 1:58pm

@odarriba were you using releases.exs for runtime configuration? If you were, could you maybe test without that and see if it is faster?

odarriba · September 10, 2019, 2:05pm

We tried, and it reduced some of the overhead time, but it was also 2-3 times more time than with Distillery

tcoopman · September 10, 2019, 2:19pm

I’ve created an issue for this, because it seems that more people have this problem and that it’s not actively tracked at the moment: https://github.com/elixir-lang/elixir/issues/9337

tcoopman · September 10, 2019, 4:39pm

@odarriba maybe you can add your experience with the slow releases to the issue as well. I’ve been able to reproduce an open source example, but the slowness is related to the releases.exs as well.