Proposal: moving towards discoverable config files

releases
elixir

#23

This would make me personally happy.


#24

Does this mean that on_boot would replace use Application.Config?


#25

@sasajuric theoretically there would be no need for Application.Config, as outlined in this proposal, but I still think we need to have Application.Config in case anyone wants to customize a release after it is assembled (and we can’t depend on Mix by then as well). In other words, I would still introduce Application.Config but it is for another purpose rather than file discovery (it could even be a separate proposal).


#26

@josevalim Right, I’m just trying to figure out what’s the choice here. So if I get it right, you’re now asking us what do we think about use Application.Config vs on_boot, where the latter would be used to explicitly denote the parts which are evaluated at runtime?

I.e. with on_boot, it would look like this:

# we're using mix config, as before
use Mix.Config

on_boot do
  # evaluated at runtime
  config :ui, UI.Endpoint, url: [host: System.get_env("HOST")]
end

# evaluated at compile time
config :foo, :bar, System.get_env("BAZ")

Is that the choice?


#27

I’m a fan of the on_boot/1 proposal.


#28

Almost. Let’s forget about Application.Config completely. The choices are:

  1. Introduce on_boot as an explicit block of code to be evaluated on boot (e.g. during a release or during mix app.start)

OR

  2. Restrict import_config to be static, so we can copy configuration files as is to a release and have the same configuration files be executed at both compile time and runtime without a distinction of “when” each is happening

#29

Ah, that sounds neat, and it seems like it’s a smaller change from the perspective of user devs. In addition, as others have mentioned it, it makes the execution context explicit, which seems like a better choice.

Why did you settle on the current proposal then?


#30

The current proposal was seen as a smaller step. We could always add on_boot later and we thought restricting import_config was going to be a good thing nonetheless. But I guess the opposite direction is also true: we can add on_boot now and restrict import_config later.


#31

on_boot/1 sounds good :+1:

One question though: from the initial proposal I had a feeling that it’s a “hack” or a “workaround” that would temporarily make stuff clearer without fixing anything, but reading the discussion I get a feeling that it’s a first step towards unifying the configuration and the workings of mix and releases. Is that so?

There was an article describing the general direction Elixir is moving: https://dockyard.com/blog/2018/02/28/elixir-deployment-tools-update-february-2018

To remove the awkward transition between development and production, we really need our development tooling to be built on releases under the covers. In short, if you run iex -S mix or mix run, these should effectively be the equivalents of bin/myapp console or bin/myapp foreground. If you are always running releases, then there is no transition to be made between development and production.

I very much like the idea, am I right to assume that it’s still the end goal?


#32

i think the core problem with elixir’s use of an executable script to populate the application env and the use of Application.get_env/3 as the means to retrieve config is that it makes it acceptable and idiomatic to put configuration in config.exs. this is fine for examples and most standalone apps but it breaks down for users who have to integrate elixir into existing ecosystems where things like vault/etcd/dynamodb/encrypted json blobs/etc are used. at two different jobs i’ve had to fork phoenix, ecto and other applications to break their dependence on an evaluated config.exs to allow for use within the environment that was already established. at my latest job i’m just not using elixir because the burden to introduce it was too high

i don’t think there’s anything wrong with either of these proposals but i think incremental improvements to config are not enough. the risk is that it gets good enough that no further work is done and the real issue (that of libraries not exposing their internals in a way that is flexible enough for more than just archetypal users) is never addressed


#33

Not sure about other apps, but with Ecto/Phoenix there was always a workaround without a fork. The way I solved it in older versions was to populate the missing pieces of app env in my app start callback before starting the supervision tree. It wasn’t very pretty, but it worked. These days, we have a proper solution for Phoenix/Ecto in the shape of init/2 callbacks (see here and here).
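A minimal sketch of the older workaround described above, filling in the app env from the application start callback before the supervision tree starts. The module name, the :my_app application and the :host key are assumptions made up for illustration; only Application, System and Supervisor calls from the standard library are used.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Populate the runtime-only pieces of the app env first, so that every
    # child started below already sees them.
    host = System.get_env("HOST") || "localhost"
    Application.put_env(:my_app, :host, host)

    # The real supervision tree would be listed here.
    children = []
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

This only helps for values that dependencies read after your application has started; anything a dependency reads during its own startup is out of reach of this approach.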

IMO overreliance on app env is not just Elixir’s problem. I’ve seen it happen with Erlang libs as well. In fact the most annoying episode I had was with riak_ensemble which loads the database storage location during its startup (so before my app startup), which made it very hard to establish a local cluster by starting multiple instances from the same folder. In the end, I resorted to included applications to work around it, but I really disliked it.

I share this sentiment too. I believe that in a mid-to-long term it would be better if we motivated developers, and in particular library authors, to avoid app env unless absolutely necessary. The next version guides already have good recommendations, but I also think that the generators (mix new and mix phx.new) could push people towards configuring at runtime, to promote the culture of preferring runtime configuration via function parameters and/or callback modules.
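To make "runtime configuration via function parameters" concrete, here is a small sketch of a process that receives its settings as start options instead of reading the app env. Greeter and the :greeting option are hypothetical names, not taken from any library.

```elixir
defmodule Greeter do
  use GenServer

  # Configuration arrives as arguments from whoever starts the process,
  # so nothing in this module reads the application environment.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  def greet(pid, name), do: GenServer.call(pid, {:greet, name})

  @impl true
  def init(opts) do
    {:ok, %{greeting: Keyword.get(opts, :greeting, "Hello")}}
  end

  @impl true
  def handle_call({:greet, name}, _from, state) do
    {:reply, "#{state.greeting}, #{name}!", state}
  end
end

# The caller decides where the value comes from, e.g. the OS environment at boot.
{:ok, _pid} = Greeter.start_link(greeting: System.get_env("GREETING") || "Hi")
```

With this shape the same library can be fed from config files, environment variables or an external store without changing the library itself.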

That said, both proposals look fairly lightweight, and they should help solve at least some common confusion and ugly workarounds (such as replace_os_vars), so I think they are a good short term solution to frequent problems.


#34

Love the idea of questioning this point because it has definitely been painful in the past when configuring Elixir apps (especially umbrella apps). The main problem that we encountered was related to the fact that there are multiple ways to configure an app, and even within the configuration files themselves, you can use System.get_env or also load_os_vars.

Since the release, as far as I know, is one, the whole process could be simplified by doing the following:

Instead of on_boot, which does not seem to be explicit enough for developers that come from other languages such as Ruby, why not have something like:

use Application.Config

production do
  config :ui, UI.Endpoint, url: [host: System.get_env("HOST")]
end

dev do
  config :ui, UI.Endpoint, url: [host: "localhost"]
end

test do
  config :ui, UI.Endpoint, url: [host: "localhost"]
end

Since the release is one also for umbrella applications, the developer can be ‘forced’ to move all the config in one single place (maybe in the root folder under /config/) so that:

  1. It is clear where the configuration is and should be defined
  2. It is easy to understand what will be loaded in production, dev and testing
  3. The order by which the configuration keys and values are set is more obvious (currently, umbrella apps load the config in a funny way and may override some keys without any warnings)
  4. We have less confusion about what happens to the code that is not defined in the scope of on_boot/1
  5. We generate a module from each of those calls that could be compiled and replaced using hot swapping

I am new to Elixir and compiled languages so please bear with me if I say something completely senseless

Nice discussion tho!


#35

We are not sure that will be exactly the end goal but that’s the direction we are going. The goal of this proposal is to improve interoperability with releases. We will continue improving bits of the language until releases are effectively part of the language.

Besides what @sasajuric said, there is also work happening on this front as “configuration providers”. So we will make sure we go all the way and not only half way. Regardless, the built-in provider should be as solid as it gets.

The problem we are trying to solve though is not about the understanding of configuration (it probably deserves a separate discussion) but rather how to control what is evaluated at compile time and what happens at runtime, regardless of the environment (dev/test/prod). Introducing blocks for dev/test/prod does not address the root issue here.

We agree there are issues with umbrella though. As mentioned in the original post, those will be discussed separately, otherwise we end up mixing too many topics and the discussion ends up too dispersed.


#36

I assumed the import_config change was a required step for supporting releases. If just on_boot were done, wouldn’t a release still not know where to pull in config?


#37

I am unfortunately late to this discussion, but I do have some thoughts:

I dislike on_boot, I fail to see what problem it is actually solving. The config file is already evaluated during boot, so this is effectively a misnomer - and I’m completely against something like this only applying to releases, since it is a big part of the motivation for these improvements, to streamline the transition from dev to prod.

I’ve seen numerous mentions about the build-time/runtime confusion - but that is a problem beyond the scope of these changes (in my opinion). The biggest win we get here is by closing the gap between dev and releases, getting rid of dirty hacks like REPLACE_OS_VARS.

In general I agree with @sasajuric about configuration, but I personally feel that we have to, at a minimum, address the biggest pain with configuration today, which is dealing with releases. I’m all for eventually restricting the capabilities and pushing the community in a better direction, but we have a very real, very painful problem right now to deal with, and I would rather see us simplify the situation rather than add even further layers of complexity to it. Simplification here meaning that we no longer have two different ways of configuring a system between dev and prod, and instead we have (more or less) the same old system everyone is already familiar with in config.exs.

If we had configuration providers, this is about improving the built-in provider. Custom providers is a separate topic and I believe @bitwalker was working on this.

Yeah this is something I added in Distillery as part of the work along these lines - Mix.Config or Application.Config would just be another provider, and custom ones could be built to fetch configuration from other files, or even external systems like etcd/Consul/etc.


#38

First of all, I think there is a very important point to make here, in order to ensure this reasonable attempt at improvement does not go sideways:

Currently, the {:system, value} tuple exists as a de-facto standard in many situations. However, it is usually only used in the production environment, more specifically in config/prod.exs. This is less of a problem of Elixir itself and more of the paradigms introduced by frameworks that generate configurations this way. If this was used consistently across environments, I think more developers would automatically start correctly offloading those dynamic parameters into environment variables.

I understand the appeal of using optional (as in an if-exists approach to) configurations, and it certainly is a way of solving the issue at hand, albeit, in my opinion, by introducing a duplication of functionality. As far as I can see (and I’d be happy to be proven wrong) this proposal only changes a bit of complexity (when is configuration part X actually evaluated?) into another bit of complexity (which method do I use to configure my application?).

Bottom line is
As mentioned in the Summing up of the original post, the idea is to make calls like System.get_env/1 transparent across environments. It already would be if it was used in config/config.exs alone, instead of only using it in config/prod.exs and introducing different behaviour in the other dynamically imported environment configs. That is an issue with the software generating that structure, imo.


#39

We have two scenarios here:

  1. Config files are not evaluated in a release at all. This means that config files are only evaluated at compile time. The evaluated configuration is there when you boot your system but you can’t evaluate any code during boot.

  2. Config files are evaluated in a release. If you are doing this today, it requires a shim of Mix. This discussion is how to improve things so we don’t require a shim of Mix.

We want to address the second point but the question is: how do we control what is evaluated during boot/release? The initial proposal is about evaluating the exact same config files that we evaluate during a release. However, as we have seen examples in this thread, folks may even call System.cmd/2 in their config files. Simply executing all of the config files inside the release is prone to cause issues. A better approach would be if we could explicitly say what runs on boot. That’s what on_boot is about.

So when you say the config is already evaluated inside boot, I am wondering if you are thinking about the current state of 1 or 2? But those are exactly where we don’t want to be.

If many are talking about those points, it is a very strong sign that it is a problem in the scope of the proposal. When we introduced overridable and optional callbacks, there was a lot of confusion, so we broke the proposal in two and everything was clearer. So in my opinion something has to be done, even if that something is rewording the proposal or breaking it apart.

Although I think the issue does run deeper. Until very recently you couldn’t configure a release using mix configs, so the majority of developers treat mix config as compile time. But now we are saying that the config files will run both at compile time and runtime (on boot), this means we are adding ambiguity to the config files. We should discuss how this ambiguity should be addressed.

I think the issue with the current proposal is that it focused on making all of the configuration also run in a release. However, this has two issues:

  1. There is some code that you don’t want to run in a release
  2. The code that you want to run in a release is actually the smaller part of the configuration

#40

We want to address the second point but the question is: how do we control what is evaluated during boot/release? The initial proposal is about evaluating the exact same config files that we evaluate during a release. However, as we have seen examples in this thread, folks may even call System.cmd/2 in their config files. Simply executing all of the config files inside the release is prone to cause issues. A better approach would be if we could explicitly say what runs on boot. That’s what on_boot is about.

The reason we have a compile-time/run-time dichotomy in the first place is because people couldn’t use their Mix configs in a release - but that is where virtually everyone starts, and they find out the hard way that, oh actually, you can’t do that, because reasons. My take on the situation is that people already assume that configuration is unified by default, and only change that perspective once they have had to go to production. I think it represents not only an easier system to learn, but a pretty painless thing for existing applications to adapt to (anyone already deploying releases would be unaffected by these changes, barring things like invoking git inside their config.exs, and I think such things could be dealt with via warnings as part of the transition for this process (you’ve already invited discussion on that topic originally)).

In other words, I think releases should be in a position to work like Mix with regards to configuration (evaluating them during boot). I don’t think there is value in two different sets of config files, or in a special DSL for wrapping “boot-only” config bits. Those things should be pushed into application code then, not defined in config.exs.

If many are talking about those points, it is a very strong sign that it is a problem in the scope of the proposal. When we introduced overridable and optional callbacks, there was a lot of confusion, so we broke the proposal in two and everything was clearer. So in my opinion something has to be done, even if that something is rewording the proposal or breaking it apart.

Not to diminish anyone’s complaints, but having been dealing with every shade of configuration problem with releases for years now, I have seen far more people stumble against the fact that config.exs did two different things between dev and releases, than that it doesn’t do two different things. I think the current confusion in this thread is a symptom of the solution we’ve had in place for a very long time now, and that this discussion is derailing into solving a problem which shouldn’t exist in the first place. To put it more plainly, I think that by unifying configuration (i.e. one config file, one set of semantics), it becomes easier to understand, and is easy to adapt to for older systems.

Configuration should be mostly static in nature - if people need to execute “boot-only” code, it should go in their applications, not in the config files. It is not clear to me what else would fall under this umbrella and not belong in your application code.

I’m not sure I agree with the premise that either of these are true and fall under the umbrella of “configuration”. Do we have some examples?


#41

That’s definitely not the reason. Some configuration is truly compile-time. For example, if you change the ecto adapter in a release, it won’t work:

config :my_app, MyApp.Repo,
  adapter: (if System.get_env("DB_ADAPTER") == "mysql", do: MySQL, else: Postgrex)

So the configuration above is a compile-time configuration. There are many other examples. For example, the mime configuration mentioned in this thread. Some of the Logger configuration keys. Even the host configuration for force_ssl in Phoenix was compile time until recently fixed. This will exist regardless of the configuration provider and it is rather decided by how libraries consume configuration. This was less of an issue in the past because mix config only applied at compile time anyway.
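To illustrate how the way a library consumes configuration decides whether a key is compile-time, here is a small sketch contrasting a value captured in a module attribute with one read on each call. Demo.Settings, :demo_app and the :adapter key are made-up names for illustration; recent Elixir versions may warn about Application.get_env/3 in a module body and suggest Application.compile_env/3 for this purpose.

```elixir
defmodule Demo.Settings do
  # Evaluated once, while this module is being compiled. The value is baked
  # into the compiled module.
  @compile_time_adapter Application.get_env(:demo_app, :adapter, :postgres)

  def compile_time_adapter, do: @compile_time_adapter

  # Evaluated on every call, so it sees changes made at boot or later.
  def runtime_adapter, do: Application.get_env(:demo_app, :adapter, :postgres)
end

# Simulate a release changing the value after compilation.
Application.put_env(:demo_app, :adapter, :mysql)

Demo.Settings.compile_time_adapter() # still :postgres
Demo.Settings.runtime_adapter()      # now :mysql
```

Any key consumed in the first style cannot be changed by evaluating config on boot, whichever provider supplies the value.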

It is a problem that definitely exists and we have definitely seen confusion about this on Elixir, Ecto and Phoenix side of things too.

Then this is an argument for not supporting System.get_env/1 in config files in the first place. The only reason why we are doing this is because people want to read configuration such as HOST, PORT and DATABASE_URL at boot time (i.e. when the release starts).

  • For 1: see the git example in thread

  • For 2: because of what you just said above: configuration is mostly static in nature. Get any production application and notice that the number of configurations that actually use System.get_env is a small part of the overall configuration

I have a hunch that we are talking past each other here. :slight_smile:


#42

EgadsThisLongThread… My opinion slightly changes and refines by the end of typing this post, feel free to jump to the end for the TL;DR but a lot of context is missed then.

Yeah this is a good example.

In C++ we traditionally have multiple levels of such ‘configurations’ and information available to the program. First is of course the build system and its own data, like building in Release. This information is not available to the program unless explicitly passed in (or one of the few built-in variables) via the ‘macro’ (more like global Elixir module attributes, not like Elixir macros), thus being explicit, and when passed in this way they are only available at build-time, not at run-time. Next you have the config file generation, which is a source file with the above ‘macros’/global_attributes baked into the source for access at runtime, but overall they are still constants. Then of course you have the pure runtime configuration data such as the system environment (whose defaults of course could be pulled from the macros/attributes from before) or database or whatever else, which might only be read once then never again. Then you have the configurations that are accessed multiple times in the program and thus can be changed with impunity (and perhaps even scoped to certain areas so different code can access the data differently).

Elixir having a way to manage this simply would be quite nice.

For example, the git rev-parse ... bit would be run by the build system into a global named attribute (though these are scoped, but globally accessible from within their scope of compiled files), as well it would probably be baked into the source config definition file as well.

Things like URL hosts and so forth would likely have either a hard-coded or a macro/attribute default, but use an entry from the system environment if available.

Etc… etc…

And things like this, which are application specific, would literally just be passed in arguments to the libraries setup code (or in Elixir terms it would be passed in as options to the supervisor that is added to the application).

This is why I actually quite like the C++ forced style that configurations that are used at different times have to be put in different places, which is why I like the whole staged config style entirely.

Elixir kind of has this already, build system configuration is put in mix.exs, however all three of compile-time, load-time, and run-time configurations are combined into this mess that you cannot distinguish between so you don’t know which is safe or not to edit at load time (the system environment) or run-time (like Application.put_env) or so…

Things like this just yell to me that some of the configuration belongs in the system environment and NOT in the config files…

Really though:

  • All build system configs should be in mix.exs.
  • Compile-time global config should be in the config.exs file.
  • Compile-time environment-specific configs should be in the system environment, defaulting to the config.exs if unspecified.
  • Start/Load-time global config should be in the supervisor starts as options (that maybe should default to the compile-time config).
  • Start/Load-time environment-specific configs should be in the system environment or passed in via command-line options, defaulting to the supervisor start options (that maybe should default to the compile-time config).
  • Run-time global configs should be in the Application environment (defaulting up as above, which could of course pull and be updated from a database or something, though a callback that code could register for changes to specific application environment keys would be nice, so a new module would be good).
  • Run-time scoped configs should be passed in as arguments to functions or be some kind of scoped access via some new module to handle it all (delegating up as always).

Exactly, these kind of things belong in the system environment or passed in to the build system or passed in to the starting server depending (that could be updated at run-time as well).

I still think most/all of this stuff belongs in being passed to the startup supervisors for the relevant areas (or ETS or so).

Absolutely this, the system environment should be the default place to get many of these things, not some compile-time baked config file.

Or none depending, hardcoding configs belong as options to supervisors, user-configurable configs should be in the system environment, etc… etc…

I’m constantly amazed at the number of libraries that don’t allow load-time config adjustment, and especially the ones that don’t allow run-time config adjustment (irritating ueberauth’s strategies hardcode options into their modules, like what the heck were they thinking there…). A lot of this needs to be enforced better, somehow, so it is harder to make stupid decisions like that…

Optimally I’d prefer user-configurable is preferred to environment-configured is preferred to options-configured is preferred to build-time-configured is preferred to hard-coded, but it is amazing the number of libraries that do either the hard-coding or build-time-configured, thus forcing you to build new code and hot-swap-and-pray it into prod (or reboot the server and take it down during).

I could foresee almost all options that make sense to be changed at run-time existing in the Application environment, but there should also be a system to ‘listen’ to changes, important for, say, a database connection so it can spool up the updated connection on demand instead of checking the application environment on every-single-request, which is easier than the user looking up the code to call into the library to change the connection and have it update its internal bits that way.

Not really solved because a lot of libraries and main elixir things don’t support the {:SYSTEM, "blah"} setup to access them, nor an easy way for a fallback to be specified if not there, so you end up having to write the code yourself, every-single-time.
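For reference, the code one ends up writing by hand usually looks something like the sketch below: a helper that resolves {:system, "VAR"} tuples at call time and accepts a fallback. EnvConfig and its three-element {:system, var, default} form are assumptions for illustration, not an API of Elixir or of any library mentioned in this thread.

```elixir
defmodule EnvConfig do
  # Hypothetical helper, written only to illustrate the convention.

  # {:system, "VAR"} -> value of VAR, or nil when it is unset
  def resolve({:system, var}) when is_binary(var), do: System.get_env(var)

  # {:system, "VAR", default} -> value of VAR, or the default when it is unset
  def resolve({:system, var, default}) when is_binary(var) do
    System.get_env(var) || default
  end

  # Anything else is a literal value and is returned unchanged.
  def resolve(value), do: value

  # Fetch a key from the application environment and resolve it at call time.
  def get(app, key, default \\ nil) do
    app |> Application.get_env(key, default) |> resolve()
  end
end
```

A library would have to call something like EnvConfig.get/3 instead of Application.get_env/3 at every read site, which is why the convention only works where the library author opted in.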

Ugh, this is, again, what the system environment is for, it’s designed for this kind of stuff…

If it’s on disk and accessible by your program then it’s already more ‘open’ and available than environment variables (and you can encrypt environment variables just as easily, while also scoping their access and even being able to clear them on program start after it has read what it wants).

Or from a database or from a network server or from joe-bobs-config-store or from whatever.

Absolutely this, I store a lot of configs in the database except the libraries that are broken in their configuration handling (like a lot of ueberauth’s strategies).

Would be useful, especially if there was also an on_build/on_compile and an on_runtime somehow (though that would likely need another interface that is already mostly satisfied by Application.put_env and so forth, except for callback support on change).

That’s how it feels to me too, it doesn’t go far enough into Doing Things Right, which if it is being re-made then it really should be re-made all the way.

This, so very much this… The System Environment is already a great place for this for build time and load/boot time as it is scoped, settable by other configuration methods (database, environment, another server, encrypted blobs, whatever), and for run-time the Application Environment is already pretty decent, with perhaps some enhancements. I’m not strictly opposed to a config file (even an ‘executable’ config.exs one), but it should not be environment dependent and should be used as a fallback/defaults only (where the command-line or system environment can override anything and everything specified within it either at build time when calling mix or boot time by calling myapp foreground or so). I’d prefer a different file format entirely: instead of the current config system, each application’s config should be included, and inside the configs should be all the options that application wants/needs, the defaults they are set to, and help text that is used when the program is queried about the option, preferably even with restricted types it can be and so forth, which would make it much easier to validate options, make interfaces for them, etc… etc…

This, I’ve seen it happen quite a number of times, it just gets ‘good enough’ where most have these constant irritants in it but never big enough to encourage another rewrite. Java’s system is a bit abhorrent, but it does indeed support most of the options I want above out of the box, from an (XML) base config file of defaults, can pass options in the command line, can set them in the environment, can use them from a vault, can access them from a database via plugins, etc… etc…

That still only handles boot-time options, not run-time dynamically reconfigurable options, not that Ecto has many, but phoenix definitely could, and other third-party libraries VERY much can (ueberauth’s strategies is a constant thorn in my side). I really like how CacheX does it for example, or my own TimeAfter or so forth libraries, there are default configs, then options that override those passed to the supervisors, and that can further be overridden by changing the application environment, and that yet still can be overridden by passing further options to the actual functions.
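A rough sketch of that layering, with later layers winning over earlier ones. LayeredOpts, the option names and the app names are invented for illustration; this is not how CacheX or any other library actually implements it.

```elixir
defmodule LayeredOpts do
  # Layer 1: defaults shipped with the library.
  @defaults [timeout: 5_000, retries: 3]

  # Later layers override earlier ones:
  # defaults < supervisor start options < application environment < call options
  def resolve(app, start_opts, call_opts \\ []) do
    @defaults
    |> Keyword.merge(start_opts)
    |> Keyword.merge(Application.get_all_env(app))
    |> Keyword.merge(call_opts)
  end
end
```

Because the application environment is read inside resolve/3 on each call, a later Application.put_env/3 takes effect without restarting anything, at the cost of one lookup per call.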

Yeah for it to support this use-case it either needs to poll the Application environment constantly, or you yourself need to tell it to check it for changes, or if the Application environment has a callback then it could listen to it and know when to update its internal state based on the environment setting.

I’m not sure about this, Application.get_env and its ilk are very useful and great for even down to run-time configuration, if and only if libraries updated when you updated the values within it (this part needs fixing). Some things access it on every access, some don’t because it is infeasible to (like an active database connection).
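The Application module has no such change callback, so the third option would need a wrapper. A minimal sketch, assuming every writer goes through a hypothetical ConfigWatcher.put_env/3 instead of calling Application.put_env/3 directly:

```elixir
defmodule ConfigWatcher do
  use GenServer

  # Hypothetical module: it only sees changes made through its own put_env/3.

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, %{}, Keyword.put_new(opts, :name, __MODULE__))
  end

  # Register the calling process for changes to {app, key}.
  def subscribe(app, key) do
    GenServer.call(__MODULE__, {:subscribe, {app, key}, self()})
  end

  # Write to the application environment, then notify subscribers.
  def put_env(app, key, value) do
    GenServer.call(__MODULE__, {:put_env, app, key, value})
  end

  @impl true
  def init(subscribers), do: {:ok, subscribers}

  @impl true
  def handle_call({:subscribe, topic, pid}, _from, subscribers) do
    {:reply, :ok, Map.update(subscribers, topic, [pid], &[pid | &1])}
  end

  def handle_call({:put_env, app, key, value}, _from, subscribers) do
    Application.put_env(app, key, value)

    for pid <- Map.get(subscribers, {app, key}, []) do
      send(pid, {:config_changed, app, key, value})
    end

    {:reply, :ok, subscribers}
  end
end
```

A connection pool could subscribe once and rebuild its connections when it receives {:config_changed, app, key, value}, instead of re-reading the environment on every request. The sketch does not monitor subscribers, so dead pids are never removed.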

I still find this whole embedded environment method a horror coming from other languages, it seems so very limiting for the user to change… >.>

/me is starting to lean more and more to the Java way, though preferably sans XML…

And if a library is added then you need to update the ‘global’ main configs to include it, even if it had its own sensible default config…

And what if you are wanting to change the url host on dev so cross-machine access can be tested, now you have to change the config and recompile, instead of just setting a temporary environment variable and running it like elixir_args='-Dui.url.host=0.0.0.0' mix phx.server for a single invocation (as is the Java’ish way).

You really only want to do this if you have no other option, updating the runtime application environment is far superior.

Each umbrella app should hold and manage its own configuration. Things that are ‘configured’ from other apps in the umbrella should be passed in via arguments to supervisors or ETS or whatever.

Two different issues indeed, one is ‘when’ the configuration is needed by the system, and the other is ‘where’ that configuration comes from. Both really need to be solved far far better than they currently are though.

Very, not being able to control where a configuration value comes from and how to update it is a freaking major pain, the amount of things that get hardcoded at build-time that should be changeable at boot-time or even run-time is astoundingly painful…

No, no it’s not, it would be nice if it was, and sure the usual phoenix/ecto and so forth libraries support it, but so so very few third-parties support it. Compare this to Java where third-party libraries have a built-in default config, of which the main application/jar can override some of those defaults with its own (without respecifying the whole thing), and of course you as the user/server can override those on the command line or system environment or a vault or whatever else, all without the third-party application needing to be touched, they were never built with that in mind, etc… etc… (assuming the third-party library uses the standard Java configuration system, which almost every single one I’ve ever seen does).

I don’t see why? I use the system environment excessively for production and development (and rarely, but occasionally, on test too).

Yeah, they probably should be broken apart. One thing to determine how and where a config variable gets set, and another to figure out How To Determine ‘when’ the config is accessed.

Very much this yes.

And honestly I really really hate the environment concept. Just let me define the variables however I wish in my own context, like if I want to use a default json config file or something to override some other defaults then I could do ELIXIR_CONFIG=prod.json mix phx.server or so to specify some things to override the internal config defaults…

I’m thinking such namespace options as exampled shortly above would be quite useful differentiators.

Very true, this would just be a ‘release’ set of configs (that of course delegate defaults to the global build configs, but not boot or runtime for obvious reasons).

In the C++ ecosystem you ‘do’ indeed set all your options in the build script (say the CMakeLists.txt file, which is like C++'s mix.exs file), and yes you can indeed override them from the command line for the build and hardcoded boot setup (and many libraries allow for boot-time overriding via environment/commandline as well). Again, the JVM does not handle this bad at all…

Please… Then more libraries would accept options in a non-build-time-only way…

So you favor the JVM style then (which solves what you seem to be describing you want solved: it handles where the config is set, but is otherwise unified, it makes no distinction upon ‘when’ they are accessed but you can override it in specific scopes as well)?

That is how the JVM’s is. A static XML file of ‘defaults’ (please don’t use XML… or YAML, or JSON… erlang’s syntax I actually quite like though, or elixir’ified) that can be overridden in a variety of ways.

Yeah all this kind of stuff I think does not belong in the config.exs, but rather in either the mix.exs if it is build-time or in a supervisor options list or ETS or something if boot-time.

This is where the JVM style of a static file that can be overridden by a multitude of ways (including the system environment or command line or so) is immensely useful.

Honestly I think that should be a generated file, I.E. just in mix.exs that is used to do whatever (name output files or whatever) as well as could be dumped into a module at compile-time, same as both C++ and the JVM work to do the same things. It should absolutely not be in the config as it is very very static once a build is complete.

Precisely, most should be either baked in to code via mix.exs or via explicitly passing in options at boot time, the things that are changeable should probably use the JVM method of a static default config file that is overrideable via a vast variety of methods.

But yeah, I think I prefer the JVM model at this point odd as that feels for me to type, it’s similar to how C++ does it but more built-in, as it should be…