Rethinking app env

For all my years of programming in at least 8 languages and no less than 30 frameworks the only ever meaningful answer to this question for me was…

Because it is easier to consolidate in a single file.

Sometimes there’s a secondary answer which is “this is really important to setup on boot and isn’t expected to change” but that’s tangential and not directly related. I would argue that this “configuration” centralization is useful because library defaults could sometimes conflict with your own code artifacts (imagine if you were oblivious of configuration and figured your app needs the exact same hierarchy of options that ecto needs – as a random example).

How would you answer your own question, by the way?

But what exactly is being consolidated? Some pieces of data go to config scripts, while others are provided through regular code. So config scripts do not really contain all the parameters of our system, not even all parameters we pass to external libs.

I have no answer to that question. Config scripts seem quite arbitrary to me. A bunch of unrelated data is stuffed together for the reasons which I can’t fathom.

2 Likes

I would say “the bizarre artifacts of your current tech” which in the case of Erlang & Elixir would be app names, different environments (dev / test / prod). And I am probably forgetting a few more.

I would venture to say that everything outside of these really should be function parameters or module attributes.

However, in his blog post about mocks Jose gives a really good example for a Twitter client implementation that should vary between environments. But yeah, that depends on the tech’s peculiarities, in this case different environments.

So would you agree that only your tech stack’s specifics should be a subject to the current incarnation of configuration in Elixir?

I’ve argued at lengths in my recent post that IMO most of the data stashed in config scripts doesn’t really belong there. I think that the only viable reason for putting something in a config script is if a library requires it during it’s boot (e.g. Logger config), or during compilation (which I think is a really bad pattern).

I believe that in most other cases we’d be better off if we consolidated the data by their purpose (e.g. provide all endpoint parameters in the endpoint module, and all repo parameters in the repo module).

So I definitely agree that there are legacy reasons for stashing data in config scripts. However, I also think that generators, such as mix phx.new store too much data there, and by extension inspire other developers to overuse config scripts. I feel that this excessive usage of config scripts is the main cause of the problems people experience with Elixir configuration.

2 Likes

Agreed 100%. I haven’t been responsible for deploying to prod but I observed and listened to people doing it and like you, also found most configurations to be really out of place.

Nailed it. The problem then becomes, should all these modules have mandated locations? Say, lib/boot/<library_boot_module>.ex, maybe? And if it’s not mandated, wouldn’t people just splatter all that information all over the place and make the situation even harder?

EDIT: Just remembered about Rails’ config/initializers/ paradigma. It seems like a decent idea to duplicate in Elixir, don’t you think?

What about situations where your config may come from an external source like etcd or a database? Would you then make each of those modules responsible for talking to an external service to get their config? How would you handle a library you didn’t own needing that config?

I feel like this sort of situation warrants a centralized place to get config. Because otherwise you make each of the configurable apps responsible for knowing how to get their configuration.

I personally think they shouldn’t. In most cases, a library should take it’s parameters as function args (or return value of callback functions), and not enforce how this data is provided.

I would argue that using config scripts is splattering the data all over the place. For example, consider endpoint parameters which are by default sitting in four config scripts (config.exs, dev.exs, test.exs, prod.exs), but the ultimate source of truth is still endpoint module, where in init/2 that config data is further enhanced. Wouldn’t it be more consolidated if you listed all the endpoint parameters in init/2?

It’s been a really long time since I’ve worked with Rails, but my memories of its configuration and initializers are not good memories :slight_smile: I prefer an explicit mechanism which fits well with the standard way of writing Elixir code. We already have that for our applications, and I think we could have the same for booting of the entire system.

Just call something like EtcdClient.fetch(...) or Database.fetch(...)?

In a simple scenario, there could be a wrapper module for fetching stuff from the configuration storage. Let’s call it EtcdConfig. Then other modules, such as endpoint and repo call the functions from such module.

In a more complicated scenario, you might want to cache these parameters, maybe periodically refresh them. In such cases, I’d use a dedicated ETS table (but not app env), and the implementation would become more complex, involving a couple of processes. But still, from the client’s perspective things would be simple, as client modules would just call well defined functions from the module responsible for fetching the data.

That depends on how the library accepts values.

If it accepts them as function parameters, or callback function return values, then it’s as simple as ExternalLib.some_fun(fetch_value_from_wherever(...)).

If it requires app env value but not during its boot, then it’s more complicated, but it can be handled during the client app’s start/2 callback.

If the library requires app env value during its boot, then currently we have no good solution, and need to resort to various kind of trickery.

You lost me here. What sort of situation warrants a centralized place, what would such place be, and what belongs to the config to begin with? This words gets thrown around a lot, but I haven’t seen anyone explaining when some piece of data is consider a “config”, instead of just plain function argument.

2 Likes

IMO in Docker / Kubernetes / container-tech-of-the-month scenario. However, thinking of it, there’s no problem to actually supply some kind of module attributes at deployment and just compile in place… but then people would complain they don’t want compilers in their container images. Maybe cross-compile preliminarily and just deploy the release later?

I want you to know I am fully on your side here – just trying to get a better grip at this whole situation (and I felt the official proposal thread wasn’t a good place to show my not being very well informed).

It definitely looks like everything can either be module attributes, or supervision tree init parameters, or just full pure functions.

That being said, how would you pass Ecto.Repo connection parameters only once? It’s really impractical to pass them around at every call… Probably something like YourApp.Repo.set_connection_params? But would not that mandate flushing of pools and reconnecting due to the possibility that the connection parameters might potentially now point to an entirely new database?

EDIT: months ago I found Discord’s FastGlobal library and that caught my eye as a way to have a mutable global state with extremely quick reading times. Maybe that’s one possible way?.. But then again, that would just shift the same thing from config.exs to somewhere else… Eh.

1 Like

I’m focusing on Elixir code exclusively in this thread. I don’t question the fact that in a more complex system things need to be managed in some dedicated storage. A couple of years ago we used etcd to configure a heterogeneous distributed system, and that worked well for us. But these external storages are outside of the scope of what I’m discussing here. In this thread I’m questioning the need to define a bunch of arbitrary data in elixir config scripts (config.exs and friends).

Because some libraries require app env during their boot, and there’s no easy way to run custom code during the system boot, I think config scripts are currently the best place to provide such data. Other than that, my opinion is that most of the data can and should be provided in regular code.

Something like this:

# config.exs
config :my_system, MySystem.Repo, adapter: Ecto.Adapters.Postgres

# repo.ex
defmodule MySystem.Repo do
  use Ecto.Repo, otp_app: :my_system

  def init(_arg, default_ecto_opts), do: {:ok, Keyword.merge(default_ecto_opts), db_config()}

  defp db_config() do
    [
      hostname: ...,
      database: ...,
      ...
    ]
  end
end

If you don’t need to change these parameters at runtime, you don’t need fastglobal. Just define a function which returns plain constant.

3 Likes

I think this is the right answer. I do also think though, that people developing ad-hoc solutions to providing different configuration depending on the environment (test, dev, prod) is undesirable.

Let’s take Repo as an example. I am not talking here about having different host names and credentials - that is trival to solve in the environment variables. But rather differences that would belong in code such as using a sandbox pool in test, using longer timeouts in production, different pool sizes etc. Just looking at my current project and its various environments, I could easily end up with a few tens of lines of configuration code in my Repo module, which would tend to discourage it being used for anything else - so it just becomes another configuration file.

I have to have some structured method to handle these compile-time environment differences. Maybe all that needs is a config macro and best practices disseminated through the generators, but we have to have some real alternative - we can’t just say “put it in code”. Well, actually we can say that and people will get on with it but their solutions will be very divergent.

1 Like

I think that this cuts to one of the root causes why config scripts are used so much.

So to be clear, my opinion is that config scripts are way overused, and contain a lot of stuff that doesn’t belong there. I consider them a pile of bloat arbitrarily thrown together, and I think that they often make the code more difficult to understand.

IMO, there are a couple of reasons why so much data ends up in config scripts:

  1. Some of our flagship libraries (Phoenix and Ecto) promote it.
  2. Some libraries require it.
  3. There’s no obvious or convenient way to specify variations between different mix envs.

To be completely honest, at my company we also overuse config scripts, precisely for the reasons stated above. So for example, even though we’re aware of if Mix.env() and how/when we can use it, if a constant varies between different envs, we usually just stuff it to config script.

I further believe that this convenience subtly creates a tendency to put even more data into config scripts. So for example, my colleague recently argued that some MFAs belong to config scripts, because they “feel” like config, even though MFA is clearly code, not config. This was for me the direct motivation to write that lengthy post about config scripts, and to actively start questioning them.

To be clear, I used them myself a lot, even though I was never quite comfortable with them, which is why I’m increasingly starting to feel that they are misleading people and causing problems much more then they actually help because:

  1. Config scripts are not runtime friendly, and cause confusion when used with OTP releases.
  2. Parameters are not consolidate anymore. For example, endpoint parameters are now specified in at least four config files.
  3. A bunch of unrelated data is thrown together.

Therefore, I feel that instead of adding the additional complex machinery to address the issue number 1 (which is just one issue of config scripts), it would be much better if runtime configuration through regular code was promoted and assisted.

In particular, I’d like to see:

  1. Official helper macros to simplify expressing small variations between mix envs in regular code. Perhaps something along the lines of this macro, or something different/better.
  2. mix phx.new, Ecto/Phoenix docs, Elixir docs, and official getting started guides favouring runtime configuration over config scripts.

This will not solve all the problems, but I believe it will solve most of them, and that it will guide the community to write their code and libraries in a better way.

Now, some members from Elixir and Phoenix core team have mentioned that many people were further confused when init/2 callbacks were introduced. Personally, I don’t at all buy that this means that runtime config is confusing. The thing is that prior to init/2 a typical Elixir project had at least four files where parameters were specified (config, dev, test, and prod.exs). And then the fifth one was added. No wonder that people found this even more confusing.

But until we guide people to provide their parameters as much as possible in the regular code, it’s unfair to say that runtime config is confusing.

I should also state that I was not a fan of init/2 when it was proposed. Personally, I felt that endpoint and repo should just take their parameters as argument to start_link and child_spec/1. This has issues with hot code reloading, but that’s an advanced scenario anyway. My impression is that with init/2, Elixir/Ecto/Phoenix team decided to make the runtime configuration more complex to simplify advanced (and arguably infrequent) scenarios at the expense of more complex interface for everyone. My feeling is that this was not a good tradeoff. I suspect that the callback style interface of init/2 also adds to the confusion people have.

In summary, I think that Elixir/Phoenix/Ecto team historically favoured config scripts way more than runtime configs in regular code, and that’s why we’re here. Perhaps, instead of adding more complex machinery to config scripts, a better way would be to assist and promote runtime configuration through regular code and plain old passing of arguments to functions. I feel that this requires much less interventions in the language and that it can take us very far, although it will admittedly take time to move the community to such style of configuration.

8 Likes

What was the issue with hot code reload in that scenario?

See here for details.

Earlier in this thread, the question ‘what is configuration’ and the related question ‘what belongs in the configuration’ were asked.

My two cents: Configuration can be defined as ‘the subtle reshaping* of the behaviour of this instance of your application to allow it to work well in the concrete environment it will run in.’

*: (the Latin verb configure literally means ‘to mould’)

So if some behavior will be the same in all your app’s instances, it should be described in your codebase itself.

If some behaviour differs per instance, describing it in your codebase will result in code that is hard to re-use: hard-coding locations and authentication details of a database, for instance, would not make sense inside the code.

So this is where I would deliniate the difference between config/not config. In many cases, though, something that is configurable for one of your app’s dependencies is fixed for all your app’s instances,and should therefore not be part of your app’s config. But I do not believe Elixir has a way to support this: These types of config end up ‘bubbling up’ to the final user of the library/app stack. I think this might be the source of the ‘overuse’ of config files, and that it is something that could definitely be improved.

2 Likes

@sasajuric thank you.

1 Like

So then all the stuff in config.exs is not configuration at all? Also, all the things which are the same in dev/test/prod.exs are also not configuration? Finally, the things which are fetched from external sources (e.g. at Aircloak we fetch database parameters from a json file) are also not configuration?

This also confuses me. Config scripts are a part of the codebase.

1 Like

Indeed, I think that many of those are not really configuration at all, unless you think that it might be changed for all or some of your app’s instances in the near future (which could be called different instances of the same app). What I mean with (not) ‘in your cosebase’ is that not all configuration, especially not your production configuration, is part of your source code repository (you do not copy them over using git or similar tools). Also, configuration files always lives at the peripheral of your system, i.e. outside of the ‘lib’ folder for Elixir systems.

Let me elaborate a little more, since I finally am back on the computer rather than on my phone:

Files like config.exs contain configuration that might be applicable to all your app instances today, but:

  1. It provides default values that are overridden by more specific configuration settings in other configuration files. How much of this is applicable vs. how much of this should just happen in code is subjective; writing the default values here is more explicit (so it is easier to find the default value), but if adding a specific configuration setting is the only way to make a dependency library work, then I’d say the dependency is more leaky than it should be.
  2. It provides values that might differ between the app today and the app tomorrow (i.e. the near future). Examples of these might be the logger level or network connection strategies of a Nerves system, the amount of encryption rounds to use for your PKBDF2 (which you want to increase over time as hardware becomes better) etc.
  3. When there is no sensible default because the library does not care (the chosen value depends on the environment of the application (the (type of) system the app runs on) rather than the implementation of the application, but a choice has to be made. Settings like the Erlang Cookie, the Phoenix Endpoint Secret Key Base, the folder into which attachment files should be uploaded, etc. fall into this category.

There is obviously some overlap between (1), (2) and (3) in certain cases.

In this way, what is inside your config.exs is configuration. And what is in your dev/test/prod.exs is configuration. And the database settings you receive from an external location are definitely configuration because you probably are not doing this in the development- or testing-instances of your application.


An example of something that I think is ill-suited for specifying in configuration, and rather should be part of your application’s real code itself, would for instance be what locales you’d like to use for CLDR. This is a very interesting project, but the way you specify what locales your app requires is by adding this to your configuration. They even include a special ‘compiler’ so changes to its configuration are picked up as ‘code changes’. For the locales, I think it would make more sense if inside a module that would like to use CLDR, use CLDR, locales: [:en, :nl, :de] would be used, with the implementation of CLDR’s __using__-macro gathering the required locales in that way.

Another problem over the high amount of settings that libraries shift off to configuration files, is that configuration is effectively global for your application, so using the library in one ‘configuration’ for one part of your OS-app (like one of your BEAM apps) and a different one for another becomes impossible.

There are some other libraries I’ve used in the past that just don’t do anything, or even throw errors at app launch unless you’ve added certain snippets to your configuration, although I am currently not able to remember them (I also believe I did not end up using them exactly because this).


I like libraries that allow an (OS-)app-wide global setting with module-local overrides (by e.g. using the use Foo, config: ... syntax or similar). I also like what Decimal does with the Decimal.with_context(fn ->... end) construct to (temporarily) alter rounding behaviour.


I hope this gives some more context :slight_smile: .

2 Likes

So then this stuff, injected by phx.new doesn’t belong here at all?

config :my_site, MySiteWeb.Endpoint,
  # ...
  render_errors: [view: MySiteWeb.ErrorView, accepts: ~w(html json)],
  pubsub: [name: MySite.PubSub, adapter: Phoenix.PubSub.PG2]

How exactly is it easier to find the default value in a config script? I need to first know that the value is provided by config scripts (since in fact many values are not defined in config scripts), and I also need to know its name. So it looks like I still need to consult the code first, possibly also read a library documentation, before I can find the value.

How do I determine such values? The examples you’ve mentioned are the things which IME change maybe once or twice in a period of few years.

I’ve just checked the git log of one of our project’s config folder. In the past year we had some changes there. Most of them were additions of new properties, some of them were deletions, and I was able to find only one case where some value has actually been changed. And that change was due to a Phoenix upgrade, not due to “reconfiguration”. Our config scripts were more frequently modified due to making comments, than due to changing a value of some “configuration”.

We actually had more frequent changes of values typically provided in the regular code (e.g. Supervisor parameters, GenServer timeouts) than “config” values.

Perhaps we completely failed in organizing our configuration?

I’m confused here. The examples you mentioned don’t end up in app env (Erlang cookie is a VM arg, not provided by a config script), or don’t need to be in app env (secret key base).

I’m not much wiser :slight_smile: The only clear rules I can make so far are:

  1. Use config script if required by library
  2. Use config script if values change between different mix env

The first reason can’t be avoided (although opening up an issue with the library might help).

The second reason deserves to be questioned. Given that Elixir has other means of making a decision based on mix env, why are config scripts the best approach?

@Qqwy your point on cldr is very well taken and for Version 2 that is exactly the approach I am taking. Its much cleaner - but it took me a lot of learning to figure that out.

I am still a little uncomfortable with the complexity associated with creating user-defined module(s) to host the compile-time generated functions when the public API generated has a large surface areas as cldr does - but thats for another topic.

2 Likes