Rethinking app env

sasajuric · May 22, 2018, 1:53pm

This is a spin-off from the discussion about the new config proposal. I’m replying to this post by @michalmuskala separately, to avoid noise in that thread.

Currently, it seems to be a bucket for all sorts of things, which includes system configuration. I definitely agree that it’s not a good place for operator configuration.

This sounds vague. Could you elaborate on what kind of loose coupling we achieve by e.g. having pubsub: [name: UI.PubSub, adapter: Phoenix.PubSub.PG2] in config?

What is that “everything else”? What are the criteria for defining a parameter in the config, and not in the code? For example, why should a pubsub name and adapter go to config, but e.g. supervisor name and restart intensity be in the code?

Qqwy · May 23, 2018, 1:26pm

A thing I’d like to mention here, since it definitely does not belong in the proposal topic, although it might also be a bit tangential to your posts, @sasajuric (but I first started thinking about this while reading your blog article):

It actually strikes me as weird that there are so many configuration settings that will require an os-application to completely restart. To be perfectly honest, even if parts of the os-application (some of the beam-applications) require configuration to happen at compile-time, shouldn’t we find a way to re-compile them at runtime instead?

sasajuric · May 23, 2018, 3:11pm

The thing is that app env is a mutable storage, but very often values are retrieved only once. You don’t need to restart the entire app to apply the changed setting, but you do need to restart the processes which read the setting.

To make some property truly dynamic, you should avoid “caching” it into a variable. This will in some cases not be enough by itself, and might require much more work. For example, to make an http port dynamically configurable, you need to poll for app env change, then start another server and drain the old one. It’s probably not worth the hassle, and so I’d just suggest proclaiming the http port constant. But if it’s a constant, then why keep it in a mutable store which gives you an illusion of reconfigurability, and which might even be accidentally changed, which could lead to some strange consequences?
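A minimal sketch of the difference between a value “cached” at start and a value read on every access. The application name :demo_app, the :greeting key, and both module names are hypothetical and exist only for illustration:

```elixir
defmodule CachedGreeter do
  use Agent

  # Reads app env once, when the process starts. Later changes to
  # app env are not seen until this process is restarted.
  def start_link(_opts \\ []) do
    Agent.start_link(fn -> Application.get_env(:demo_app, :greeting, "hello") end)
  end

  def greeting(pid), do: Agent.get(pid, & &1)
end

defmodule DynamicGreeter do
  # Reads app env on every call, so a changed value is picked up
  # by the next call without restarting anything.
  def greeting, do: Application.get_env(:demo_app, :greeting, "hello")
end
```

If the setting is changed with Application.put_env/3 after CachedGreeter has started, CachedGreeter keeps returning the old value while DynamicGreeter returns the new one.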

That’s why I think that in many cases app env is not a good place to keep settings to begin with. A lot of stuff ends up there just because app env “feels” like the place to keep some parameters. If we keep in mind that app env is just another storage, just like OS env, or an external file, or a database, and remember the properties of app env (global, mutable, in-memory), then I think it becomes clearer that it’s not a good place to store every single piece of data, especially not the data which never changes during the system lifetime.

bitwalker · May 23, 2018, 4:04pm

I think a great deal of the abuse of the application env occurs because it is simply easy to do so. It is more difficult to think about how to make your library or application configurable by feeding in parameters to a top-level supervisor or something, than it is to just stuff anything configurable into config.exs and use Application.get_env/3. As long as the easier path is available, there will always be people who take it. Erlang didn’t have quite as much of an issue with this, because it was more of a pain to use the application env for configuration than it was to configure via parameters.
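As a rough sketch of the “feed parameters in from the supervisor” approach, here is a worker that receives everything it needs as plain options rather than calling Application.get_env/3. The module name MyLib.Worker, the :pool_size option, and the registered names are assumptions made up for this example:

```elixir
defmodule MyLib.Worker do
  use GenServer

  # Options arrive as plain data from whoever starts the worker.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.fetch!(opts, :name))
  end

  def pool_size(server), do: GenServer.call(server, :pool_size)

  @impl true
  def init(opts), do: {:ok, %{pool_size: Keyword.get(opts, :pool_size, 5)}}

  @impl true
  def handle_call(:pool_size, _from, state), do: {:reply, state.pool_size, state}
end

# In the host application's supervision tree, each instance gets its
# own parameters, so two differently configured workers can coexist.
children = [
  {MyLib.Worker, name: :worker_a, pool_size: 10},
  Supervisor.child_spec({MyLib.Worker, name: :worker_b}, id: :worker_b)
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)
```

The library keeps its default in code (5 here), and the caller overrides it only where needed.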

I think a related issue is compile-time configuration. Some of the examples José mentions need configuration inputs at compile-time, and there is no other means to provide them except config.exs, since Mix evaluates it during compilation. Had Mix gone the route of Erlang in regards to application env, the whole discussion would be moot, since config.exs would be limited to runtime configuration. Since it did not, we also have the problem where config.exs has become the catch-all for anything configurable. People are using System.cmd/3 to get a git shorthash in config.exs, rather than doing something like -Dcommit_hash=$(git describe --tags --long) as a flag to the compiler, which is how such things are typically handled in just about every other language I’ve worked with. That isn’t suitable for everything though, and is unwieldy for repeated use. This is definitely a place where config.exs eases a lot of pain - compile-time parameters and global defaults.

I was hoping that by treating config.exs as a runtime config, and having mix.exs handle defining what to include or not include at compile-time, we could get the best of both worlds. It doesn’t really solve the issue of how to prevent people from abusing the app env, but I’m not sure that is something we can really fix at this point. About the only time you could make that kind of change is with a theoretical Elixir 2.0, but it’s not clear how you could do that without making configuration more inconvenient in general. It seems like a bit of a catch-22 to me.

sasajuric · May 23, 2018, 4:32pm

Yeah, I agree with this 100%

I think it could be improved if onboarding resources promoted plain data passing, and generators such as mix phx.new preferred init/2 over app env.

Some other minor additions, such as the ability to compute a value before compilation (so e.g. mix.env) and fetch it in a module could also help with the trickery which is System.cmd in mix scripts.

So in general I think that some education, pushes from popular generators, and a couple of minor additions from the language and mix could at least guide us all in a better direction.

jeremyjh · May 23, 2018, 6:34pm

I think the config.exs kind of makes sense for configuring modules/applications provided by your dependencies. I may not have any application code that should be the particular owner for those parameters. It may also not make sense for me to manage a server provided by the dependency, just so it can be configurable. People would end up implementing these servers wrong and make their library single-threaded when it has no other reason to be (this already happens sometimes).

OvermindDL1 · May 23, 2018, 10:13pm

These are exactly my complaints. There needs to be a registerable callback for these (send a message to a process or something on change)…

Yeah, there are a LOT of third-party libraries (and some first-party ones too *coughs*) that put things in the config that should be passed in via supervisor arguments or so…

Indeed. mix.exs should be build-time configuration, arguments should be boot-time, and the application environment should be run-time. If the application environment was not accessible at build time at all then that would go a long way to help (the theoretical Elixir 2.0 though, but perhaps warnings could help for now).

Precisely, everything should be in such an ‘init’, though this would be a great place to fit the JVM-style static/overridable configs into as well.

Except why can’t they specify their own default run-time configurations instead of having to put it in your own config (many third-party libs don’t have good defaults in the code), or be able to inherit it from a dependency that depends on another dependency without you (again) copying it into your own config? The Java model handles all this well.

Qqwy · May 24, 2018, 3:22am

I guess they can and should already do so now; this is more a flaw in how those libraries have been written than in the current way we do configuration, I think.

As for changing configuration on the fly, I agree with @OvermindDL1 and @sasajuric that probably the only feasible way would be to call a ‘reconfigure’ function when you want to change something, rather than altering the Application environment (the ‘ground’) under any process’ feet.
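One possible shape for such a ‘reconfigure’ function, sketched with a GenServer. The Throttle module and its :limit option are hypothetical; the point is only that the owning process is told about the change and decides how to apply it:

```elixir
defmodule Throttle do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Explicit entry point for changing the settings of a running process.
  def reconfigure(server, new_opts), do: GenServer.call(server, {:reconfigure, new_opts})

  def limit(server), do: GenServer.call(server, :limit)

  @impl true
  def init(opts), do: {:ok, %{limit: Keyword.get(opts, :limit, 100)}}

  @impl true
  def handle_call({:reconfigure, new_opts}, _from, state) do
    # The process applies the change itself; here it just swaps the value,
    # keeping the current limit if no new one is given.
    {:reply, :ok, %{state | limit: Keyword.get(new_opts, :limit, state.limit)}}
  end

  def handle_call(:limit, _from, state), do: {:reply, state.limit, state}
end
```

A real process might need to do more work in that clause (e.g. drain connections), which is exactly the work that silently mutating app env skips.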

@bitwalker What about adding an :extra_compile_args list to mix.exs? I completely agree that using System.cmd in the configuration file is very odd.

sasajuric · May 24, 2018, 7:00am

This still seems vague and leaves me puzzled. Why should I set a pubsub name in config.exs but provide e.g. a Registry name in my code? Both pieces of data are parameters to my dependencies (i.e. to the code which is not a part of the project).

jeremyjh · May 24, 2018, 11:42am

The difference is you are managing the Registry server, it isn’t started automatically just by including the application in your dependencies. While it makes sense to pass all the parameters to servers you control, my point was just that servers are not the answer to all configuration problems. It would be a worse problem if we forced library authors to write servers to collect and hold compile-time config, which is I think where your proposal leads us. Unless I misunderstand something.

I’m sure you won’t find anyone defending that. Good defaults and usability are important regardless of how we collect configuration data. The JVM system has its failure modes as well: I once had to debug a production system I inherited that had the same properties deployed in three different files as well as the launch command. That isn’t the fault of the JVM, but of poor development and operations practices. Your complaint about library defaults seems similar.

edit: pubsub is actually a bad example for me; I thought it was started separately as it’s a separate library, but I looked into it and it’s started by the Phoenix endpoint, which is started ultimately by the user’s endpoint module in Application.start(), so config could be supplied either to that child or in the Endpoint use macro, I believe.

sasajuric · May 24, 2018, 12:06pm

Ah, I see what you mean now.

AFAIK, the pubsub server is also started in my supervision tree. It couldn’t be started otherwise, as it’s ultimately obtained from init/2 of my endpoint, which is also in my supervision tree.

In other words, this is not automatically started when the Phoenix app is starting, and hence that’s not a reason to require placing the pubsub name (which AFAICT is not even a registered name, but just some piece of data used internally by Phoenix machinery) into a config script or app env. In fact, for my blog, this data is not in app config.

Not sure which proposal you refer to. I’m certainly not suggesting that there’s a one-size-fits-all approach to providing data. That’s certainly not app config, and those are also not processes, nor app env, nor any other mechanism of passing data.

sasajuric · May 24, 2018, 12:36pm

Right, or to make a wider point here: if a library is starting a server during its boot, then that server is by default a singleton, and the library should provide its name. What’s the point of that name being configurable?

Now, what you probably wanted to point out is that some libraries do require some options during their boot. I definitely agree that providing parameters, such as log level, which are needed during a dependency’s (in this case Logger) startup, must be done via config scripts.

However, as we now both agree, that’s not the case for e.g. pubsub name, so it’s a matter of judgement. The reason I’m poking on this property is because the Phoenix generator (phx.new) includes it by default in app env, together with a bunch of other stuff which is IMO not configuration, at least not by default.

So I’m still unsure why the pubsub name is a configuration, or how such configuration gives us loosely coupled components (the claim stated by @michalmuskala).

Qqwy · May 25, 2018, 5:27pm

@sasajuric isn’t the pubsub name configurable to allow for multiple PubSub providers within the same (os-)application?

sasajuric · May 25, 2018, 6:19pm

The pubsub name is provided by client code, which is of course a good thing, because we can have multiple pubsubs. I’m wondering why this name has to exist in config.exs? What’s the difference between this name, and say a registry name, which we usually provide as a function parameter. Why is pubsub name a “configuration”, and registry name isn’t?

A broader question: what is a configuration? What are the criteria for providing some parameters via config.exs vs just using functions?

dimitarvp · May 25, 2018, 6:31pm

For all my years of programming in at least 8 languages and no less than 30 frameworks the only ever meaningful answer to this question for me was…

Because it is easier to consolidate in a single file.

Sometimes there’s a secondary answer which is “this is really important to set up on boot and isn’t expected to change” but that’s tangential and not directly related. I would argue that this “configuration” centralization is useful because library defaults could sometimes conflict with your own code artifacts (imagine if you were oblivious of configuration and figured your app needs the exact same hierarchy of options that ecto needs – as a random example).

How would you answer your own question, by the way?

sasajuric · May 25, 2018, 6:43pm

But what exactly is being consolidated? Some pieces of data go to config scripts, while others are provided through regular code. So config scripts do not really contain all the parameters of our system, not even all parameters we pass to external libs.

I have no answer to that question. Config scripts seem quite arbitrary to me. A bunch of unrelated data is stuffed together for the reasons which I can’t fathom.

dimitarvp · May 25, 2018, 6:49pm

I would say “the bizarre artifacts of your current tech” which in the case of Erlang & Elixir would be app names, different environments (dev / test / prod). And I am probably forgetting a few more.

I would venture to say that everything outside of these really should be function parameters or module attributes.

However, in his blog post about mocks José gives a really good example of a Twitter client implementation that should vary between environments. But yeah, that depends on the tech’s peculiarities, in this case different environments.

So would you agree that only your tech stack’s specifics should be subject to the current incarnation of configuration in Elixir?

sasajuric · May 25, 2018, 7:00pm

I’ve argued at length in my recent post that IMO most of the data stashed in config scripts doesn’t really belong there. I think that the only viable reason for putting something in a config script is if a library requires it during its boot (e.g. Logger config), or during compilation (which I think is a really bad pattern).

I believe that in most other cases we’d be better off if we consolidated the data by their purpose (e.g. provide all endpoint parameters in the endpoint module, and all repo parameters in the repo module).
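To illustrate what “consolidated by purpose” could look like, here is a plain-module sketch in which all endpoint-related parameters are assembled in one place. It only mirrors the general shape of an init/2 callback; it is not Phoenix’s API, and the module name, the keys, and the PORT variable are assumptions for the sake of the example:

```elixir
defmodule MyApp.EndpointConfig do
  # Hypothetical stand-in for an endpoint's init/2: every endpoint
  # parameter is computed here, next to the code that uses it.
  def init(_type, base_config) do
    # Runtime value taken from the OS env, with a default kept in code.
    port = String.to_integer(System.get_env("PORT") || "4000")

    config =
      Keyword.merge(base_config,
        http: [port: port],
        pubsub: [name: MyApp.PubSub]
      )

    {:ok, config}
  end
end
```

Calling MyApp.EndpointConfig.init(:supervisor, []) returns {:ok, config} with the :http and :pubsub entries filled in, without anything being read from config.exs.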

So I definitely agree that there are legacy reasons for stashing data in config scripts. However, I also think that generators, such as mix phx.new, store too much data there, and by extension inspire other developers to overuse config scripts. I feel that this excessive usage of config scripts is the main cause of the problems people experience with Elixir configuration.

dimitarvp · May 25, 2018, 7:08pm

Agreed 100%. I haven’t been responsible for deploying to prod but I observed and listened to people doing it and like you, also found most configurations to be really out of place.

Nailed it. The problem then becomes, should all these modules have mandated locations? Say, lib/boot/<library_boot_module>.ex, maybe? And if it’s not mandated, wouldn’t people just splatter all that information all over the place and make the situation even harder?

EDIT: Just remembered about Rails’ config/initializers/ paradigm. It seems like a decent idea to duplicate in Elixir, don’t you think?

blatyo · May 25, 2018, 7:25pm

What about situations where your config may come from an external source like etcd or a database? Would you then make each of those modules responsible for talking to an external service to get their config? How would you handle a library you didn’t own needing that config?

I feel like this sort of situation warrants a centralized place to get config. Because otherwise you make each of the configurable apps responsible for knowing how to get their configuration.