Runtime configuration library (with casting, validation etc...) for native releases

rbino · February 24, 2020, 11:27am

Hi everybody!
We’re in the process of moving a project consisting of several Elixir applications from Distillery 1.x and Conform to Elixir’s 1.9+ native releases.

The applications are configured at runtime with environment variables (they’re deployed with Kubernetes), and one of the handy things the previous setup gave us was the ability to describe configuration declaratively: marking variables as required, providing custom casting/validation, etc.

Validation and casting are clearly possible by writing custom code in releases.exs, but that tends to clutter the configuration, so I was looking for a library or something else to help me with this task.
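For illustration, hand-rolled casting and validation might look like the sketch below. The module name `ReleaseEnv` and both helpers are hypothetical and not taken from any library mentioned in this thread; they are wrapped in a module only so the snippet stands alone, and the same logic could be inlined in releases.exs.

```elixir
defmodule ReleaseEnv do
  # Hypothetical helpers, not part of any library mentioned in this thread.

  # Reads a required variable and casts it to an integer, raising on bad input.
  def fetch_integer!(name) do
    raw = System.fetch_env!(name)

    case Integer.parse(raw) do
      {value, ""} -> value
      _ -> raise ArgumentError, "#{name} must be an integer, got: #{inspect(raw)}"
    end
  end

  # Reads an optional variable restricted to a fixed set of allowed values.
  def get_enum(name, allowed, default) do
    value = System.get_env(name, default)

    if value in allowed do
      value
    else
      raise ArgumentError, "#{name} must be one of #{inspect(allowed)}, got: #{inspect(value)}"
    end
  end
end
```

Even with helpers like these, every variable still needs its own call site and error message, which is the clutter a declarative library would remove.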

Searching around here on the forum, I’ve come across these:

Specify by @Qqwy
Skogsrå by @alexdesousa
A series of Tweets by @sasajuric describing a yet-to-be-opensourced library.

Is there any other library I can add to the above list, or am I missing some usage patterns that would make me happy with just releases.exs? Do you have any other advice on this matter?

Thanks in advance

keathley · February 24, 2020, 12:08pm

We’re using Vapor extensively at my work: https://github.com/keathley/vapor/.

The README probably needs to be touched up a bit. We’ve added a lot of ergonomics (specifying translations with a config value, grouping providers, etc.) that aren’t well documented atm.

sasajuric · February 24, 2020, 1:08pm

Yeah, we’ll opensource that lib, though I’m not exactly sure when. I’ll try to make it happen soon-ish

As a bit of background, when I wrote that lib, I took a glance at vapor (I wasn’t aware of skogsrå at the time, and I only learned of specify from your post). Perhaps it was just the lack of docs, but my impression was that for the stuff we wanted, we’d have to implement most of the functionality ourselves. Since my clients currently only use OS env, the conclusion was that vapor would basically be used as a glorified System.get_env wrapper, so we didn’t end up using it.

As demoed in those tweets, here are some of the things we wanted:

The ability to automatically create operator templates (.env files). This is an absolute must for us, because without it producing the list of required & optional vars is extremely error prone.
Providing sane defaults in dev/test allows us to make the projects runnable out of the box. As long as you’re e.g. running postgresql on the default port, the code will work without any special configuration. In particular, we don’t need to maintain a set of separate .env files or use envrc (which was also error prone, leading to frequent situations where project works for one dev, but doesn’t for others).
Compile-time generation of access functions makes it possible to detect spelling errors (e.g. db_poll_size) during compilation.
Included typespecs improve reasoning about the code.
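As a rough sketch of the first point, a template could be rendered from a declarative spec along these lines. The spec format (`{name, opts}` with an optional `:default`) and the `EnvTemplate` module are assumptions made for illustration, not the API of the library shown in the tweets.

```elixir
defmodule EnvTemplate do
  # Hypothetical spec format: {name, opts}, where a :default marks the
  # variable as optional. Not the API of the library from the tweets.
  def render(specs) do
    Enum.map_join(specs, "\n", &render_line/1)
  end

  # Optional variables are emitted commented out, showing their default;
  # required variables are emitted empty so the operator must fill them in.
  defp render_line({name, opts}) do
    var = name |> Atom.to_string() |> String.upcase()

    case Keyword.fetch(opts, :default) do
      {:ok, default} -> "# #{var}=#{default}"
      :error -> "#{var}="
    end
  end
end

specs = [db_host: [], db_pool_size: [default: 10]]
IO.puts(EnvTemplate.render(specs))
```

Because the template is derived from the same spec the application reads, the list of required and optional variables cannot drift from the code.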

All this being said, I feel that we don’t need such fragmentation, and that it would be better if we somehow rallied behind a single implementation. If other authors are open to this idea, I’m up for discussing it further.

But in any case, I feel we definitely need some solutions which make operator configuration simple, so in that sense, I guess it’s better to have four libs instead of zero

keathley · February 24, 2020, 1:41pm

More likely these features just didn’t exist yet. Vapor was pretty bare bones. We’ve added more ergonomics based on feedback from the other engineers at B/R. But a lot of those haven’t been documented in the README yet.

Vapor allows you to provide defaults for a binding. But we typically recommend people don’t do that unless they have to. Defaults are only for convenience and will eventually lead to bugs in production. I was initially of the opinion that Vapor shouldn’t support defaults at all but was eventually overruled. In order to support a nice dev experience, we check in the .env files to our repos. Vapor’s dotenv provider supports a hierarchy of files and one of those is .env.local. This is where developers will overwrite values or add tokens, etc. We ensure *.local files are gitignored so they don’t get checked in.
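The file hierarchy idea can be sketched in a few lines. This is not Vapor’s dotenv provider, only a minimal illustration under the assumption that files are plain KEY=value lines and that later files override earlier ones; the `DotenvLayers` module is hypothetical.

```elixir
defmodule DotenvLayers do
  # Parses "KEY=value" lines, skipping comments and anything without an "=".
  def parse(contents) do
    contents
    |> String.split("\n", trim: true)
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&String.starts_with?(&1, "#"))
    |> Enum.filter(&String.contains?(&1, "="))
    |> Map.new(fn line ->
      [key, value] = String.split(line, "=", parts: 2)
      {String.trim(key), String.trim(value)}
    end)
  end

  # Later files win, so pass paths from least to most specific,
  # e.g. [".env", ".env.local"]. Missing files are skipped.
  def load(paths) do
    paths
    |> Enum.filter(&File.exists?/1)
    |> Enum.map(&(&1 |> File.read!() |> parse()))
    |> Enum.reduce(%{}, &Map.merge(&2, &1))
  end
end
```

With this ordering, a checked-in .env supplies the shared values and a gitignored .env.local only needs the keys a developer wants to change.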

keathley · February 24, 2020, 1:50pm

I meant to reply to this above. I’m not sure if I have strong feelings on the fragmentation. I will say that the likelihood of B/R moving to a different library is pretty slim. It’s already a lot of work to get all of our services converted to use vapor. I don’t think we’ll change the way we do configuration again just for kicks ;). That also means that I’ll be supporting Vapor for a while though.

alexdesousa · February 24, 2020, 2:03pm

I developed Skogsrå a while ago, because we found configurations could get messy in the long run when you have many releases. Right now, I’m using it extensively in:

My personal project.
Some of the open source libraries I maintain.
The company I work for at the moment.

Though the startup I’ve developed this for doesn’t exist anymore, I kept using the library and adding the features I needed. The ones I use the most are the ones described here

Anyway, regarding the following:

I agree with @sasajuric. We (the maintainers of all those libraries) could join forces and actually develop an Elixir built-in solution. Our experience configuring production systems all these years might come in handy when developing a unified solution.

However, I also agree with this. This gives us a diverse ecosystem.

hauleth · February 24, 2020, 5:44pm

I still want to implement Dhall for Erlang so I would be able to use it with my Elixir projects. Other than that I just use sys.config or handle everything “manually” in Application.start/2 callback.

sasajuric · February 24, 2020, 10:21pm

The way we approached this is by allowing the following spec:

{:db_name, dev: "my_db_dev", test: "my_db_test"}

This is interpreted as follows:

in dev mode, the default is my_db_dev
in test mode, the default is my_db_test
both defaults can be overridden via os env
there’s no default in prod - the app will fail to start if os env is not provided
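The four rules above can be expressed as a small pure function. This is only a sketch of the described behaviour, not the library’s code: the `ModeDefaults` module, the upcased variable name (`DB_NAME`) and passing the OS env as a plain map are all assumptions.

```elixir
defmodule ModeDefaults do
  # `os_env` is a plain map standing in for the OS environment so the
  # function stays pure; `mode` is :dev, :test or :prod.
  def resolve({name, defaults}, mode, os_env) do
    # Assumed naming convention: :db_name is read from DB_NAME.
    var = name |> Atom.to_string() |> String.upcase()

    case {Map.fetch(os_env, var), Keyword.fetch(defaults, mode)} do
      # A value from the OS env always overrides the per-mode default.
      {{:ok, value}, _} -> {:ok, value}
      # Otherwise fall back to the default for the current mode, if any.
      {:error, {:ok, default}} -> {:ok, default}
      # No OS env value and no default (e.g. prod): refuse to start.
      {:error, :error} -> {:error, "#{var} is required in #{mode}"}
    end
  end
end

spec = {:db_name, dev: "my_db_dev", test: "my_db_test"}
ModeDefaults.resolve(spec, :dev, %{})
ModeDefaults.resolve(spec, :prod, %{})
```

The first call resolves to the dev default, while the second returns an error tuple because prod has no default and nothing was provided.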

I believe that this is a good trade-off between convenience and production safety. Dev defaults are a part of the codebase and we don’t need to keep .env files in the git repo. This reduced a recurring issue of a dev forgetting to update a .env file because the thing worked on their machine, and also removed a bit of duplication.

audaxion · February 24, 2020, 11:15pm

I’ve found confex pretty pleasant to use.

keathley · February 25, 2020, 12:24am

If that’s working for you then that seems fine. We’re pretty happy avoiding default values in code like that wherever possible, and we default to always loading from the same source regardless of the “mode” the app is in. We just try to avoid a distinction between “dev”, “test”, and “prod” wherever possible. We’ve found it less confusing when things go wrong, and it accommodates a wider range of developer workflows. But those are our preferences. They probably aren’t going to be correct for everyone.

chulkilee · February 25, 2020, 12:28am

I haven’t tried any tools yet (just maintaining releases.exs) - just waiting for the good patterns with built-in mix release

One thing I’d like to ask: what kind of validation besides type checks (e.g. parsing to integer) do you want to do at config-read time rather than inside the app init?

I know there are some “bad configs” with which we don’t even want to let the whole app start… but doesn’t that duplicate work? Shouldn’t we leverage apps and supervision trees instead?

Also I’m curious how well these tools work with Config.Provider - or whether we need some changes in Config.Provider.

Another idea: I’m wondering whether we can define a common configuration spec in each app module, and expose it back to config tools, not vice versa. By doing that, we can enforce that the config is actually being passed (e.g. avoiding an error where I add config in the config tool but don’t use it in the app…)

keathley · February 25, 2020, 12:48am

I’ve seen people do this with Vapor. The module will define a group of providers. All of the providers are composed together and loaded on application start.

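defmodule MyApp.Process do
  # Namespaced so the module does not shadow Elixir's built-in Process.
  alias Vapor.Provider.{Group, Env}

  # Declares the configuration this module needs as a named provider group.
  def config do
    %Group{
      name: :process_config,
      providers: [
        %Env{bindings: [port: "PROCESS_PORT"]}
      ]
    }
  end
end

defmodule MyApp.Application do
  use Application

  def start(_, _) do
    # Load every module's provider group once, at application start.
    config = Vapor.load!([
      MyApp.Process.config()
    ])

    # MyApp.Process is assumed to also define child_spec/1 (e.g. via use GenServer).
    children = [
      {MyApp.Process, config.process_config}
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end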

Qqwy · February 25, 2020, 10:58am

This is a very interesting discussion!

I do not see the current landscape as overly fragmented. It seems like the different solutions have different design goals behind them, and exploring all of those is valuable.

Nevertheless, what I think is most valuable are discussions like these, in which we can compare approaches.

Specify was created based on a description of Vapor, when Vapor was still vaporware (either during last year’s ElixirConf.EU or on one of the Elixir Outlaws episodes). It took the idea of a layered stack of configuration providers from Vapor.
However, the main and more important idea behind Specify is to make it explicit what keys (having which types) you are expecting your configuration to define. Based on the configuration specification it will:

Automatically validate/parse the values passed in, raising errors when values are malformed.
Add a description of all the configuration fields to the documentation of the module it is defined in.
Raise errors when required fields are missing (i.e. are not defined anywhere in your configuration stack).

As such, I think that’s quite close to the “expose back to config tools, not vice versa” that @chulkilee is asking about.

Qqwy · February 25, 2020, 4:29pm

By the way, another key idea behind Specify was to be explicit in what is read from where. It is meant to be able to be used by libraries just as much as applications, which was based on a discussion on this forum two years ago about structuring configuration (Rethinking app env - #20 by blatyo).
The key idea is that a library can specify clearly what values (and types of values) are expected, as well as default ways to structure the configuration-layering. People using the library can then override this default configuration-layering (as well as the values passed at any of those layers).
The configuration-layer that always takes the most precedence is the one where we pass in values explicitly when calling YourConfigModule.load(explicit_values: %{...}, _other, _options) or YourConfigModule.load_explicit(..., _options), to allow for easy testability or per-location defaults (rather than only per-process or only globally).
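The layering idea can be sketched without any library. The code below is not Specify’s implementation; the `LayeredConfig` module, the required keys and the layer contents are hypothetical, and it only illustrates that layers are merged in order with explicitly passed values applied last.

```elixir
defmodule LayeredConfig do
  # Required keys are hypothetical, chosen only for this example.
  @required [:host, :port]

  # `layers` is ordered from lowest to highest precedence; explicit values
  # are merged last so they override every other layer.
  def load(layers, explicit_values \\ %{}) do
    merged = Enum.reduce(layers ++ [explicit_values], %{}, &Map.merge(&2, &1))

    # Fail if a required key is not defined anywhere in the stack.
    case Enum.reject(@required, &Map.has_key?(merged, &1)) do
      [] -> {:ok, merged}
      missing -> {:error, {:missing, missing}}
    end
  end
end

defaults = %{host: "localhost"}
from_env = %{port: 5432}
LayeredConfig.load([defaults, from_env], %{port: 6543})
```

Passing explicit values at the call site is what makes per-test or per-location overrides possible without touching any global state.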

@sasajuric I wonder about the choice of your new to-be-released configuration library to use functions to retrieve the configuration for two reasons:

This does not allow ‘local’ overrides to the configuration. Of course, one might argue that the kinds of settings that need local overrides should not be stored in (this kind of) configuration at all. However, the burden of deciding whether that is the case for some value is then placed on, e.g., a library designer, unless your library is only intended to be used by top-level applications and not by other libraries.
It hides the internal details of fetching the configuration. What if fetching configuration is slow but we write code where we call one of the configuration functions many times per second? This is probably the kind of stuff that works fine in development but might break in production, where configuration might be fetched from other locations. And what if race conditions happen where we fetch two related configuration-values, but between the two reads the configuration source is updated, so we end up with one ‘old’ and one ‘new’ value?

Nevertheless, I love the idea of having clear specs (and therefore code-completion-suggestions and potentially some type-checks or other compile-time-checks) and would love to pick your brain on if there are ways to combine this while leaving the two properties I mentioned intact.

@chulkilee

I think this depends on what you are configuring.

Wrong field names being used in the code that consumes the configuration could be caught at compile-time.
Missing configuration should prevent the piece of code that requires that configuration from running. If your whole application depends on something, that should prevent it from starting. If only part of it depends on it, you can still use the rest (just using normal supervision techniques; processes that require something fetch that during their startup).
Although rare, sometimes it makes sense to reload configuration at runtime (i.e. alter the configuration independently from your app’s release cycle). We do get close to the deeper question of ‘when can we call something configuration’?

I think this is also covered by my reply above.

This is a good question. Specify predates the new Mix release changes, and I have not had time to look into how these changes might enable special interoperability.

A related question to ask, however, is in which way the two should interact:
Should we see Mix releases as a single configuration source, or should we instead see Specify/Vapor/Saša’s new tool as a source for Mix’s way to do configuration?
I currently lean towards the former, because Mix’s way of doing configuration is more rigid than what these tools can provide, but I am very interested in the opinions of @keathley and @sasajuric and anyone else on this matter.

sasajuric · February 25, 2020, 5:03pm

I’m certain that we don’t support all the scenarios. The lib was designed specifically for the limited (but very real) set of problems which my clients face. So for example, we currently don’t support nil at all. You can provide an optional value but you have to give it a default which is not nil. I’m sure this is not enough, and I spent a bit of time thinking it through, but I don’t yet have a clear view on how to tackle it, and we don’t need it so far, so I’m still letting it simmer.

Not sure what you mean, but we do allow local dev/test defaults as I said before:

sasajuric:

The way we approached this is by allowing the following spec:
{:db_name, dev: "my_db_dev", test: "my_db_test"}
This is interpreted as follows:

in dev mode, the default is my_db_dev
in test mode, the default is my_db_test
both defaults can be overridden via os env
there’s no default in prod - the app will fail to start if os env is not provided

Is that what you have in mind, or are you talking about some other scenarios?

This is currently indeed not tackled, simply b/c we’re only using OS env, so we don’t care about it. However, there is some basic plumbing in place to make it work.

First, beyond individual getters, we also inject fetch_all which returns the complete map. This allows clients to cache the stuff however they want to. For example, you could retrieve this during app startup and pass it through the supervision tree. Alternatively, you might want to store it in some ets table, or even app config (which is after all an ets table).

Admittedly, with fetch_all you lose compilation guarantees, but we still get typespecs, so that’s still something at least.

It’s also worth mentioning that fetch_all paves the way for merging configurations from different sources. Basically, you can define multiple config modules, fetch all from them, and then perform a map merge.
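A minimal sketch of that merge, assuming each config module exposes a fetch_all/0 that returns a map. `DbConfig`, `HttpConfig` and `MergedConfig` are hypothetical stand-ins with hard-coded values, not modules generated by the library.

```elixir
defmodule DbConfig do
  # Stand-in for a generated config module; values are hard-coded here.
  def fetch_all, do: %{db_name: "my_db_dev", db_pool_size: 10}
end

defmodule HttpConfig do
  # Second stand-in source, again with hard-coded values.
  def fetch_all, do: %{http_port: 4000}
end

defmodule MergedConfig do
  # Fetches every source once and merges the results into one snapshot.
  # On key conflicts, modules later in the list win.
  def snapshot(modules) do
    modules
    |> Enum.map(& &1.fetch_all())
    |> Enum.reduce(%{}, &Map.merge(&2, &1))
  end
end

config = MergedConfig.snapshot([DbConfig, HttpConfig])
IO.inspect(config)
```

Taking one snapshot during app startup and passing it down the supervision tree also means related values are read together, rather than at different moments.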

It’s not perfect though, b/c you currently have to copy-paste the definition. I’ve deliberately coupled config definition with the adapter, b/c that’s the simplest interface that fits my client’s needs, but I’m aware that it’s not very flexible, so I’m definitely open to expanding it, though I’d like to hear the exact use cases first.

Another plumbing for faster retrieval is in the source adapter contract which looks as follows:

@callback values([Provider.param_name()]) :: [Provider.value()]

So the generic code asks the source to fetch all the params at once, which means that the adapter can make a single round trip to the underlying source, such as an external database.
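To make the batch contract concrete, here is a sketch of a behaviour with that callback shape plus an OS env adapter. The `ConfigSource` names, the type definitions and the upcasing convention are assumptions; the real library’s modules may look different.

```elixir
defmodule ConfigSource do
  # Hypothetical behaviour modelled on the callback quoted above; the type
  # definitions are assumptions.
  @type param_name :: atom()
  @type value :: String.t() | nil

  # Receives all requested names at once and returns values in the same order.
  @callback values([param_name()]) :: [value()]
end

defmodule ConfigSource.SystemEnv do
  @behaviour ConfigSource

  # Reads the whole OS environment once, then looks up every requested name.
  # Assumed naming convention: :http_port is read from HTTP_PORT.
  @impl true
  def values(param_names) do
    env = System.get_env()

    Enum.map(param_names, fn name ->
      Map.get(env, name |> Atom.to_string() |> String.upcase())
    end)
  end
end
```

An adapter backed by an external store could implement the same callback with a single query for all names instead of one request per parameter.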

We might add some internal caching logic too, but I want to defer that until the need arises. The benefit of internal caching is that we could keep functions as getters (so compile-time guarantees), and still have fast access. OTOH, caching brings a lot of complexity, so I’m a bit cautious about it.

sasajuric · February 25, 2020, 5:08pm

My current opinion on this matter is that I avoid doing anything in releases.exs if I can help it. The reason is that this is free-form code which runs only in production, which means that it’s not easily (if at all) testable on CI. I’m not confident with writing such code, and so I like to keep as much of config as possible outside of it. Providing config during app boot has a few shortcomings, but on the plus side it brings production and dev/test much closer to each other, and so I strive to have as much config as I can in the app start.

Qqwy · February 25, 2020, 5:50pm

Thank you for your detailed response!

I mean: What happens if you want to use something twice in your application (or even twice within a single process), configured differently? Use cases of these are for instance:

Any library that depends on configuration but you might want to use two or more times in your application.
Single tests wanting to run in a slightly different environment from other tests (for instance to test how the app responds being configured in certain ways).
Abstract datatypes where we want to configure what concrete implementation of that datatype is being used for some reason (like performance vs. memory tradeoffs) in one place of our code differently from other place(s).

sasajuric · February 25, 2020, 6:01pm

Then I believe you need two config values. E.g. if you want to run two Phoenix endpoints, say main site and admin, then you need e.g. public_http_port and admin_http_port, right?

I’d first attack this by having the code under test accept config as function parameters, and then provide different parameters from the corresponding tests.
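A small sketch of that approach: the code under test takes its config as an argument, so each test can pass a different map. The `Mailer` module and its fields are hypothetical.

```elixir
defmodule Mailer do
  # Hypothetical module: config arrives as an argument instead of being
  # read from a global getter, so each caller (or test) can supply its own.
  def build_request(to, body, config) do
    %{
      url: "https://#{config.host}:#{config.port}/send",
      headers: [{"authorization", "Bearer " <> config.token}],
      payload: %{to: to, body: body}
    }
  end
end

# Two differently configured calls can coexist, even in the same process.
prod_like = %{host: "mail.example.com", port: 443, token: "secret"}
test_like = %{host: "localhost", port: 8443, token: "test-token"}

Mailer.build_request("a@example.com", "hi", prod_like)
Mailer.build_request("a@example.com", "hi", test_like)
```

Only the outermost caller needs to know where the values came from, so the config getters can stay at the application boundary.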

This is something I’m not considering for the foreseeable future. In my view, config provisioning should be about fetching atomic values (connection params, auth tokens, pool sizes, logger levels & such). Building arbitrary complex data representation is IMO usually best left to the app code. Perhaps I’m wrong, but I definitely want to start lightweight and expand from that. That said, I’d like to hear more about concrete use cases you have in mind.

I should mention that our provider internally uses ecto changesets for type conversion, so custom complex types and arbitrary variations might be possible via ecto types. TBH I didn’t even explore this, we just use changesets because they are convenient for converting types and reporting errors

keathley · February 25, 2020, 6:34pm

These are exactly my feelings. Our individual solutions might look slightly different. But this is the core problem.

Exadra37 · February 25, 2020, 8:18pm

Checking a .env file into git is a bad security practice. The .env should always be in the .gitignore file. Not doing so can leak sensitive information in plain text into the version control system. Even if you are in control of that system, and it’s private, you should not leak the .env file outside the server running your code.

Instead you should use the .env.local with the sane defaults for development, but I would prefer to be explicit in its name and call it .env.dev.

Then the README for the project could have the instructions to copy the .env.dev file to .env, or if you have a setup script for development you add this step there.