What is the difference between using {:system, "PORT"} and System.get_env("PORT") in deployment?

This mostly comes up when people want to read shell environment variables. There is a not-so-obvious reason for doing so: cloud platform providers such as Heroku force users to do it.

An example is the PORT environment variable, which has to be read from the shell in order to configure Cowboy to bind to the proper port. If you don’t do that, no requests will reach your application at all. The same issue applies to the database connection URL, and many add-ons require that you read your config from shell environment variables and configure things at runtime rather than at compile time. It’s not ideal, but that’s the way it is.
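
To make the distinction concrete, here is a hedged sketch (the :my_app and MyApp.Endpoint names are hypothetical). The {:system, ...} tuple defers the lookup until the library reads the value at runtime, while a plain System.get_env call is evaluated when the config file itself is evaluated, which for a release can be build time rather than runtime:

# In config/prod.exs — the tuple form, resolved by the library
# (e.g. the endpoint) when it starts:
config :my_app, MyApp.Endpoint,
  http: [port: {:system, "PORT"}]

# Reading the variable directly — evaluated when this config file is
# evaluated, which may be release build time rather than runtime:
config :my_app, MyApp.Endpoint,
  http: [port: System.get_env("PORT")]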

2 Likes

I understand that part of the problem; what I don’t understand is why you need to add a special hack to solve a problem which is easy to solve anyway with the existing system. Starting the system is very versatile, and it is easy to specify multiple config files or args files (of which vm.args is an example) at start-up time. These files are easy to generate in a way that reads shell variables, as sketched below.

That is the bit I don’t understand.
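
For illustration, a pre-start script along these lines could turn shell variables into an Erlang config file before the VM boots (a hedged sketch; the file path and :my_app name are made up):

# Hypothetical pre-start script (run as an escript or a release hook):
# read shell variables and write them out as an Erlang config file that
# is then passed to the VM at start-up.
port = System.get_env("PORT") || "4000"

File.write!("/etc/my_app/runtime.config", """
[{my_app, [{http, [{port, #{port}}]}]}].
""")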

2 Likes

Elixir works perfectly fine with Erlang config files. The problem is that they differ significantly from what people are used to when it comes to deployments. People expect it to “just work™” in an ad-hoc way - that’s how most systems they’ve used worked, and generating a file on the fly isn’t what I’d call “just working”.

1 Like

This doesn’t work for development/testing/CI, unless you are building releases for those environments. Furthermore, it doesn’t compose well: if I have 3 applications in an umbrella, I would like to place each config close to its application, instead of having to remember to specify the configuration later on when building the release. And even if you edit vm.args, you still can’t pass the system variables on appups or relups.

Plus, some systems have their configuration in .json or other formats written by their deployment system; how would you suggest they read and parse those config files in vm.args?

Which part of reading configuration for a process in its init callback is a hack? Because the currently recommended approach is simply that. Or are you referring to the old {:system, ...} tuple?
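
For reference, the pattern being referred to is simply this (a generic sketch, not tied to Phoenix; the module name and the PORT default are hypothetical):

defmodule MyApp.Worker do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Runtime configuration is resolved here, when the process starts,
  # so it works the same in dev, test, CI, and releases.
  def init(opts) do
    port = String.to_integer(System.get_env("PORT") || "4000")
    {:ok, Keyword.put(opts, :port, port)}
  end
end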

1 Like

At first glance, I have to admit that I’m not sure I like this approach.

I feel that supporting something like Phoenix.start_link(endpoint_plug_module, other_params) would be a much better choice. The benefit of this is that it’s explicit, very flexible, and you can make it code-reload friendly by wrapping the invocation inside your own module. Also, it’s consistent with how other types of processes are started, so it doesn’t introduce a special case.

But the most important benefit of that approach is that I feel it sets a good example. Every now and then, I come across a library that requires me to set the configuration, where a simple parameter passed to start_link would suffice. I feel that, at least to some extent, this style is promoted by Phoenix and Ecto, both of which mandate a config setting and a dedicated module.

Notice that I’m not against using app env in “leaf apps”, i.e. our own custom projects that depend on Phoenix, Ecto, and others. But I do feel that the vast majority of libs shouldn’t require app envs at all. The only exception that comes to mind is Logger, which is a global/singleton kind of library, so I guess having it require configuration is fine in this case.

2 Likes

You cannot make it code-reload friendly with the API above. The endpoint is a supervisor, and we need to redefine the supervision tree in case the user wants to start listening on a new port or change any stateful configuration. Doing that under code reloading is only possible if you do it inside init. Any value you pass through start_link will be permanently stuck and not code-reloadable.

If you take a step back and ask where the best place is to configure a supervisor so its children can change under code reloading, the answer is going to be in init. This is very much aligned with the child_spec discussion happening on elixir-lang-core right now.

I believe, though, that it is still worth debating whether we should rely less on the application environment, although the need for per-environment configuration is almost a given for Ecto and Phoenix. Calling Ecto.start_link(Repo) instead of Repo.start_link() could also be an improvement, so we stop injecting start_link everywhere, but that will likely become irrelevant if the child_spec proposal is accepted, since nobody will call Repo.start_link directly anymore and we will then be free to move it elsewhere.
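
A minimal, self-contained sketch of that point (module names are hypothetical, and an Agent stands in for the real HTTP listener):

defmodule MyApp.EndpointSupervisor do
  use Supervisor

  def start_link(opts \\ []) do
    Supervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Because the children are computed inside init/1, a code upgrade that
  # re-runs init can pick up a new port without tearing this supervisor
  # down; values passed to start_link/1 would be frozen at boot instead.
  def init(_opts) do
    port = String.to_integer(System.get_env("PORT") || "4000")

    children = [
      # Stand-in child; in the real case this would be the HTTP listener.
      {Agent, fn -> %{port: port} end}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end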

1 Like

I’m not very familiar with code reloading, so maybe I’m missing something, or my thinking is too naive. But I’ll try anyway :slight_smile: Let’s say that I have something like:

defmodule MyEndpoint do
  def start_link, do: Phoenix.Endpoint.start_link(...)
end

Wouldn’t it be enough to produce an appup file which reloads the MyEndpoint module, and then instruct the parent supervisor to restart the child?

I definitely think we should rely less on the app env. One thing bothering me is that if I want to start an endpoint during tests, I need to set up some app envs. Even Phoenix itself is hurting from this problem, because it has to manually configure a test endpoint. I had to do the same in our Elixir client for Phoenix channels (see here). This is clumsy, and it’s not immediately obvious how I’m supposed to do it. Being able to do something like Phoenix.start_link(...) would be much nicer.
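
Concretely, with the current approach a test suite has to do something along these lines before the endpoint can be started (a hedged sketch; the app and module names are hypothetical):

# In test_helper.exs or a setup block: the configuration must exist in
# the application environment before the endpoint process is started.
Application.put_env(:my_app, MyEndpoint,
  http: [port: 4002],
  server: true
)

{:ok, _pid} = MyEndpoint.start_link()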

I’m not sure why we need the user’s repo module at all. Why can’t we just call Ecto.Repo.start_link(...)?

When it comes to Phoenix, users still need to provide the endpoint module, but that module could be a pure plug in most cases (one exception being code reloading, in which case you might introduce start_link as I argued above).

Another benefit of this approach is that you can easily run a dynamic number of instances. For example, I can start multiple repos as instructed by the end user (we’re actually building such a system at my company). When it comes to the endpoint, I could have a single template endpoint plug module, and dynamically start/stop actual web servers depending on user interaction. These cases are arguably more exotic, but they showcase how dropping app env gives us more flexibility.

Again, just to be clear, as an end-user of Phoenix or Ecto, I would likely use app env to configure my endpoints/repos. However, I do believe that libraries themselves should not enforce that usage.

2 Likes

The database adapter does some code generation in the repo. There are also some named ETS tables in the repo (named after the repo module) used for the query cache. There is work being done to remove those limitations, but it’s not there yet.

3 Likes

What I wanted to say is that I feel that whatever is currently accomplished with MyRepo should be achievable without that module. I’m not really familiar with the internals of Ecto, so perhaps I’m missing some fine-print details.

Happy to see that some work is already being done here :thumbsup:

2 Likes

No, because changing how the supervisor is started does not change any supervisor that is currently running. It is similar to a GenServer: if I upgrade the GenServer module, any GenServer currently running will still have the old state and needs to go through code_change. For a supervisor, the code_change step goes through init as well.

I believe this would be easily achievable in both. But it still means any compile-time option needs to be given elsewhere, such as on use Phoenix.Endpoint, with the remaining options given to start_link. Putting them in the application environment allows you to not care whether an option is resolved at compile time or at runtime.
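
To illustrate the split being described (the start_link options here use the hypothetical API under discussion, not an existing one):

defmodule MyEndpoint do
  # Compile-time options would have to live where the module is defined...
  use Phoenix.Endpoint, otp_app: :my_app
end

# ...while runtime options would be passed explicitly at start-up
# (hypothetical call; today these come from the application environment):
MyEndpoint.start_link(http: [port: 4000])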

1 Like

Maybe I’m missing something here, or we’re talking past each other. In my proposal, the parent of the endpoint knows nothing about the configuration. It only knows that the endpoint is started via MyEndpoint.start_link/0. So if you reload the MyEndpoint module, and then tell its parent to restart the child, I believe that should work properly, since the new version of MyEndpoint.start_link/0 will be invoked, and that new version will pass the changed endpoint options.

I’m assuming here that we need to restart the endpoint if the listening port is changed. My understanding is that this is true for your proposal as well. Or maybe you envision a scenario where port can be somehow dynamically reconfigured without taking the endpoint down?

But it still means any compile-time option needs to be given elsewhere

What compile-time options are there? Preferably, we should have as few of those as possible, because they are not really flexible.

2 Likes

Can you write some pseudo code showing the supervision tree starting from the application callback? Because as far as I understand in your proposal, if you have this code:

# In the application tree
worker(MyEndpoint, [])

# In MyEndpoint
def start_link do
  Phoenix.start_link(__MODULE__, [http: [port: System.get_env(...)]])
end

Then you can only reload the options if you force start_link to be invoked again, which implies terminating the whole Phoenix.Endpoint supervision tree and starting a new one.

However, if the configuration is moved inside init, then we never need to bring the whole Phoenix.Endpoint supervision tree down; we can reload the tree without bringing any undesired process down, because the granularity is inside its own init. In other words, the further up you move the configuration, the harsher the reload has to be.

I am arguing that the best place to load dynamic configuration in any OTP service is inside init. And that’s what we want to push as best practice.

1 Like

Oh, that’s precisely what I thought - we need to restart the entire endpoint to apply the changed network port.

I guess that you’re suggesting that you can do better by restarting only part of the endpoint subtree. That’s fine, but I still wonder what that means in practice. Can you get away without restarting the Cowboy processes? If not, then the question is which processes would in fact survive a network port change.

I don’t think it’s as simple as that. It’s a tradeoff, because that approach makes default usage more complex, and my feeling is that most people are not using code reloading anyway. So now we end up with a new, and IMO more complicated, configuration mechanism to support the less likely case of “I want to change the listening port and restart as few processes as possible”. Notice that changing endpoint options is the only case we’re discussing here. If you want to upgrade the endpoint plug, a router, a controller, or a view, AFAICT it should work normally.

Of course, the case you mention is still valid, so it would be nice if Phoenix made it possible. Having a callback like you suggest, maybe renamed to dynamic_configure or something like that, would be nice. But I wouldn’t want to see this as the primary method of configuration, because it’s more complex and rigid, and imposes some constraints (e.g. it’s hard to start a dynamic number of endpoints then). Another variant is to accept a configuration callback module as an (optional) parameter. If provided, Phoenix would always invoke that module to get the option values.

I guess my general point is that we should consider introducing special solutions and recommendations for people that want to do fine-grained code reloading, while keeping the straightforward usage for default cases (which also covers some variants of code reloading).

3 Likes

I disagree with this, because we already have a simple and general-purpose mechanism that works in all services: init is available in GenServer, Supervisor, GenStage, Phoenix, Ecto, etc. Why introduce something special if we already have something consistent and simple that works everywhere?

Furthermore, I don’t even buy that the proposed approach is simpler: what is the difference between asking developers to implement MyEndpoint.start_link/0 and asking them to implement MyEndpoint.init/1?

1 Like

Sorry, I missed this earlier. There are very few on both the Ecto and the Phoenix side. They are documented in the Ecto.Repo and Phoenix.Endpoint modules (and we are working on reducing them).

2 Likes

I think we’re dealing with multiple issues here, so my main goal is getting somewhat diluted. AFAIK we don’t actually have a simple mechanism for Phoenix (or Ecto), because options are driven by hardcoded app env settings. I’d like to see that disappear :slight_smile:

You’ve pointed out that this has some problems with code reloading, so we’ve now diverged into some more complex corners (which are nonetheless still important). Regardless of that, my point still stands: I’d like to see pure Phoenix.Endpoint.start_link and Ecto.Repo.start_link which take options as function parameters, even if that approach won’t work for (all) code-reloading cases.

MyEndpoint.start_link is not my main proposal. It was an idea of how I could still keep some support for code reloading with pure Phoenix.Endpoint.start_link. Those who don’t want code reloading don’t need to write their own modules, and can use Phoenix.Endpoint.start_link directly. This is IMO as simple and as straightforward as it gets, and it will be good enough for most cases.

I’m not saying that MyEndpoint is the best idea. Perhaps, as I said in the previous message, passing a late-binding configuration module as an (optional) parameter would work better. Or perhaps a callback such as the one you propose would work better. In fact, maybe the callback you propose is good enough for all cases, as long as that callback is always invoked instead of being configured through config.exs.

Either way, one deficiency of the current design and the new proposal is the lack of support for dynamic endpoints. Can I dynamically spin off endpoints or Ecto repos without needing to define a dedicated module? Currently the answer is no (unless I compile a module at runtime), and this limitation remains with the new proposal. It’s arguably a less likely case, but IMO so is the case of “I want to change the listening port and restart as few processes as possible” :slight_smile: I think that we could support both cases, but not with the current proposal.

2 Likes

I am afraid that if you look at problems in isolation, then you can come up with specific solutions to those problems, but once you look at all problems together, the individual solutions fall apart or become considerably harder to implement.

For example, adding Phoenix.Endpoint.start_link goes against the need for compile-time configuration and promotes bad practices in terms of code upgrades, so there is little interest in going in this direction. Once we add the fact that 99% of Phoenix applications rely on multiple environments, reading from the application environment as a default is the best choice. The cost of having to write to the application environment when running tests is a small indirection compared to the benefits.

The dynamic endpoint discussion, though, is somewhat orthogonal to this, and I agree with you. Both Ecto and Phoenix should allow starting the endpoint multiple times, especially after we allow dynamic configuration to happen in init. It is mostly a matter of removing references based on the module, which is static, and making them based on the name. Ideally we will continue to store the defaults in the app env, with dynamic values in init.
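
As a purely hypothetical sketch of what name-based references could enable (this API does not exist as written; the option names are made up):

# Two instances of the same repo module, registered under different names
# and configured at runtime (hypothetical options):
children = [
  {MyApp.Repo, name: :repo_eu, url: System.get_env("EU_DATABASE_URL")},
  {MyApp.Repo, name: :repo_us, url: System.get_env("US_DATABASE_URL")}
]

Supervisor.start_link(children, strategy: :one_for_one)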

1 Like

This is why I was asking about compile-time options. Is there something that really requires compile-time configuration? And if there is, does it necessarily dictate how we deal with all the other parameters?

I never argued against that in the end-user application. The Phoenix project generator could still generate code along the lines of Phoenix.Endpoint.start_link(Application.fetch_env!(:my_app, :endpoint_config)).

We’re still retaining the same set of features and properties for end-users, but Phoenix becomes more flexible than it currently is.

Another important benefit, as I initially mentioned, is that by doing this, Phoenix would promote what I believe is a good practice for libraries: take your options as parameters, not as config values under some magical key. The latter is more confusing (at least for me), and less flexible.

I fail to see those benefits :slight_smile: Arguably the only one is that the approach you propose makes it easier to reconfigure the port while restarting as few processes as possible. Such a feature could still be achieved with a pure Phoenix.Endpoint.start_link and a callback which is always invoked.

I guess after this discussion, I think that what actually bugs me is the fact that I need to configure whether the callback is invoked. If the callback was always invoked (and app env wasn’t mandated by the Phoenix lib), then I’d be fine with it.

So maybe something along the lines of Phoenix.Endpoint.start_link(callback_mod, arg) would also work. Again, the Phoenix project generator could generate the init/1 callback definition which would by default read options from app env. How do you feel about that idea?
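
Sketching what such a generated default might look like (hypothetical API following the naming in this post; the :my_app key is made up):

defmodule MyEndpoint do
  # Hypothetical callback module passed to Phoenix.Endpoint.start_link/2;
  # the generated default would simply read from the application environment.
  def init(_arg) do
    {:ok, Application.fetch_env!(:my_app, MyEndpoint)}
  end
end

Phoenix.Endpoint.start_link(MyEndpoint, [])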

If options always have to be provided through config.exs, it’s still going to be confusing. I really dislike that, and don’t see the need for it.

2 Likes

That’s exactly my point. If we are going to do that 99% of the time, I would rather do it by default.

You say Phoenix becomes more flexible by explicitly reading the environment, but it really doesn’t; it is exactly the same. If you disagree, can you explicitly tell me what you would be able to achieve by moving the configuration out that we can’t achieve today? You say it is more flexible, so please try to provide concrete examples.

Otherwise the only change I see is that it no longer requires writing to put_env in 1% of the cases (like tests and single-file projects) while forcing everyone else to read from the app environment explicitly. And just to clarify, it is not this change that makes dynamic endpoints possible. As I said in the previous message, dynamic endpoints are orthogonal to where you put the default configuration.

Except, as I have argued extensively in this thread, you don’t want to pass your options as parameters, but rather define or read them in init, especially if you rely on multiple environments and code reloading.

So from my perspective: the application environment is the best place for configuration given the features a Phoenix application requires. If your concern is about libraries copying what Phoenix does even when they don’t have the same requirements as Phoenix, then we should promote better education rather than have Phoenix stop using those features correctly. Otherwise we could make the same argument for meta-programming: Phoenix should not use meta-programming, even if it uses it correctly, because other developers will use it in cases where they don’t need to.

If you still think passing those options through start_link is the best way to go, we will have to agree to disagree.

Yes, unfortunately init/1 is already taken by Plug, so we had to resort to the callback approach. If you have better options in mind, we will be glad to hear them, but it needs to keep today’s behaviour of being invoked in the new process, with semantics similar to init. Promoting the passing of options through start_link is something we want to discourage for all of the reasons mentioned in this thread, even if you don’t seem to value those reasons the same way that we do (which is fine).

1 Like

Sorry, one last amendment:

To make it explicit, the benefits are:

  • We push everyone to work with multiple environments (without forcing all applications to read from the application environment)

  • We provide a unified place where we can specify configuration that is required at compile-time and configuration that is needed only at runtime

  • We promote good practices when it comes to hot code reloading

Because I feel that we are walking in circles: besides providing examples of where you think your approach is more flexible, please try to provide some examples of how you would configure a Phoenix endpoint considering at least the first two bullets above: compile-time configuration and multiple environments. Your solution of reading configuration in init works only for the latter. You can find a list of compile-time options in the Phoenix.Endpoint docs.

1 Like