Releases and Dynamic Nodes

amnu3387 · June 20, 2019, 12:14pm

So I’ve been trying the new Elixir 1.9 releases in order to move away from Distillery & edeliver (nothing wrong with either of those, though).
One thing I’m now struggling with is defining the nodes to connect to. In Erlang, the usual approach seems to be a config file, which is static in nature. I’ve seen some examples that start the shell and pass -config dir/name to it to load that config file.

In the Elixir release wrapper, there seem to be some ways to hook into the boot process just before it starts the app. So I was imagining I could do something like:

defmodule TestProvider do
  @behaviour Config.Provider
  
  def init(_), do: nil
  def load(config, nil) do
    
    :inets.start()
    {:ok, {_, _, resp}} = :httpc.request('http://www.erlang.org')
    
    Config.Reader.merge(
      config,
      web: [
        some_value: resp
      ]
    )
  end
end

This does set some_value to resp in the config for the web app, and it works correctly. But what format should I use to set something like the following Erlang term?

[
  {kernel, [
      {distributed, [{web, 5000, [a@host, {b@host, c@host}]}]},
      {sync_nodes_mandatory, [b@host, c@host]},
      {sync_nodes_timeout, 30000}
    ]
  }
].

(shamelessly copied from learnyousomeerlang.com)

Or if that doesn’t work / isn’t possible, how could I write this to a file to be used by the release on the boot after the config dry-run? Where should I write this file to?
Can I then pass -config some_path to the app boot through env.sh and set there

export ELIXIR_ERL_OPTIONS="-config some_path"

?
Thanks
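
On the "write this to a file" part of the question, one possible approach is sketched below. The module name ErlConfigSketch, its function names and the file name are hypothetical, and where the release should look for the file is left open. It builds the Erlang snippet above as Elixir terms and formats them with ~p plus a trailing period, which is the period-terminated term syntax that Erlang config files use and that :file.consult/1 can read back.

```elixir
defmodule ErlConfigSketch do
  # Hypothetical helper: the kernel settings from the Erlang snippet, as Elixir terms.
  def kernel_config do
    [
      kernel: [
        distributed: [{:web, 5000, [:a@host, {:b@host, :c@host}]}],
        sync_nodes_mandatory: [:b@host, :c@host],
        sync_nodes_timeout: 30_000
      ]
    ]
  end

  # Writes the term in Erlang syntax, terminated by a period and a newline.
  def write!(path, config) do
    File.write!(path, :io_lib.format("~p.~n", [config]))
  end
end

# Round trip through a temporary file (the file name is an assumption).
path = Path.join(System.tmp_dir!(), "sync_nodes.config")
ErlConfigSketch.write!(path, ErlConfigSketch.kernel_config())
{:ok, [read_back]} = :file.consult(path)
IO.inspect(read_back == ErlConfigSketch.kernel_config())
```

Reading the file back with :file.consult/1 is only a sanity check that the written term parses as Erlang; it does not show that the release picks the file up.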

amnu3387 · June 20, 2019, 4:24pm

Ok, so this works:

Umbrella mix.exs

releases: [
  bundle: [
    applications: [
      web: :permanent
    ],
    include_executables_for: [:unix],
    config_providers: [
      {TestProvider, nil}
    ],
    start_distribution_during_config: true
  ]
]

(:inets also has to be added to :extra_applications in the mix.exs of the app being released)

test_provider.exs:

defmodule TestProvider do
  @behaviour Config.Provider
  
  def init(_), do: nil
  def load(config, nil) do
    
    :inets.start()
    {:ok, {_, _, etcwtv}} = :httpc.request('http://somewhere.com')
    other_nodes = extract_nodes(etcwtv) # somehow
    Config.Reader.merge(
      config,
      kernel: [
        distributed: [{:web, 5000, [Node.self() | other_nodes]}],
        sync_nodes_mandatory: other_nodes,
        sync_nodes_timeout: 30_000
      ]
    )
  end

  def extract_nodes(resp) do
      #....
  end
end
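
To see what the Config.Reader.merge/2 call in the provider does to settings that are already in the config, here is a small sketch that can be run on its own. The module name ConfigMergeSketch and the pre-existing values (inet_dist_listen_min: 9100, port: 4000) are made up for illustration. The point is that the merge is per application and per key, so existing kernel keys are kept alongside the new ones.

```elixir
defmodule ConfigMergeSketch do
  # Hypothetical helper: merges distribution settings for `other_nodes`
  # into an existing config, the same way the provider's load/2 does.
  def merge_kernel(config, other_nodes) do
    Config.Reader.merge(
      config,
      kernel: [
        sync_nodes_mandatory: other_nodes,
        sync_nodes_timeout: 30_000
      ]
    )
  end
end

# Illustrative existing config; these values are assumptions.
existing = [kernel: [inet_dist_listen_min: 9100], web: [port: 4000]]
merged = ConfigMergeSketch.merge_kernel(existing, [:b@host, :c@host])
IO.inspect(merged[:kernel])
```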

So they connect (I have another shell running for test purposes). The reason I wanted them to connect on boot, instead of connecting them manually, is that I have a few gen_servers that are globally registered, and I wanted to avoid re-registering them and having to resolve the naming conflict. This is in an app that is a dependency of the main app that’s running (it’s in an umbrella, if that matters).

It has the following start_link/1 and is started from that dependency’s supervision tree:

def start_link(queue) do
    Logger.info("Starting Blind.Queue GenServer")
    case GenServer.whereis(@gen_ref_val) do
      nil ->
        {:ok, pid} = GenServer.start_link(__MODULE__, queue)
        case :global.re_register_name(__MODULE__, pid, &solve_conflict/3) do
          :yes ->
            {:ok, pid}
          :no ->
            :ignore
        end
      _ ->
        :ignore
    end
end

But when the release boots, it always re-registers (meaning it couldn’t find the gen_server with that name) and the solve_conflict/3 fun always runs.

Am I wrong in thinking that forcing the nodes to boot up distributed would ensure that the global name table is set up and shared between the synced nodes before the app and its dependencies are started?

Does this mean I need to redesign the way the app is booting? How can I ensure that dependencies don’t start their supervision trees before I set up the nodes?

Any pointers are very welcome.

amnu3387 · June 21, 2019, 11:42am

Continuing the monologue in case someone stumbles on this in the future (and why not).
So indeed, just setting up the node connections on boot isn’t enough, as the global name table isn’t shared immediately. But calling, for instance, :global.sync() (or any other action that forces the nodes to communicate) in the supervisor, or just before those particular servers start, does sync them, and then the whereis works as expected.
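
A minimal sketch of the "sync first, then look up" idea. The module and function names are hypothetical; only :global.sync/0 and :global.whereis_name/1 are the real calls.

```elixir
defmodule GlobalLookupSketch do
  # Hypothetical helper: syncs the global name table with the known nodes,
  # then looks the name up. Returns a pid, :undefined, or {:error, reason}.
  def synced_whereis(name) do
    case :global.sync() do
      :ok -> :global.whereis_name(name)
      {:error, reason} -> {:error, reason}
    end
  end
end

# Local usage example; with connected nodes the sync step is what matters.
{:ok, pid} = Agent.start_link(fn -> :queue.new() end)
:yes = :global.register_name(:sketch_queue, pid)
IO.inspect(GlobalLookupSketch.synced_whereis(:sketch_queue) == pid)
```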

Nonetheless, after toying with it (which was educational), I figured the best way is to not start those applications by default (since I’m not setting up takeovers or anything like that), and instead add another “application” whose sole responsibility is to connect to the nodes and sync the tables, and only afterwards start the real applications.

To do that, you need to specify in the release settings that you don’t want them to start (by default they’re set to :permanent), and you need to do it for every application in the release that has a supervision tree, dependencies included. In my case it’s an umbrella, but it should be the same in a regular project.

releases: [
  bundle: [
    applications: [
      ae_bootstrapper: :permanent,
      web: :load,
      cache_server: :load,
      database: :load
    ],
    include_executables_for: [:unix]
  ]
]

So I added ae_bootstrapper as an application in the release, set to :permanent. The app being released is web, while cache_server and database are dependencies of web; since the global processes live in cache_server, that app needs to be set to :load as well so that its supervision tree isn’t started on boot.
Then the bootstrapper’s supervision tree starts a GenServer that just connects the nodes, syncs, and calls ensure_all_started for :web with :permanent, and then the whole thing starts and everything works as expected.
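
A rough sketch of what such a bootstrapper GenServer could look like. The module name AeBootstrapper.Connector, the :nodes and :apps options and the use of handle_continue/2 are all assumptions for illustration, not the actual code from this app; the node list could come from anywhere.

```elixir
defmodule AeBootstrapper.Connector do
  use GenServer

  # opts (hypothetical): nodes: [node()], apps: [atom()]
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    state = %{
      nodes: Keyword.get(opts, :nodes, []),
      apps: Keyword.get(opts, :apps, [:web])
    }

    # Defer the real work so init/1 returns to the supervisor right away.
    {:ok, state, {:continue, :bootstrap}}
  end

  @impl true
  def handle_continue(:bootstrap, state) do
    # 1. Connect to the other nodes.
    Enum.each(state.nodes, &Node.connect/1)

    # 2. Sync the global name table with them.
    :ok = :global.sync()

    # 3. Only now start the applications that were set to :load in the release.
    Enum.each(state.apps, fn app ->
      {:ok, _started} = Application.ensure_all_started(app, :permanent)
    end)

    {:noreply, state}
  end
end
```

In the bootstrapper’s supervisor this would be listed as a child, e.g. {AeBootstrapper.Connector, nodes: other_nodes, apps: [:web]}, where other_nodes is whatever list of node names you have. Doing the work in handle_continue/2 rather than in init/1 is just one option; it keeps the supervisor from blocking while the other applications start.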