How to start a cluster of nodes with Phoenix and Libcluster

samtechie · March 21, 2021, 6:25am

I am totally new to libcluster, but I want to start two Phoenix app instances on my local machine in dev mode. This is what I have done so far, based on the libcluster documentation.

This is my application.ex.

def start(_type, _args) do
  topologies = Application.get_env(:libcluster, :topologies) || []

  children = [
    # Start libcluster
    {Cluster.Supervisor, [topologies, [name: MyApp.ClusterSupervisor]]},
    # ..other children..
  ]

  Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
end

And my config/dev.exs

config :libcluster,
  topologies: [
    example: [
      # The selected clustering strategy. Required.
      strategy: Cluster.Strategy.Epmd,
      # Configuration for the provided strategy. Optional.
      config: [hosts: [:"a@127.0.0.1", :"b@127.0.0.1"]],
      # The function to use for connecting nodes. The node
      # name will be appended to the argument list. Optional
      connect: {:net_kernel, :connect_node, []},
      # The function to use for disconnecting nodes. The node
      # name will be appended to the argument list. Optional
      disconnect: {:erlang, :disconnect_node, []},
      # The function to use for listing nodes.
      # This function must return a list of node names. Optional
      list_nodes: {:erlang, :nodes, [:connected]}
    ]
  ]

My first question is: what is the example key above? Can it be renamed to anything? Should it be the application name? Secondly, how would I start this in my shell? I tried

PORT=4000 elixir --name a@127.0.0.1 -S mix phx.server & PORT=4001 elixir --name b@127.0.0.1 -S mix phx.server

I just got a warning message, unable to connect to :"b@127.0.0.1", and the Phoenix app started as normal. I have no idea how to do this properly. Any ideas or guidance are welcome.

jarimatti · March 21, 2021, 7:42am

Quick note on the warning message: it is normal during startup, if the other node is not up yet. You can check if the nodes see each other with e.g. Node.list(): it shows the nodes this node is connected to.

When the nodes are connected, node b should see node a:

iex(b@127.0.0.1)3> Node.list()
[:"a@127.0.0.1"]

I find it helpful to start two nodes with interactive shells while developing. In this case you could do PORT=4000 iex --name a@127.0.0.1 -S mix phx.server in one terminal and PORT=4001 iex --name b@127.0.0.1 -S mix phx.server in another.
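If Node.list() stays empty, it can help to probe the other node explicitly. The following is only a minimal sketch: ClusterCheck and status/1 are hypothetical names made up for illustration, while Node.ping/1 and Node.list/0 are standard library functions. Paste it into one of the iex sessions.

```elixir
defmodule ClusterCheck do
  # Hypothetical helper, not part of libcluster or Phoenix.
  # Node.ping/1 returns :pong when the peer is reachable and :pang otherwise.
  def status(peer) do
    case Node.ping(peer) do
      :pong -> {:connected, Node.list()}
      :pang -> {:unreachable, peer}
    end
  end
end
```

On node a you would call ClusterCheck.status(:"b@127.0.0.1"). An :unreachable result usually means node b is not running yet, or the two nodes were started with different names or cookies.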

jarimatti · March 21, 2021, 8:14am

The docs are not clear on this one, so take this with a grain of salt: example is a name for the topology. You can have multiple topologies in a single application, with the same or different strategies. I’m not sure if this is very useful in practice, but it’s possible. I usually put the application name there.

The code is in the Cluster.Supervisor: libcluster/lib/supervisor.ex at a07ca5605e5cfad3da880ebda1ab6dbd5f635539 · bitwalker/libcluster · GitHub

By default all nodes in an Elixir/Erlang cluster are connected, so when there are multiple topologies it’s still a single cluster.
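To illustrate the point about topology names, here is a sketch of a config with two topologies. The keys local_epmd and lan_gossip are arbitrary labels invented for this example; Cluster.Strategy.Epmd and Cluster.Strategy.Gossip are strategies that ship with libcluster. Treat it as an assumption-laden example rather than a recommended setup.

```elixir
import Config

config :libcluster,
  topologies: [
    # Hypothetical topology name; the key is only a label.
    local_epmd: [
      strategy: Cluster.Strategy.Epmd,
      config: [hosts: [:"a@127.0.0.1", :"b@127.0.0.1"]]
    ],
    # Second hypothetical topology, using a different strategy.
    lan_gossip: [
      strategy: Cluster.Strategy.Gossip
    ]
  ]
```

Both topologies feed the same Erlang cluster, so nodes found by either strategy end up connected to each other.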

samtechie · March 21, 2021, 8:26am

Thanks. I tried PORT=4000 iex --name a@127.0.0.1 -S mix phx.server, but I am now getting this error:

[warn] [libcluster:MyApp] unable to connect to :"a@127.0.0.1"
[error] Failed to start Ranch listener MyApp.Endpoint.HTTP in :ranch_tcp:listen([cacerts: :..., key: :..., cert: :..., port: 4000]) for reason :eaddrinuse (address already in use)

I am not sure what is wrong, because no other application is using the port.

antoine · March 21, 2021, 8:35am

In the command from your first post, you used a single &, so the first instance is running in the background and using port 4000.

Try typing fg in your terminal; it will bring the corresponding instance back to the foreground.

crova · March 21, 2021, 8:47am

In case you ever want to fiddle with multiple topologies and don’t want the default behavior of all nodes connecting to everyone else automatically, you’ll want to pass --erl "-connect_all false" when starting your nodes.

Regarding the port being used, make sure that there is no hardcoded “4000” in any of your config. Otherwise it won’t grab the value you’re passing from the command line.
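For reference, a common way to let the endpoint pick up PORT is to read it in config/dev.exs. This is a sketch under assumptions: :my_app is a placeholder OTP app name, and MyApp.Endpoint is taken from the error log earlier in the thread, so adjust both to your project.

```elixir
import Config

# :my_app is an assumed app name; MyApp.Endpoint comes from the log output above.
# Falls back to 4000 when PORT is not set in the environment.
config :my_app, MyApp.Endpoint,
  http: [port: String.to_integer(System.get_env("PORT") || "4000")]
```

With this in place, PORT=4001 iex --name b@127.0.0.1 -S mix phx.server should bind to 4001 instead of the default.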

egze · March 21, 2021, 10:19am

Look here for an example: Demo of distributed Elixir with libcluster and DNS

samtechie · March 24, 2021, 6:28am

Thanks, it turns out the problem was the hardcoded “4000” in the config.