Exploring Elixir E7: Automatic Clustering

In this week’s episode we look at the basics of how clustering Elixir nodes works and how to automate cluster creation using libcluster (the episode uses the UDP Gossip strategy, but libcluster also supports Kubernetes, EC2 and DNS service discovery), as well as how to replace epmd itself.
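For anyone who wants to see the manual step that libcluster automates, here is a minimal sketch using only the standard `Node` module. The module name `ManualClusterSketch` and the host name `myhost` are made up for illustration; they are not taken from the episode or its repo.

```elixir
defmodule ManualClusterSketch do
  # Hypothetical helper: try to connect to each named node and
  # pair the node name with the result of Node.connect/1
  # (true, false, or :ignored when the local node is not distributed).
  def connect_all(nodes) do
    Enum.map(nodes, fn node -> {node, Node.connect(node)} end)
  end
end

# In a shell started with `iex --sname node-1 -S mix`, with a second
# shell started as node-2 on the same machine ("myhost" is a placeholder
# for your short host name):
#
#   ManualClusterSketch.connect_all([:"node-2@myhost"])
#   Node.list()
```

With libcluster the node names do not have to be written down like this; the chosen strategy discovers them and makes the connections for you.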

An Extra episode will be available tomorrow or the next day as well covering some updates to last week’s episode, some musings on the current limitations of BEAM clustering (security, scalability), and a look at how the show’s git repo is being organized.

I hope you all enjoy! :slight_smile:

35 Likes

Great episode @aseigo!

2 Likes

Thanks, just happy when people get something from them as well! :slight_smile:

1 Like

This is a great quick screencast, thank you.

Exactly what I needed. Hopefully I will find the time soon to add libcluster to my tokumei project generator; then the quickstart can be even smoother.

2 Likes

@aseigo, thanks for the screencasts and all the expertise you’ve shared in various communities.

  • A true KDE Software and Elixir fan
4 Likes

Great work Aaron, your screencast fu is powerful. Keep up the great work!

1 Like

Amazing screencast! However, I cloned the repo and can’t get it to work on my development machine.
Starting libcluster on the first node isn’t an issue and the heartbeat starts to appear, but when I try to run it on a second node it raises an error:

{:failed_to_start_child, Cluster.Strategy.Gossip, {{:badmatch, {:error, :eaddrinuse}}, [...] ...}

It seems some process is already bound to the required address/port, so libcluster can’t start.

1 Like

Hm… which operating system, and what are the exact steps you are taking that lead to this?

Or … do you have some other process bound to port 45892 (it would probably need to be a UDP socket …)? You can change the default port used by libcluster by editing config/libcluster.exs and adding port: <some other number> to the config of the exploring_elixir topology section.

1 Like

Thanks for replying!

Running on:
MacOS Sierra 10.12.6
erlang/OTP 20
elixir 1.5.2
libcluster 2.2.3

I just cloned the repo and installed deps (nothing changed). After successfully starting the first node with iex --sname node-1 -S mix and ExploringElixir.AutoCluster.start(), running lsof -i :45892 shows this:

COMMAND      PID      USER   FD   TYPE   DEVICE    SIZE/OFF   NODE NAME
beam.smp     12427     me    35u  IPv4   0x1879...    0t0     UDP *:45892

Trying to start the second node with iex --sname node-2 -S mix and ExploringElixir.AutoCluster.start() shows the error I mentioned previously:

** (Mix) Could not start application libcluster: Cluster.App.start(:normal, []) returned an error: shutdown: failed to start child: Cluster.Strategy.Gossip
** (EXIT) an exception was raised:
    ** (MatchError) no match of right hand side value: {:error, :eaddrinuse}
        (libcluster) lib/strategy/gossip.ex:56: Cluster.Strategy.Gossip.init/1
        (stdlib) gen_server.erl:365: :gen_server.init_it/2
        (stdlib) gen_server.erl:333: :gen_server.init_it/6
        (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3

Changing port: to <some_other_number> didn’t fix the issue. I even tried setting the port to 0 so that a random available one is picked; that way the second node starts successfully, but it won’t be able to connect automatically to the first or any other node.

This is my config/libcluster.exs:

config :libcluster,
  topologies: [
    exploring_elixir: [
      strategy: Cluster.Strategy.Gossip,
      config: [
        port: 45892 # tried other ports, not solving it    
      ]
      # everything else...
    ]
  ]

Maybe I’m just not getting the

if_addr: {0,0,0,0},
multicast_addr: {230,1,1,251}

configuration part, or some other setup is required to run multiple nodes in dev mode.

As far as I know, a UDP socket can’t share the same address/port with another process, or be opened a second time:

iex(1)> :gen_udp.open(8789)
{:ok, #Port<0.1174>}
# on the same or another shell ->
iex(2)> :gen_udp.open(8789)
{:error, :eaddrinuse}

but in your screencast it seems to work fine, so I’m confused haha.

1 Like

UDP ports can be opened by multiple processes if in broadcast mode … but this requires binding to an interface that supports UDP broadcasting. I don’t have a Mac to test this on, but it apparently can require a bit of config … see: https://github.com/gossiperl/gossiperl/wiki/multicast-overlays#setting-up-multicast-on-os-x

Would be interested to know if that resolves it for you …
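To make the port-sharing idea concrete, here is a rough sketch of opening a multicast UDP socket that asks the OS to allow a shared port. The module name `GossipSocketSketch` is made up, this is not libcluster's actual code, and the raw SO_REUSEPORT option for macOS is an assumption that has not been tested on the setup reported above. The addresses are the if_addr and multicast_addr values quoted earlier in the thread.

```elixir
defmodule GossipSocketSketch do
  # Hypothetical helper: build :gen_udp options for a multicast listener.
  def options(multicast_addr \\ {230, 1, 1, 251}, if_addr \\ {0, 0, 0, 0}) do
    [
      :binary,
      active: true,
      ip: if_addr,
      # SO_REUSEADDR: lets several sockets bind the same port on many systems
      reuseaddr: true,
      multicast_loop: true,
      # Join the multicast group on the given interface
      add_membership: {multicast_addr, if_addr}
    ] ++ reuse_port()
  end

  # Assumption: on macOS (Darwin) SO_REUSEPORT is also needed.
  # It is passed as a raw socket option: SOL_SOCKET = 0xFFFF, SO_REUSEPORT = 0x0200.
  defp reuse_port do
    case :os.type() do
      {:unix, :darwin} -> [{:raw, 0xFFFF, 0x0200, <<1::native-integer-size(32)>>}]
      _ -> []
    end
  end

  # Open the socket on the given port with the options above.
  def open(port), do: :gen_udp.open(port, options())
end
```

Calling `GossipSocketSketch.open(45892)` in two separate iex shells is a quick way to check whether your OS lets both sockets bind; if the second call still returns `{:error, :eaddrinuse}`, the limitation is at the OS level rather than in the libcluster config.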

1 Like

Unfortunately that didn’t help.
According to this:


since macOS 10.10.5, two unbound (wildcard) UDP sockets cannot share the same port anymore.
So, for my project I’ll use the hardcoded epmd strategy for development only; I prefer spending more time on my app’s issues than dealing with my computer’s configuration.
Thanks a lot for your help and the amazing screencast! Keep them coming! :slight_smile:
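For reference, a development-only fallback along these lines might look like the sketch below. It assumes libcluster's Cluster.Strategy.Epmd with a hosts list; the node names are placeholders ("myhost" stands for your short host name), not values from the repo.

```elixir
# config/libcluster.exs (development only)
config :libcluster,
  topologies: [
    exploring_elixir: [
      strategy: Cluster.Strategy.Epmd,
      config: [
        # Placeholder node names: list every node you start by hand
        hosts: [:"node-1@myhost", :"node-2@myhost"]
      ]
    ]
  ]
```

The trade-off is that every node name has to be known up front, which is fine for two or three local shells but does not scale to discovered clusters.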

1 Like

Hey I was wondering if anyone found a solution to this. I’ve just wasted a full day on this too so I’ll probably use the epmd strategy too for dev. Thanks for sharing this though, I would have still been scratching my head!

1 Like

Use Kubernetes instead? Or DNS, or AWS … the issue with EPMD is you have to distribute the cluster configuration, and so imho any solution that allows you to make that information discoverable, rather than explicitly distributed pre-boot as configuration, is a bonus.

I would also expect it to be pretty straightforward to add other simple backends to libcluster for things like mDNS / zeroconf …

For playing around with clusters it is ok, if still annoying, to do it “by hand” with epmd configuration … but for production I just can’t imagine it anymore :slight_smile:

1 Like

@aseigo Your videos are top-notch. Will you create more screencasts in the future? Maybe videos on the basics of OTP and GenStage!

2 Likes

Great post, thanks for that. Is there any way to clone the code of this example?

Thanks in advance.