Distributed Elixir in Amazon ECS

Has anyone successfully set up distributed Elixir on ECS? I am tackling this right now.

I am especially interested in how to connect the nodes. I was thinking it might make sense to write a custom libcluster strategy that fetches the current IPs registered with the Application Load Balancer. ECS also offers DNS-based service discovery (it creates an A record for each node), so that might also be a solution, and libcluster already seems to ship a DNS strategy.
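Roughly, I imagine a custom strategy would just be a GenServer implementing libcluster's Cluster.Strategy behaviour. A rough skeleton of what I have in mind (the module name, the "myapp" basename, and fetch_target_ips/0 are placeholders I made up; the real work would be querying the target group for its registered private IPs):

defmodule MyApp.Strategy.ALB do
  use GenServer
  @behaviour Cluster.Strategy

  alias Cluster.Strategy.State

  @impl Cluster.Strategy
  def start_link([%State{} = state]), do: GenServer.start_link(__MODULE__, state)

  @impl GenServer
  def init(state) do
    send(self(), :poll)
    {:ok, state}
  end

  @impl GenServer
  def handle_info(:poll, state) do
    # Placeholder: fetch the private IPs registered with the load balancer
    # (e.g. via the AWS API) and turn them into node names.
    nodes = for ip <- fetch_target_ips(), do: :"myapp@#{ip}"

    Cluster.Strategy.connect_nodes(state.topology, state.connect, state.list_nodes, nodes)

    Process.send_after(self(), :poll, 10_000)
    {:noreply, state}
  end

  defp fetch_target_ips, do: []
end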

As far as I know, these are the things that need to be addressed:

  • Setting the node name to <APPNAME>@<PRIVATE IP>
  • Exposing the Erlang Port Mapper Daemon (EPMD) port (4369) in the VPC and the Docker image
  • Exposing the inter-node Erlang distribution ports (configurable via inet_dist_listen_min and inet_dist_listen_max) in the VPC and the Docker image
  • Service discovery (setting up the connections between nodes) using a libcluster strategy

Setting the node name to include the private IP can be done with curl http://169.254.169.254/latest/meta-data/local-ipv4; I am not sure yet how best to inject this as an environment variable.

Perhaps someone has already figured some of this out on ECS/EC2? Help is really appreciated! I am planning to document the results so it's easier for others to get this up and running. I already have some experience building a very small Docker image using releases and multi-stage builds.

6 Likes

You have hit on the major points. For the hostname, I can offer a concrete example. In the script that starts my application:

# Permit OS variable substitution for starting the VM
export REPLACE_OS_VARS=true

# Get the EC2 fully-qualified hostname for the node name
export PUBLIC_HOSTNAME=`curl -s http://169.254.169.254/latest/meta-data/public-hostname`

Then in rel/vm.args you can set the node name using the env var:

-name <%= release_name %>@${PUBLIC_HOSTNAME}
4 Likes

https://elixirforum.com/t/what-would-you-like-to-see-in-a-book-about-elixir-deployments-on-aws-ecs/

2 Likes

We have a few apps running on Fargate ECS. Basically, we build bare images containing only the Erlang release built with Distillery.
In the latest setup we use a simple sh script to get the internal IP:

# $1 is the release name; the node IP is the container's private IP,
# taken from the "src" field of the local route.
export NODE_NAME="$1"; \
  export NODE_IP; \
  NODE_IP=$(/sbin/ip route | awk '/scope/ { print $9 }'); \
  exec /opt/app/$1/bin/$1 foreground

and in vm.args we set:

-name ${NODE_NAME}@${NODE_IP}
-setcookie ${COOKIE}

We query Route 53 SRV records to discover the other nodes and connect to them over the internal network (no need to expose the EPMD port). We use the peerage library with a custom provider for that.

1 Like

Great! That is in line with what I wanted to do.

So you don't need to expose the ports in the VPC if it's internal communication? (AWS noob, I know.)

Also, do you expose the Erlang distribution ports in Docker? (I see you don't specify the port range in vm.args.)

Btw, ECS now supports DNS service discovery out of the box.

https://aws.amazon.com/ru/about-aws/whats-new/2018/03/introducing-service-discovery-for-amazon-ecs/

I just used https://github.com/bitwalker/libcluster/blob/master/lib/strategy/dns_poll.ex and it works well.

My vm.args:

-name <%= release_name %>@${PUBLIC_HOSTNAME}
-setcookie <%= release.profile.cookie %>
-kernel inet_dist_listen_min 9000
-kernel inet_dist_listen_max 9010

My start.sh:

#!/bin/sh

export REPLACE_OS_VARS=true
export PUBLIC_HOSTNAME=`curl http://169.254.170.2/v2/metadata | jq -r ".Containers[0].Networks[0].IPv4Addresses[0]"`

echo "Hostname: $PUBLIC_HOSTNAME"

REPLACE_OS_VARS=true /app/release/bin/start_server foreground

The DNS name can be taken from the service details page. Usually it is my-service-name.local. Use it as the query for the Cluster.Strategy.DNSPoll strategy.
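The libcluster side is just the standard DNSPoll topology, roughly something like this (the query and node_basename values are placeholders for your own service discovery name and node basename):

# Placeholder values: query is the ECS service discovery DNS name,
# node_basename is the part of the node name before the @.
config :libcluster,
  topologies: [
    ecs: [
      strategy: Cluster.Strategy.DNSPoll,
      config: [
        polling_interval: 5_000,
        query: "my-service-name.local",
        node_basename: "my_app"
      ]
    ]
  ]

The topologies then get passed to Cluster.Supervisor in the application's supervision tree as usual.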

11 Likes

Yep, that's exactly what we are using now as well. BTW, the Erlang distribution port can be a single port (set inet_dist_listen_min and inet_dist_listen_max to the same value).

1 Like

Hey guys, thanks for all the info, it helped me a lot in setting this up myself.

You’re all running one node per instance, with a fixed port mapping from container to host, right?

AWS doesn’t allow A records in the service registry unless the networkMode of the container is awsvpc (where each container has its own elastic network interface and IP address), I suppose because there could be multiple instances per host, running on different ports. It has to be SRV, which is fine.

The DNSPoll strategy only polls A records by default, so I’m curious about whether you guys had to use a custom resolver like I did, or if there’s a more straightforward way?

I use Fargate, so it uses the A record for service discovery.

1 Like

Our tasks use awsvpc network mode. It has some limits depending on the EC2 instance type (see https://docs.aws.amazon.com/en_us/AWSEC2/latest/UserGuide/using-eni.html). With Fargate you don't need to think about this limitation.

2 Likes

We use a custom Peerage.Provider implementation like this:

defmodule AwsServiceDiscovery do
  @behaviour Peerage.Provider

  @impl true
  def poll do
    dns_name = Application.fetch_env!(:peerage, :dns_name)
    app_name = Application.fetch_env!(:peerage, :app_name)

    # Look up the SRV records for the service discovery name, then resolve
    # each target to its A record and build node names for Peerage.
    :inet_res.lookup(String.to_charlist(dns_name), :in, :srv)
    |> Enum.flat_map(fn {_priority, _weight, _port, srv_dns_name} ->
      :inet_res.lookup(srv_dns_name, :in, :a)
      |> Enum.map(fn ip ->
        :"#{app_name}@#{:inet.ntoa(ip) |> to_string}"
      end)
    end)
  end
end
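The config side just points peerage at the provider, roughly like this (the values are placeholders):

# Placeholder values: dns_name is the SRV name from ECS service discovery,
# app_name is the node basename used in vm.args.
config :peerage,
  via: AwsServiceDiscovery,
  dns_name: "my-service.local",
  app_name: "my_app"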
2 Likes

I am using bridge network mode, with the custom Peerage provider for SRV records.

 NAME              RESULT OF ATTEMPT
 api@172.31.13.73  true
 api@172.31.6.233  false

 LIVE NODES
 api@172.31.13.73  (self)

Not really sure why it won’t connect. Any help is appreciated.

I struggled with this for a week and eventually got it working. The information in this thread is enough to point in the right direction, but there is a big gap between the right direction and a successful deployment.

I’ve written up a fresh guide on deploying distributed Elixir to ECS Fargate tasks with Service Discovery and libcluster’s DNS Polling strategy. Hope it helps save someone some time!

4 Likes

Thanks for this write-up. However, note that Route 53 multi-value answer DNS will only ever return a maximum of 8 records. This limits the number of nodes in a cluster using this strategy to 8. With any more than that, the results returned by the DNS query will vary each time, causing libcluster to think nodes are coming and going, and you will see lots of disconnects/reconnects.

There is also a possible issue with the task IPs getting removed from the private namespace zone only after a task has transitioned to the STOPPED state (which seems like a major design flaw), not giving a task being shut down a chance to drain any pending requests and account for the A record TTL. I have not fully investigated this, however.

So after some investigation, it does look like the A records get removed immediately when the stop request is issued. It also appears that the application does not get a SIGTERM for quite a while after that; the container remains active all the way through to the end of the DEPROVISIONING state.

The limitation of 8 A records stands, however. With more nodes than that, the cluster will experience lots of random disconnects and reconnects. It is possible that things may eventually stabilize (it seems like they should in theory); I am still looking into this.