Distributed Elixir in Amazon ECS

docker
deployment
#1

Has anyone successfully set up distributed Elixir on ECS? I am tackling this right now.

I am especially interested in how to connect the nodes. I was thinking it might make sense to write a custom libcluster strategy that fetches the current IPs registered with the Application Load Balancer. ECS also offers service discovery using DNS, which creates an A record for each node, so that might be another solution; there seems to be a DNS strategy in libcluster already.

As far as I know these are the things to be addressed:

  • Setting the node name to: <APPNAME>@<PRIVATE IP>
  • Exposing the Port Mapper Daemon (EPMD) port (4369) in the VPC and the Docker image
  • Exposing the Erlang distribution ports (configurable using inet_dist_listen_min, inet_dist_listen_max) in the VPC and the Docker image
  • Service discovery (setting up communication between nodes) using a libcluster strategy

The private IP for the node name can be fetched with curl http://169.254.169.254/latest/meta-data/local-ipv4, but I am not sure yet how best to inject it as an env variable.
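One option is to resolve the address in the container's entrypoint script and export it before the VM starts. The following is a minimal sketch, not a tested setup: the helper names is_ipv4 and export_node_ip and the NODE_IP variable are made up for illustration, and it assumes curl is in the image and that the metadata endpoint answers plain (IMDSv1-style) requests from inside the container.

```shell
#!/bin/sh
# Sketch only: helper names and NODE_IP are illustrative, not from the thread.

# Succeeds only if $1 looks like a dotted-quad IPv4 address.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Fetches the private IP from the EC2 instance metadata endpoint and
# exports it as NODE_IP. Returns non-zero if the lookup fails or the
# response is not an IPv4 address (for example an error page).
export_node_ip() {
  ip=$(curl -s --max-time 2 http://169.254.169.254/latest/meta-data/local-ipv4) || return 1
  is_ipv4 "$ip" || return 1
  NODE_IP=$ip
  export NODE_IP
}
```

An entrypoint could then run `export_node_ip || exit 1` and `exec` the release's start command, so the variable is visible to the VM when it reads vm.args.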

Perhaps someone has already figured some of this out on ECS/EC2? Help really appreciated! I am planning to document the results so it’s easier for others to get this up and running. I already have some experience building a very small Docker image using releases and multi-stage builds.

2 Likes

#2

You have hit on the major points. For the hostname, I can offer a concrete example. In the script that starts my application:

# Permit OS variable substitution for starting the VM
export REPLACE_OS_VARS=true

# Get the EC2 fully-qualified hostname for the node name
export PUBLIC_HOSTNAME=`curl -s http://169.254.169.254/latest/meta-data/public-hostname`

Then in rel/vm.args you can set the node name using the env var:

-name <%= release_name %>@${PUBLIC_HOSTNAME}

3 Likes

#3

https://elixirforum.com/t/what-would-you-like-to-see-in-a-book-about-elixir-deployments-on-aws-ecs/

1 Like

#4

We have a few apps running on ECS Fargate. Basically, we build bare images containing only the Erlang release built with Distillery.
In our latest setup we use a simple sh script to get the internal IP:

export NODE_NAME="$1"; \
  export NODE_IP; \
  NODE_IP=$(/sbin/ip route|awk '/scope/ { print $9 }'); \
  exec /opt/app/$1/bin/$1 foreground

and in vm.args we set

-name ${NODE_NAME}@${NODE_IP}
-setcookie ${COOKIE}

We query Route 53 SRV records to discover other nodes and connect them over the internal network (no need to expose EPMD ports). We use the Peerage library with a custom strategy for that.
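The custom strategy itself is not shown in this post. Purely to illustrate what an SRV lookup returns, here is a small sketch that turns SRV answers into host:port pairs; srv_targets is a hypothetical helper, and the input format assumed is the "priority weight port target." line that `dig +short SRV <name>` prints.

```shell
#!/bin/sh
# Sketch only: srv_targets is a hypothetical helper, not part of Peerage.

# Reads SRV answers in the "priority weight port target." form and
# prints one "host:port" per line, dropping the trailing dot of the target.
srv_targets() {
  awk 'NF == 4 { sub(/\.$/, "", $4); print $4 ":" $3 }'
}
```

With a placeholder service name, `dig +short SRV myapp.local | srv_targets` would list the discovered targets; each target host still has to be resolved to an IP before it can be matched against name@ip node names.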

0 Likes

#5

Great! That is in line with what I wanted to do.

So you don’t need to expose the ports in the VPC if it’s internal communication? (AWS noob I know)

Also, do you expose the Erlang communication ports in Docker? (I see you don’t specify the port range in vm.args.)

0 Likes

#6

Btw, ECS now supports DNS service discovery out of the box.

https://aws.amazon.com/ru/about-aws/whats-new/2018/03/introducing-service-discovery-for-amazon-ecs/

I just used https://github.com/bitwalker/libcluster/blob/master/lib/strategy/dns_poll.ex and it works well.

My vm.args:

-name <%= release_name %>@${PUBLIC_HOSTNAME}
-setcookie <%= release.profile.cookie %>
-kernel inet_dist_listen_min 9000
-kernel inet_dist_listen_max 9010

My start.sh:

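#!/bin/sh

# Let the release substitute ${VAR} references in vm.args
export REPLACE_OS_VARS=true
# Task IP from the ECS task metadata endpoint (v2); despite the name, this holds an IP address
export PUBLIC_HOSTNAME=$(curl -s http://169.254.170.2/v2/metadata | jq -r ".Containers[0].Networks[0].IPv4Addresses[0]")

echo "Hostname: $PUBLIC_HOSTNAME"

/app/release/bin/start_server foreground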

The DNS name can be taken from the service details page. Usually it is my-service-name.local. Use it as the query for the Cluster.Strategy.DNSPoll strategy.
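As far as I understand DNSPoll, it resolves the query to a list of IPs and tries to connect to a node named after its node_basename option plus each IP, which is why -name in vm.args has to end in the task's IP. The sketch below only mimics that naming step for illustration; candidate_nodes is a hypothetical helper and myapp is a placeholder basename.

```shell
#!/bin/sh
# Sketch only: candidate_nodes is a hypothetical helper for illustration.

# Reads one IP address per line and prints the node name a DNS-polling
# strategy would try for each, in the form <basename>@<ip>.
candidate_nodes() {
  base=$1
  while IFS= read -r ip; do
    if [ -n "$ip" ]; then
      printf '%s@%s\n' "$base" "$ip"
    fi
  done
}
```

Running `dig +short my-service-name.local | candidate_nodes myapp` from inside a task is a quick way to compare the names the strategy would look for against what each node reports as its own name.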

3 Likes

#7

Yep, that’s exactly what we are using as well now. BTW, the Erlang communication port range can be narrowed to a single port.
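Pinning to a single port means giving inet_dist_listen_min and inet_dist_listen_max the same value. A tiny sketch that prints the two vm.args lines for a chosen port; dist_port_args is a hypothetical helper and 9000 is just the lower bound used earlier in the thread.

```shell
#!/bin/sh
# Sketch only: dist_port_args is a hypothetical helper for illustration.

# Prints vm.args lines that pin the distribution listener to one port
# by setting the lower and upper bound of the range to the same value.
dist_port_args() {
  printf '%s\n' "-kernel inet_dist_listen_min $1"
  printf '%s\n' "-kernel inet_dist_listen_max $1"
}
```

For example, `dist_port_args 9000` prints the two lines to put in vm.args; only that one port (plus 4369 for EPMD, if it is used) then needs to be open between tasks.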

1 Like

#8

Hey guys, thanks for all the info, it helped me a lot in setting this up myself.

You’re all running one node per instance, with a fixed port mapping from container to host, right?

AWS doesn’t allow A records in the service registry unless the networkMode of the container is awsvpc (where each container has its own elastic network interface and IP address), I suppose because there could be multiple instances per host, running on different ports. It has to be SRV, which is fine.

The DNSPoll strategy only polls A records by default, so I’m curious about whether you guys had to use a custom resolver like I did, or if there’s a more straightforward way?

0 Likes

#9

I use Fargate, so it uses the A record for service discovery.

0 Likes

#10

Our tasks use the awsvpc network mode. The number of network interfaces available, and so the number of such tasks per host, depends on the EC2 instance type (see https://docs.aws.amazon.com/en_us/AWSEC2/latest/UserGuide/using-eni.html ). With Fargate you don’t need to think about this limitation.

0 Likes