I deployed my Phoenix application using Docker and I'm not able to access the Elixir shell of the application. This is what it looks like:
$ bin/app_name remote
Could not connect to "app_name"
Is there any reason why this happens?
There are so many ways to deploy a Phoenix application using Docker that you'll have to give us more information about it.
Generally, the error says there's no connection: the remote shell either doesn't know how to reach, or cannot reach, the pod that is running the actual application. This is a networking and discovery configuration issue; the container in which you start the remote console has to know where to find the container running the Phoenix server…
If you are doing that via Kubernetes/GKE, please let me know. I have a blog post 80% finished describing how to do precisely the above, but I lacked the motivation to wrap it up.
@hubertlepicki I deployed the app with the standard Dockerfile generated by running mix phx.gen.release --docker, using Kamal. Here's the entire Dockerfile:
# Find eligible builder and runner images on Docker Hub. We use Ubuntu/Debian
# instead of Alpine to avoid DNS resolution issues in production.
#
# https://hub.docker.com/r/hexpm/elixir/tags?page=1&name=ubuntu
# https://hub.docker.com/_/ubuntu?tab=tags
#
# This file is based on these images:
#
# - https://hub.docker.com/r/hexpm/elixir/tags - for the build image
# - https://hub.docker.com/_/debian?tab=tags&page=1&name=bullseye-20240904-slim - for the release image
# - https://pkgs.org/ - resource for finding needed packages
# - Ex: hexpm/elixir:1.17.2-erlang-27.0.1-debian-bullseye-20240904-slim
#
ARG ELIXIR_VERSION=1.17.2
ARG OTP_VERSION=27.0.1
ARG DEBIAN_VERSION=bullseye-20240904-slim
ARG BUILDER_IMAGE="hexpm/elixir:${ELIXIR_VERSION}-erlang-${OTP_VERSION}-debian-${DEBIAN_VERSION}"
ARG RUNNER_IMAGE="debian:${DEBIAN_VERSION}"
FROM ${BUILDER_IMAGE} as builder
# install build dependencies
RUN apt-get update -y && apt-get install -y build-essential git \
&& apt-get clean && rm -f /var/lib/apt/lists/*_*
# prepare build dir
WORKDIR /app
# install hex + rebar
RUN mix local.hex --force && \
mix local.rebar --force
# set build ENV
ENV MIX_ENV="prod"
ENV ERL_FLAGS="+JPperf true"
# install mix dependencies
COPY mix.exs mix.lock ./
RUN mix deps.get --only $MIX_ENV
RUN mkdir config
# copy compile-time config files before we compile dependencies
# to ensure any relevant config change will trigger the dependencies
# to be re-compiled.
COPY config/config.exs config/${MIX_ENV}.exs config/
RUN mix deps.compile
COPY priv priv
COPY lib lib
COPY assets assets
# compile assets
RUN mix assets.deploy
# Compile the release
RUN mix compile
# Changes to config/runtime.exs don't require recompiling the code
COPY config/runtime.exs config/
COPY rel rel
RUN mix release
# start a new build stage so that the final image will only contain
# the compiled release and other runtime necessities
FROM ${RUNNER_IMAGE}
RUN apt-get update -y && \
apt-get install -y libstdc++6 openssl libncurses5 locales ca-certificates \
&& apt-get clean && rm -f /var/lib/apt/lists/*_*
# Set the locale
RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
WORKDIR "/app"
RUN chown nobody /app
# set runner ENV
ENV MIX_ENV="prod"
# Only copy the final release from the build stage
COPY --from=builder --chown=nobody:root /app/_build/${MIX_ENV}/rel/app_name ./
USER nobody
# If using an environment that doesn't automatically reap zombie processes, it is
# advised to add an init process such as tini via `apt-get install`
# above and adding an entrypoint. See https://github.com/krallin/tini for details
# ENTRYPOINT ["/tini", "--"]
EXPOSE 4000
CMD ["sh", "-c", "bin/app_name eval AppName.Release.migrate && bin/app_name start"]
I'm not sure why it's not working. The app is deployed and I'm able to access the container on my server using docker exec -it container-id /bin/bash. Within the container, I'm trying to access the Elixir shell using the command bin/app_name remote.
Oh, I don't know how Kamal does this stuff.
I cannot say for sure, but one of the problems might be how your node is named. Here you can take a look at a release config that works: Blame · rel/env.sh.eex · main · SSL MOON / SSL MOON · GitLab
I still haven't found the fix, but the issue seems to be occurring because Kamal sets the --hostname flag on the Docker container to your server IP followed by a hash.
I believe this causes the started IEx shell to be unable to connect to the server, since they end up on two different networks or something like that.
I will try to debug some more tomorrow and see if I can find a solution.
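In case it helps anyone debugging the same thing, this is roughly how you can check what hostname the container actually got (a sketch; the container name is a placeholder):
# print the hostname Docker assigned to the container
docker inspect --format '{{.Config.Hostname}}' container-id
# or, from inside the container
hostname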
Alright, that was quite the ride, but I think I finally figured it out.
Adding this to rel/env.sh.eex fixed it for me:
#!/bin/bash
export RELEASE_DISTRIBUTION=name
Note that I am still new to the Elixir/BEAM ecosystem so I might be wrong on some things. If something doesn’t make sense, please correct me.
The bin/<app-name> remote command connects to your running Phoenix app using a remote shell. As far as I understand, this starts a new BEAM node and connects to your existing Phoenix node using the magic of the BEAM. This is the same thing you would do to connect two BEAM nodes together in a distributed cluster; the only difference is that it is all happening on the same host.
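For the curious, as far as I can tell bin/app_name remote is roughly equivalent to starting a throwaway node and attaching it to the running one (a rough sketch; the node names and cookie are placeholders):
# roughly what `bin/app_name remote` does under the hood
iex --name rem-12345@my-host --cookie "$RELEASE_COOKIE" --remsh app_name@my-host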
The way the BEAM discovers how to connect to nodes is with epmd (Erlang Port Mapper Daemon). When a BEAM node starts, it registers itself with epmd using its name. By default, this is the name of your Phoenix app when using Phoenix releases. If you then want to connect remotely to the node, you can use one of two formats to identify it: name or sname.
sname is used by default. When using sname, you can just pass the name of the app to epmd and it will try to look for that node on your local machine. The format for node names is name@host, but when using sname, the @host portion is implicit since it is always the local host.
name, on the other hand, is used when you also want to connect to BEAM nodes on remote machines. It takes the format name@host, where @host is the domain name or IP address of the machine that holds the node you want to connect to.
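To make that concrete, here is a rough illustration of the two naming modes (the names are placeholders):
# short names: the host part is implicit and contains no dots
iex --sname app            # node becomes app@<short-hostname>
# long names: the host part is explicit (FQDN or IP)
iex --name app@127.0.0.1
# epmd can list what is registered on this machine
epmd -names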
Now, what I believe happens is that when you use sname, it seems to default to using everything before the first dot of whatever hostname is, and uses that as the implicit @host part. In most cases, this is fine. But if the hostname is set to an IP address, epmd will take what it thinks is the "top level domain", i.e. the first 1-3 digits of the IP address. So for example, if you have a hostname of 123.456.789, epmd will take 123 as the "top level domain" and use that as the implicit @host (ex: app@123).
The problem this causes is that when epmd tries to resolve the 123 host, it leads nowhere. Resolving number-only domains results in weird behaviour most of the time. And so epmd just can't find the nodes on localhost, since it tries to look for the node on a host that doesn't exist.
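A rough illustration of that truncation (the hostname and prompt are made up for the example):
# on a machine whose hostname is 123.456.789-hgjkagh
iex --sname app
# iex(app@123)1>    <- the host part is cut off at the first dot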
Now, Kamal sets the --hostname flag on the Docker container to the IP of the server/role that you specified in deploy.yml, followed by what seems to be a random hash, e.g. 123.456.789-hgjkagh. This causes the problem explained above and makes epmd unable to find your localhost nodes.
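You can see this from inside the running container (a sketch; the values are just what I'd expect, not copied from a real deployment):
# the hostname Docker assigned via --hostname
hostname              # e.g. 123.456.789-hgjkagh
# Docker also writes a matching entry into /etc/hosts for the container's IP
cat /etc/hosts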
So now, here is the solution: when RELEASE_DISTRIBUTION is set to name, the default @host (in Phoenix releases at least) will be the full hostname and not just the part before the first dot. This works because when you pass the --hostname flag to Docker, it adds an entry to /etc/hosts pointing that hostname to the IP of the Docker container on the network. This enables epmd to resolve that hostname to your current Docker container, and it is then able to find the node to connect to.
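With that env.sh.eex change in place, this is roughly how I'd verify it inside the container (a sketch; epmd ships with the bundled ERTS, so the path is an assumption about where it lands in the release):
# list what the running release registered with epmd
./erts-*/bin/epmd -names
# the remote shell should now attach to the running node
bin/app_name remote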
Hope this helps. If you know more about this topic and believe I made a wrong assumption, please let me know; I would be very interested to learn.
Isn't that exactly the same thing as what I pointed to?
Not really. This works when the --hostname flag doesn't start with numbers followed by a dot, i.e. when you don't set a hostname at all, or when the hostname contains both numbers and letters.
This would work: --hostname sslmoon;
This would also work: --hostname 123sslmoon;
But this wouldn't work: --hostname 123.sslmoon;
And this also wouldn't work: --hostname 123.456.789-sslmoon (which is essentially what Kamal gives us).
You can try it by starting the Docker container locally and specifying the hostname, as shown below.
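Something like this, assuming you have the image built locally (the image and app names are placeholders):
# reproduce the broken case locally with a Kamal-style hostname
docker run --rm -it --hostname 123.456.789-sslmoon app_image /bin/bash
# then, inside the container:
bin/app_name start &
bin/app_name remote      # fails until RELEASE_DISTRIBUTION=name is set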
Ah NVM, I see it now. Glad you figured it out!