Not able to log in to remote pod

I'm having trouble connecting to a remote node in Kubernetes. The app runs in a Docker container built from a Debian-based image with Elixir installed. When I attempt to connect to the node, I get the following errors:

 kubectl exec -it av-soul-5cd76f4dbc-rv7lg -- /bin/bash
nobody@av-soul-5cd76f4dbc-rv7lg:/app$ /app/bin/soulforceauth remote
=INFO REPORT==== 29-May-2024::09:37:15.124583 ===
Can't set long node name!
Please check your configuration

=SUPERVISOR REPORT==== 29-May-2024::09:37:15.125207 ===
    supervisor: {local,net_sup}
    errorContext: start_error
    reason: {'EXIT',nodistribution}
    offender: [{pid,undefined},
               {id,net_kernel},
               {mfargs,{net_kernel,start_link,
                                   [['rem-a129--@',longnames],true,net_sup]}},
               {restart_type,permanent},
               {significant,false},
               {shutdown,2000},
               {child_type,worker}]

{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{ker

Crash dump is being written to: erl_crash.dump...done

Please let me know how this can be resolved.

The Dockerfile used:

# Find eligible builder and runner images on Docker Hub. We use Ubuntu/Debian
# instead of Alpine to avoid DNS resolution issues in production.
#
# https://hub.docker.com/r/hexpm/elixir/tags?page=1&name=ubuntu
# https://hub.docker.com/_/ubuntu?tab=tags
#
# This file is based on these images:
#
#   - https://hub.docker.com/r/hexpm/elixir/tags - for the build image
#   - https://hub.docker.com/_/debian?tab=tags&page=1&name=bullseye-20210902-slim - for the release image
#   - https://pkgs.org/ - resource for finding needed packages
#   - Ex: hexpm/elixir:1.14.0-erlang-24.0.5-debian-bullseye-20210902-slim
#
ARG ELIXIR_VERSION=1.14.0
ARG OTP_VERSION=24.0.5
ARG DEBIAN_VERSION=bullseye-20210902-slim

ARG BUILDER_IMAGE="hexpm/elixir:${ELIXIR_VERSION}-erlang-${OTP_VERSION}-debian-${DEBIAN_VERSION}"
ARG RUNNER_IMAGE="debian:${DEBIAN_VERSION}"

FROM ${BUILDER_IMAGE} AS builder

# install build dependencies
RUN apt-get update -y && apt-get install -y build-essential git \
    && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# prepare build dir
WORKDIR /app

# install hex + rebar
RUN mix local.hex --force && \
    mix local.rebar --force

# set build ENV
ENV MIX_ENV="prod"

# install mix dependencies
COPY mix.exs mix.lock ./
RUN mix deps.get --only $MIX_ENV
RUN mkdir config

# copy compile-time config files before we compile dependencies
# to ensure any relevant config change will trigger the dependencies
# to be re-compiled.
COPY config/config.exs config/${MIX_ENV}.exs config/
RUN mix deps.compile

COPY priv priv

COPY lib lib

COPY assets assets

# compile assets
RUN mix assets.deploy

# Compile the release
RUN mix compile

# Changes to config/runtime.exs don't require recompiling the code
COPY config/runtime.exs config/

COPY rel rel
RUN mix release

# start a new build stage so that the final image will only contain
# the compiled release and other runtime necessities
FROM ${RUNNER_IMAGE}

RUN apt-get update -y && \
  apt-get install -y libstdc++6 openssl libncurses5 locales ca-certificates \
  && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Set the locale
RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen

ENV LANG=en_US.UTF-8
ENV LANGUAGE=en_US:en
ENV LC_ALL=en_US.UTF-8

# Install Chrome
RUN apt-get update && \
    apt-get install -y wget gnupg && \
    wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
    echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list && \
    apt-get update && \
    apt-get install -y google-chrome-stable && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app
RUN chown nobody /app

# set runner ENV
ENV MIX_ENV="prod"

# Only copy the final release from the build stage
COPY --from=builder --chown=nobody:root /app/_build/${MIX_ENV}/rel/soulforceauth ./

USER nobody

CMD /app/bin/soulforceauth eval "Soulforceauth.Release.migrate()" && /app/bin/soulforceauth start

Hey mate! I feel like it's related to your container hostname not being an FQDN, perhaps?
Have you looked into that?

What are the contents of your rel/env.sh.eex file, and what are these env values inside the node?

echo $RELEASE_NAME
echo $RELEASE_NODE
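For comparison, a Kubernetes-friendly rel/env.sh.eex usually pins the distribution mode and a fully-qualified node name explicitly. This is only a minimal sketch, assuming the pod IP is exposed as a POD_IP environment variable via the Kubernetes Downward API, and that your release is named soulforceauth as in your Dockerfile:

```shell
#!/bin/sh
# Sketch of rel/env.sh.eex for Kubernetes.
# Assumption: POD_IP is injected via the Downward API
# (env -> valueFrom -> fieldRef -> status.podIP in the pod spec).
export RELEASE_DISTRIBUTION="name"
export RELEASE_NODE="soulforceauth@${POD_IP}"
```

With that in place, /app/bin/soulforceauth remote should be able to reach the running node from inside the pod.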

Starting with OTP 25 (or possibly later), it seems you need to specify the node name when building a release if you want to be able to connect to the node. Here is an example: rel/env.sh.eex · main · SSL MOON / SSL MOON · GitLab


Thanks to everyone! I’ve identified the issue, which was in the env.sh.eex file. It’s now resolved. Thanks for guiding me in the right direction!


Could you please post the solution here, especially if it's something specific to Erlang distribution running in Kubernetes? It would help others searching for the same problem in the future.


Issue #1: Unable to log in to the remote pod.
A file named env.sh.eex had been created during an earlier attempt to deploy to fly.io. Although unnecessary for the Kubernetes deployment, it contained the following:

#!/bin/sh
# configure node for distributed erlang with IPV6 support
export ERL_AFLAGS="-proto_dist inet6_tcp"
export ECTO_IPV6="true"
export DNS_CLUSTER_QUERY="${FLY_APP_NAME}.internal"
export RELEASE_DISTRIBUTION="name"
export RELEASE_NODE="${FLY_APP_NAME}-${FLY_IMAGE_REF##*-}@${FLY_PRIVATE_IP}"

These environment variables configure the Erlang runtime and the application for fly.io, and they were what caused the failure to log in to the remote pod when we deployed on Kubernetes.
Once I removed that file and redeployed, everything worked fine.
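For context on why this file breaks under Kubernetes: the FLY_* variables are unset there, so the computed RELEASE_NODE ends up with nothing after the @, the same malformed shape as the rem-a129--@ name in the crash log, and net_kernel refuses to start. A minimal sketch reproducing the expansion with those variables unset:

```shell
#!/bin/sh
# Simulate a Kubernetes pod where no FLY_* variables exist.
unset FLY_APP_NAME FLY_IMAGE_REF FLY_PRIVATE_IP

# The fly.io-style node name collapses to "-@": an empty host part
# after the @, which net_kernel rejects ("Can't set long node name!").
RELEASE_NODE="${FLY_APP_NAME}-${FLY_IMAGE_REF##*-}@${FLY_PRIVATE_IP}"
echo "$RELEASE_NODE"
```

So either delete the file for Kubernetes, as above, or rewrite it to build the node name from values that actually exist in the pod (e.g. the pod IP via the Downward API).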
