Node name and cookie in mix.release inside Docker container

mhmtarif · September 17, 2020, 3:04pm

Hi all,
I build a mix release inside a Docker container and run the container in production, but I cannot connect to the running node via a remote shell.

Inside the running Docker container, when I run bin/my_app eval 'IO.puts("#{node()}")' I get nonode@nohost.

Similarly, when I run eval with Node.get_cookie I get nocookie. I already set the RELEASE_DISTRIBUTION and RELEASE_COOKIE environment variables, but it seems the release does not read them.

I also created a .erlang.cookie file inside the home directory, but nothing changed.

When I try to restart the running release with "bin/my_app restart", I get the error: --rpc-eval : RPC failed with reason :nodedown

I will appreciate any help.
Thank you

Kind regards

lovyou · September 17, 2020, 5:25pm

The most important environment variable in this case is RELEASE_DISTRIBUTION, because it defines the mode in which a node is started. However, if it is untouched (that’s what I expect in your case), it is set to sname.

May I ask you to run the following code on your node:

System.get_env()
|> Enum.map(fn {k, v} -> "#{k}=#{v}" end)
|> Enum.filter(&String.starts_with?(&1, "RELEASE_"))

I expect to see the following result (not all values must be the same):

["RELEASE_BOOT_SCRIPT_CLEAN=start_clean",
 "RELEASE_ROOT=.../_build/dev/rel/test",
 "RELEASE_SYS_CONFIG=.../_build/dev/rel/test/releases/0.1.0/sys",
 "RELEASE_VSN=0.1.0", "RELEASE_DISTRIBUTION=sname",
 "RELEASE_COOKIE=PGL67P4KHY62MQBXIOTTJ3OTHQB52LEHKW57WRXUH5ZQDWIS42CQ====",
 "RELEASE_VM_ARGS=.../_build/dev/rel/test/releases/0.1.0/vm.args",
 "RELEASE_BOOT_SCRIPT=start",
 "RELEASE_TMP=.../_build/dev/rel/test/tmp",
 "RELEASE_COMMAND=start_iex", "RELEASE_MODE=embedded", "RELEASE_NAME=test",
 "RELEASE_NODE=test"]

mhmtarif · September 17, 2020, 5:59pm

Hi,
Thank you for the reply,

When I run that code I get the following:

["RELEASE_BOOT_SCRIPT=start",
 "RELEASE_BOOT_SCRIPT_CLEAN=start_clean",
 "RELEASE_COMMAND=eval",
 "RELEASE_COOKIE=j1DD2056I9sPS8c1Sv3mFISd-w14e5QNrKUkNejIIYrJAv980deA==",
 "RELEASE_DISTRIBUTION=name",
 "RELEASE_MODE=embedded",
 "RELEASE_NAME=my_app",
 "RELEASE_NODE=my_app@10.40.1.120",
 "RELEASE_ROOT=/root",
 "RELEASE_SYS_CONFIG=/root/releases/0.1.0/sys",
 "RELEASE_TMP=/root/tmp",
 "RELEASE_VM_ARGS=/root/releases/0.1.0/vm.args",
 "RELEASE_VSN=0.1.0"]

mhmtarif · September 17, 2020, 6:00pm

But when I run:

./my_app eval "node() |> IO.puts"
I get the following:
nonode@nohost

When I run:
./my_app eval "Node.get_cookie |> IO.puts"
I get the following:
nocookie

When I run:
./my_app restart
I get:
--rpc-eval : RPC failed with reason :nodedown

lovyou · September 17, 2020, 6:06pm

But if you want to run a command on the existing node, you should use rpc instead of eval. Please check the following description:

% ./_build/dev/rel/test/bin/test
Usage: test COMMAND [ARGS]

The known commands are:

    start          Starts the system
    start_iex      Starts the system with IEx attached
    daemon         Starts the system as a daemon
    daemon_iex     Starts the system as a daemon with IEx attached
    eval "EXPR"    Executes the given expression on a new, non-booted system
    rpc "EXPR"     Executes the given expression remotely on the running system
    remote         Connects to the running system via a remote shell
    restart        Restarts the running system via a remote command
    stop           Stops the running system via a remote command
    pid            Prints the operating system PID of the running system via a remote command
    version        Prints the release name and version to be booted

As you can see, eval starts a new instance and runs the given command there, while rpc connects to the existing node and runs the given command on it.
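To see the difference side by side, the same expression can be run through both commands. This is only a minimal sketch: compare_eval_rpc is a hypothetical helper name, and the release path you pass in (for example ./bin/my_app) is an assumption taken from this thread.

```shell
#!/bin/sh
# Sketch: run one expression with eval and with rpc and print both results.
# compare_eval_rpc is a hypothetical helper; $1 is the path to the release script.
compare_eval_rpc() {
  app="$1"
  expr='IO.puts(node())'

  # eval boots a separate, non-distributed VM, so it prints nonode@nohost
  # regardless of how the running node was started.
  printf 'eval: '
  "$app" eval "$expr"

  # rpc connects to the already running node and evaluates the expression there.
  # It fails with :nodedown if that node cannot be reached.
  printf 'rpc:  '
  "$app" rpc "$expr"
}

# Usage: compare_eval_rpc ./bin/my_app
```

If the eval line prints nonode@nohost while the rpc line prints the real node name, the release itself is fine and eval was simply the wrong tool for inspecting the running node.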

mhmtarif · September 17, 2020, 6:10pm

When I run

~/bin # ./my_app rpc "Node.get_cookie |> IO.puts"
I get:
--rpc-eval : RPC failed with reason :nodedown

mhmtarif · September 17, 2020, 6:17pm

I want to connect to the running system from other nodes.

Even inside the Docker container, when I run:

~/bin # ./my_app remote
Erlang/OTP 22 [erts-10.7.2.3] [source] [64-bit] [smp:6:6] [ds:6:6:10] [async-threads:1]

Could not contact remote node my_app@10.40.1.120, reason: :nodedown. Aborting...

mhmtarif · September 17, 2020, 6:56pm

There is a related topic here:

I still don't have a solution, though.

mhmtarif · September 17, 2020, 7:00pm

My server is Ubuntu as well

mhmtarif · September 17, 2020, 7:13pm

I guess this is related as well?

lovyou · September 17, 2020, 10:39pm

If you ask me, it looks like your server is just down. It is really hard to help you unless you provide some code that allows us to reproduce the issue.

Generally, the thing that bothers me the most is that you run this in Docker, so the primary process must be present, otherwise your container would be killed. Can you tell us how you start your container and the application inside it?

mhmtarif · September 17, 2020, 10:48pm

No, the server is not down. It is running in production. The only problem is that I cannot connect the running nodes to each other.

I start it with bin/my_app start

lovyou · September 17, 2020, 11:08pm

Okay, then it must be up and running, because this command starts your application in the foreground.

Can you run the following commands and paste results back?

$ ./erts-*/bin/epmd -names
$ ping -c3 -w3 10.40.1.120
$ nc -zv 10.40.1.120 4369; echo $?
$ nc -zv 10.40.1.120 <port_returned_by_epmd>; echo $?
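The port for the last command comes from the epmd -names output, which lists each registered node as "name <node> at port <port>". A small sketch for pulling it out automatically; epmd_port_for is a hypothetical helper name, and the usage lines reuse the node name and IP from this thread as assumptions.

```shell
#!/bin/sh
# Sketch: extract the distribution port of one node from `epmd -names` output.
# epmd_port_for is a hypothetical helper. It reads the epmd output on stdin;
# $1 is the node name without the @host part.
epmd_port_for() {
  awk -v name="$1" \
    '$1 == "name" && $2 == name && $3 == "at" && $4 == "port" { print $5 }'
}

# Possible usage inside the container:
#   port="$(./erts-*/bin/epmd -names | epmd_port_for my_app)"
#   nc -zv 10.40.1.120 "$port"; echo $?
```

The helper prints nothing when the node is not registered with epmd, which is itself a useful signal.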

chulkilee · September 18, 2020, 3:31am

What a coincidence, I was writing a guide on this subject - for docker and k8s (with libcluster).

Here, I used “cluster_demo” image with phx + mix release.

First of all, to use automatic DNS with an FQDN, you should use a Docker network. Without it, you cannot reach a container from another container by its container name.

Otherwise, you have to use the IP address inside RELEASE_NODE, which is doable but needs a wrapper (either in Docker or in env.sh), and other containers need to know the IP address (which does not stay the same!).
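For the IP-based variant, the env.sh wrapper could look roughly like the sketch below. It is written as plain shell rather than as the rel/env.sh.eex template; build_release_node is a hypothetical helper, my_app is the release name from earlier in this thread, and relying on `hostname -i` to print a single usable address is an assumption that has to be checked for your image.

```shell
#!/bin/sh
# Sketch of an env.sh-style wrapper that puts the container IP into RELEASE_NODE.
# build_release_node is a hypothetical helper, not part of mix release.
build_release_node() {
  # $1 = release name, $2 = IP address or host name
  printf '%s@%s\n' "$1" "$2"
}

# Long names are needed when the host part is an IP address or an FQDN.
export RELEASE_DISTRIBUTION=name

# Assumption: `hostname -i` prints exactly one address inside the container.
# Uncomment once that has been verified:
# export RELEASE_NODE="$(build_release_node my_app "$(hostname -i)")"
```

Any other container that wants to connect still has to learn this address somehow, which is the reason the Docker network approach is easier.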

Here are examples of using docker network for DNS with container name.

# this will create <container name>.my-net DNS entry.
docker network create my-net

By default, RELEASE_DISTRIBUTION is sname (which allows non-FQDN host names), and the node name is built automatically from the release name and the host name (here, the container ID).

docker run --rm \
  --name snamenode \
  --network my-net \
  --env RELEASE_COOKIE=thisissecret \
  --env SECRET_KEY_BASE=+y5AreV1firmKw+kB9idUb0gp3lxi3Y5qhMntozh8P9xHS/+iq2wN1LH3ZALFLo7 \
  -p 4000:4000 \
  cluster-demo

To use name (with FQDN)

docker run --rm \
  --name namenode \
  --network my-net \
  --env RELEASE_COOKIE=thisissecret \
  --env RELEASE_DISTRIBUTION=name \
  --env RELEASE_NODE="custom@namenode.my-net" \
  --env SECRET_KEY_BASE=+y5AreV1firmKw+kB9idUb0gp3lxi3Y5qhMntozh8P9xHS/+iq2wN1LH3ZALFLo7 \
  -p 4001:4001 \
  cluster-demo

To run remote from the same container (docker exec), you don’t need to set anything, since everything is already there.

docker exec -it snamenode bin/app remote
docker exec -it namenode bin/app remote

rpc works well from the same container

docker exec -it snamenode bin/app rpc "%{cookie: Node.get_cookie(), self: Node.self(), list: Node.list()} |> IO.inspect()"
# %{cookie: :thisissecret, list: [], self: :app@2b0c8dcd9e01}

docker exec -it namenode bin/app rpc "%{cookie: Node.get_cookie(), self: Node.self(), list: Node.list()} |> IO.inspect()"
# %{cookie: :thisissecret, list: [], self: :"custom@namenode.my-net"}

To connect to the container from another container, you have to set the required RELEASE_* variables.

docker run --rm -it \
  --name nameremote \
  --network my-net \
  --env RELEASE_COOKIE=thisissecret \
  --env RELEASE_DISTRIBUTION=name \
  --env RELEASE_NODE="custom@namenode.my-net" \
  cluster-demo \
  bin/app remote

docker run --rm -it \
  --name remote \
  --network my-net \
  --env RELEASE_COOKIE=thisissecret \
  --env RELEASE_NODE="app@2b0c8dcd9e01" \
  cluster-demo \
  bin/app remote

If you don’t run both the server and the remote containers in the same Docker network (or you run them without a network, which means bridge mode), the connection between them won’t work, while docker exec (inside the same container) still works.

mhmtarif · September 18, 2020, 9:29am

~ #  ./erts-*/bin/epmd -names
epmd: up and running on port 4369 with data:
name my_app at port 41389

~ # ping -c3 -w3 10.40.1.120
PING 10.40.1.120 (10.40.1.120): 56 data bytes
64 bytes from 10.40.1.120: seq=0 ttl=64 time=0.079 ms
64 bytes from 10.40.1.120: seq=1 ttl=64 time=0.111 ms
64 bytes from 10.40.1.120: seq=2 ttl=64 time=0.098 ms

--- 10.40.1.120 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.079/0.096/0.111 ms


~ # nc -zv 10.40.1.120 4369; echo $?
1

~ # nc -zv 10.40.1.120 41389; echo $?
1

chulkilee · September 18, 2020, 1:09pm

This means epmd is not reachable via the host IP/port.

My result (using the elixir:1.10.4-slim Docker image):

nc -zv 127.0.0.1 4369; echo $?
localhost [127.0.0.1] 4369 (?) open
0

nc -zv 172.19.0.2 4369; echo $?
8011844e9d1a [172.19.0.2] 4369 (?) open
0

How is Erlang/Elixir installed?
Could you test 127.0.0.1 for epmd? Or check netstat -tulpn | grep LISTEN to confirm that epmd is running and listening on the right interfaces.

netstat -tulpn | grep LISTEN
tcp        0      0 0.0.0.0:4369            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:4000            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:43617           0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.11:38951        0.0.0.0:*               LISTEN      -
tcp6       0      0 :::4369                 :::*                    LISTEN      -

mhmtarif · September 18, 2020, 2:07pm

This means epmd is not reachable via the host IP/port.

Thank you very much. This helped me see my mistake.

bernardo · August 12, 2022, 10:38am

Hello, I'm also having the same issue. I get:

/app # nc -zv 10.0.11.16 4369; echo $?
1
/app # nc -zv 10.0.11.16 24031; echo $?
10.0.11.16 (10.0.11.16:24031) open
0
/app # nc -zv 10.0.11.16 14031; echo $?
10.0.11.16 (10.0.11.16:14031) open
0
/app # netstat -tulpn | grep LISTEN
tcp        0      0 0.0.0.0:14031           0.0.0.0:*               LISTEN      9/beam.smp
tcp        0      0 0.0.0.0:4031            0.0.0.0:*               LISTEN      9/beam.smp
tcp        0      0 0.0.0.0:24031           0.0.0.0:*               LISTEN      40/epmd
tcp        0      0 :::24031                :::*                    LISTEN      40/epmd

I see epmd is started on a different port than the default one. Maybe that is the problem? How do I make sure epmd is reachable via the host IP/port?

lidashuang · June 22, 2023, 1:16pm

Same issue:

nobody@8:/app$ bin/pome_server_v2 remote
Erlang/OTP 25 [erts-13.2.2] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [jit:ns]

Could not contact remote node pome_server_v2@8, reason: :nodedown. Aborting...
nobody@8:/app$ erts-13.2.2/bin/epmd -names
epmd: up and running on port 4369 with data:
name pome_server_v2 at port 42547

phoenix 1.7.3
elixir 1.15.0