Connecting Livebook to a node on a remote Docker container

I have an Elixir app running on a remote server using Dokku. I’d like to connect to it from my local machine using Livebook, via an SSH tunnel.

I can’t seem to make the remote container visible to my local Livebook.

I’ve tried the suggestions in the top StackOverflow comments and have tried setting RELEASE_NODE, RELEASE_NAME, and RELEASE_DISTRIBUTION. I have RELEASE_COOKIE configured and am using that cookie.

I have tried using RELEASE_NODE as <app_name>@localhost, <app_name>@127.0.0.1, and <app_name>@<server_ip>.

I gather there are some shenanigans involved in getting EPMD to see the remote node. I’ve tried setting ERL_DIST_PORT to 9000, forwarding port 9000 on the Dokku host to port 9000 in the container, and opening an SSH tunnel that forwards port 9000.

Is there a straightforward-ish solution, or a problem I’m missing? I appreciate there are a few moving parts.

Hey @k-p! There are a few elements:

  1. The remote node needs to start distribution on a known port. I believe ERL_DIST_PORT is used by rebar3/relx, not the Elixir releases. To force a specific port you can do ELIXIR_ERL_OPTIONS="-erl_epmd_port 9000" (or pass it in vm.args).

  2. The distribution port needs to be forwarded to a local port; let’s assume it’s the same port number.

  3. The remote node needs to use 127.0.0.1 as its hostname (or, if it uses a domain name, you can edit your local /etc/hosts to resolve that name to 127.0.0.1).

  4. With the above steps, the node appears as if it was running locally. However, the missing piece is that the local EPMD doesn’t know about this node. You can register the node manually by running this script:

    # epmd_register_node.exs
    
    defmodule EPMD do
      def epmd_register_node(node_name, port) do
        # Registers the given node under the given port in EPMD.
        #
        # We open a TCP connection to EPMD and send the registration
        # request. We keep the socket open. The node is automatically
        # unregistered when the calling process terminates.
        #
        # See the EPMD protocol [1] and the reference implementation [2].
        #
        # [1]: https://www.erlang.org/doc/apps/erts/erl_dist_protocol.html#register-a-node-in-epmd
        # [2]: https://github.com/erlang/otp/blob/OTP-27.0/lib/kernel/src/erl_epmd.erl#L403-L433
    
        epmd_host = {127, 0, 0, 1}
        epmd_port = 4369
    
        case :gen_tcp.connect(epmd_host, epmd_port, [:binary, packet: :raw, active: false]) do
          {:ok, socket} ->
            request =
              <<
                # ALIVE2_REQ
                120::8,
                # Node distribution port
                port::16,
                # Node type (normal)
                77::8,
                # Protocol (TCP/IPv4)
                0::8,
                # Highest and lowest version of the distribution protocol,
                # see https://github.com/erlang/otp/blob/OTP-27.0/lib/kernel/include/dist.hrl#L88-L89
                6::16,
                6::16,
                # Node name
                byte_size(node_name)::16,
                node_name::binary,
                # Extra
                0::16
              >>
    
            data = <<byte_size(request)::16, request::binary>>
            :ok = :gen_tcp.send(socket, data)
    
            case :gen_tcp.recv(socket, 0) do
              {:ok, data} ->
                result =
                  case data do
                    # ALIVE2_X_RESP
                    <<118, result::8, _creation::32>> -> result
                    # ALIVE2_RESP
                    <<121, result::8, _creation::16>> -> result
                  end
    
                if result == 0 do
                  :ok
                else
                  :gen_tcp.close(socket)
                  {:error, "failed to register node in EPMD, result code: #{result}"}
                end
    
              {:error, reason} ->
                :gen_tcp.close(socket)
                {:error, "failed to receive response from EPMD, reason: #{inspect(reason)}"}
            end
    
          {:error, reason} ->
            {:error, "failed to connect to EPMD, reason: #{inspect(reason)}"}
        end
      end
    end
    
    
    EPMD.epmd_register_node("mynodename", 9000) |> IO.inspect()
    
    Process.sleep(:infinity)
    

    Make sure to change the node name at the end. It should be the base name, without the hostname part (for example, app rather than app@127.0.0.1). The registration is kept until you kill the script.

Technically, steps 3 and 4 could be done by using a custom EPMD module, but since we are talking about Livebook, the solution assumes no changes to the EPMD module. (In fact, Livebook already uses a custom EPMD for a similar purpose.)
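To check steps 1 to 4 without Livebook in the loop, a plain Elixir session on the local machine can try the connection directly. This is only a minimal sketch, not something taken from Livebook: TunnelCheck, the check@127.0.0.1 node name, and :my_cookie are placeholders made up for illustration, and the cookie has to be replaced with the value of RELEASE_COOKIE. It assumes a local EPMD is running, which the registration script needs anyway.

```elixir
defmodule TunnelCheck do
  # Hypothetical helper for testing the tunnel from a plain Elixir session.
  def connect(remote, cookie) do
    with :ok <- ensure_distribution() do
      # Use the release cookie for this remote node only.
      Node.set_cookie(remote, cookie)
      # Returns true on success, false on failure, :ignored if not distributed.
      Node.connect(remote)
    end
  end

  # Start distribution with a long name unless this VM is already a named node.
  defp ensure_distribution do
    if Node.alive?() do
      :ok
    else
      case Node.start(:"check@127.0.0.1", :longnames) do
        {:ok, _pid} -> :ok
        {:error, reason} -> {:error, reason}
      end
    end
  end
end

# :my_cookie is a placeholder, use the RELEASE_COOKIE value here.
TunnelCheck.connect(:"app@127.0.0.1", :my_cookie) |> IO.inspect()
```

A result of true means the two nodes connected. A result of false usually points at the tunnel, the hostname, or the cookie, and an error tuple means the local node could not start distribution at all.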

Hello! Thanks for your help and your script to register the node.

I get as far as seeing “connection refused” in the SSH tunnel when I attempt to connect to the node, so I can see that Livebook is trying to reach the remote node over the port. But I think it’s still stuck somewhere in steps 1-3.

  1. Done, and I can see that the port has been picked up if I cat /proc/net/tcp after entering the container. If I try to run bin/app remote for an IEx session, I get an “address in use” error.
  2. Done
  3. This might be where I’m stuck. I have set RELEASE_NODE to app@127.0.0.1.

I’m quite a novice with distributed Elixir, apologies. If I enter the container, and run epmd -names I can see that epmd is still running on port 4369 and the node (name app at port 9000) is running on port 9000, with the config set in step one. Is that right? The argument to me reads as though epmd should be on port 9000.

Just to make sure, you are using RELEASE_DISTRIBUTION=name, RELEASE_NODE=app@127.0.0.1 and the same cookie?

> If I enter the container, and run epmd -names I can see that epmd is still running on port 4369 and the node (name app at port 9000) is running on port 9000, with the config set in step one. Is that right? The argument to me reads as though epmd should be on port 9000.

This is correct. The argument name is confusing, and in fact it’s going to change in the next OTP version. It is supposed to set the port that the node uses for distribution, not the EPMD port itself. So name app at port 9000 looks good.
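The same split can be seen from Elixir: EPMD is asked on its own port (4369 by default), and each entry it returns carries the distribution port of a registered node. Here is a small sketch using :erl_epmd.names/1 from OTP's kernel application. The EpmdInfo module name is made up for illustration, and the default host is an assumption that EPMD runs on the same machine.

```elixir
defmodule EpmdInfo do
  # Hypothetical helper: lists the nodes registered in the EPMD on `host`.
  def names(host \\ {127, 0, 0, 1}) do
    case :erl_epmd.names(host) do
      {:ok, entries} ->
        # Names come back as charlists, so convert them to strings.
        {:ok, Enum.map(entries, fn {name, port} -> {List.to_string(name), port} end)}

      {:error, reason} ->
        # For example, no EPMD is listening on that host.
        {:error, reason}
    end
  end
end

IO.inspect(EpmdInfo.names())
```

Run on the local machine while the registration script is alive, the list should include an entry for the name you registered, paired with the forwarded distribution port rather than 4369.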

If the connection is refused, the only thing I can think of is something with the forwarding. For the SSH forwarding, maybe try 127.0.0.1 instead of localhost, unless you already do that. Otherwise, it could be the port forwarding to Docker.
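To narrow down which hop refuses the connection, the forwarded port can be probed directly. Below is a rough sketch, assuming the tunnel listens on 127.0.0.1:9000. PortProbe and its return values are made up for illustration. It relies on the fact that the accepting side of the distribution handshake waits for the connecting side to speak first, so a healthy endpoint stays silent and keeps the socket open, whereas a tunnel whose far end is refused tends to accept locally and then drop the connection.

```elixir
defmodule PortProbe do
  # Hypothetical diagnostic helper, not part of Livebook or OTP.
  def probe(host, port, timeout \\ 1_000) do
    case :gen_tcp.connect(host, port, [:binary, active: false], timeout) do
      {:ok, socket} ->
        result =
          case :gen_tcp.recv(socket, 0, timeout) do
            # Nothing received and still connected: something is holding the port open.
            {:error, :timeout} -> :open
            # Closed right after connecting: a later hop likely refused the connection.
            {:error, reason} -> {:dropped, reason}
            # Unexpected data, but the endpoint is clearly reachable.
            {:ok, _data} -> :open
          end

        :gen_tcp.close(socket)
        result

      {:error, reason} ->
        # Nothing is listening locally, for example the tunnel is not running.
        {:unreachable, reason}
    end
  end
end

IO.inspect(PortProbe.probe({127, 0, 0, 1}, 9000))
```

Running the same probe inside the container, and then on the Dokku host, should show at which hop the port stops being reachable.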
