Unable to connect observer to the remote node from local

I followed this link to connect observer to the remote app, using these steps:

epmd -names # on local
# returns: epmd: up and running on port 4369 with data:

ssh -L 4369:localhost:4369 -L 58769:localhost:58769 ubuntu@server-ip # on local for port forwarding
# epmd: up and running on port 4369 with data:
# name elixirapp at port 45547

erl -name debug@127.0.0.1 -setcookie sec_cookie  -hidden -run observer # on local to open observer

With the last command, the observer window opens. When I click on Node in the menu, I get an option to connect to elixirapp@127.0.0.1. When I click on that, I get the following error in the console:

15:50:39.963 [error] [node: :"elixirapp@127.0.0.1", call: {:observer_backend, :sys_info, []}, reason: {:badrpc, :nodedown}]

Extra Info:
Here is my vm.args on the server

## Name of the node
-name elixirapp@127.0.0.1

## Cookie for distributed erlang
-setcookie sec_cookie

## Heartbeat management; auto-restarts VM if it dies or becomes unresponsive
## (Disabled by default..use with caution!)
##-heart

## Enable kernel poll and a few async threads
##+K true
##+A 5

## Increase number of concurrent ports/sockets
##-env ERL_MAX_PORTS 4096

## Tweak GC to run more often
##-env ERL_FULLSWEEP_AFTER 10

# Enable SMP automatically based on availability
-smp auto

And I do have :runtime_tools present in mix.exs

...
  def application do
    [
      mod: {ElixirApp.Application, []},
      extra_applications: [:logger, :runtime_tools]
    ]
  end

...

Can anyone please let me know why I am getting the nodedown error?

Is there any step that I am missing to connect remote node via observer?


I generally use https://github.com/zhongwencool/observer_cli
and the underlying https://github.com/ferd/recon library. Not as pretty, but easier to run on prod systems. Also look at https://www.erlang-in-anger.com/


Thanks for the links. I will check it out.

But, any idea why the solution I am working on isn’t working?

Can you run :net_adm.ping(:"elixirapp@127.0.0.1") from your terminal to make sure it’s actually responding?

If not then check the epmd registration settings (fullname and all such).

If so then check that the observer library is included with the release product.


ssh -L 4369:localhost:4369 -L 58769:localhost:58769 ubuntu@server-ip # on local for port forwarding
# epmd: up and running on port 4369 with data:
# name elixirapp at port 45547

Perhaps a typo, but the comment says “elixirapp at port 45547” while the forwarding uses port 58769.

I’ve followed this in the past:
https://jmilet.github.io/elixir/erlang/observer/ssh/2016/10/09/observer-ssh-tunnel.html
which is a script which sets up the tunnel for you.
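To avoid typing the distribution port by hand, the forwarding flags can be derived from the epmd output itself. The following is only a minimal sketch, assuming the `epmd -names` output format shown earlier in this thread; the helper name `forward_flags` is made up for illustration.

```shell
# Build ssh -L flags from `epmd -names` output read on stdin.
# Assumes the two line shapes seen in this thread:
#   epmd: up and running on port 4369 with data:
#   name elixirapp at port 45547
forward_flags() {
  awk '
    /^epmd: up and running on port/ { printf "-L %s:localhost:%s", $7, $7 }
    /^name .* at port/              { printf " -L %s:localhost:%s", $5, $5 }
    END                             { printf "\n" }
  '
}

# Possible usage (not run here; needs ssh access to the server):
#   ssh $(ssh ubuntu@server-ip epmd -names | forward_flags) ubuntu@server-ip
```

The node's port can change when the node restarts, so it is worth regenerating the flags each time rather than reusing an old number.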


No. I tried :net_adm.ping(:"elixirapp@127.0.0.1") but received :pang instead of :pong.

Yes. My bad. I did fix the typo.

But now I am not getting the list of available nodes in the menu bar at all. Also, I am unable to ping the remote node.

This should work. Just to be on the safe side, I’ve tried it out on one production system, and it worked as expected.

So I expect you either have a typo (e.g. cookie, node name, or the wrong port forwarded), or there is some unusual condition none of us have seen before. To be on the safe side, try the following:

  1. Close all ssh sessions
  2. Kill the local epmd process
  3. Verify that on dev box no process is listening on the required ports (4369 and 45547)
  4. Establish ssh session and forward the required ports
  5. Verify that you can access these ports on the dev machine
  6. Verify the local node list with epmd -names
  7. On the production node, start a remote console, and check the node name and the cookie.
  8. Start a hidden Erlang node with the correct cookie and try to connect with Node.connect/1.

If step 8 fails, then observer won’t work either.
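Steps 3 and 5 of the list above can be checked with a small port probe. This is a rough sketch, assuming bash (it relies on bash's `/dev/tcp` pseudo-device); `port_open` and `check_ports` are illustrative names, not part of any tool mentioned here.

```shell
# Succeeds if something accepts TCP connections on host:port.
# Relies on bash's /dev/tcp pseudo-device, so run this with bash.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Print "<port> open" or "<port> closed" for each port on 127.0.0.1.
check_ports() {
  for port in "$@"; do
    if port_open 127.0.0.1 "$port"; then
      echo "$port open"
    else
      echo "$port closed"
    fi
  done
}

# Before the ssh session (step 3) both ports should report closed;
# after forwarding (step 5) both should report open:
#   check_ports 4369 45547
```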

One other thing I remember is that I wasn’t able to make this work with older Erlang versions (IIRC it was 17 or maybe even 18), so you may want to double-check which Erlang version you have and try with a newer one.


@sasajuric

I have one doubt.

This link says that for nodes to connect remotely, one needs to make sure that the hostname resolves correctly. In my setup, I haven’t done anything like that. My local system’s hostname is different from the server’s hostname. Does that matter?

Do I need to configure /etc/hosts and /etc/resolv.conf too?

If the production node name is yourapp@foo.bar.baz, then you need to add an /etc/hosts entry locally to resolve the symbolic hostname to 127.0.0.1, but only after you’ve established the ssh connection (or alternatively use ssh with the IP address).

If the production node name is yourapp@127.0.0.1, then it should just work without any changes to /etc/hosts.
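The rule in the two paragraphs above can be written down as a small helper. This is a sketch under assumptions: `hosts_entry_for` is a hypothetical name, and it treats `localhost` and purely numeric hosts as needing no entry.

```shell
# Print the /etc/hosts line needed to reach a tunnelled node, if any.
# Argument: the full node name, e.g. yourapp@foo.bar.baz
hosts_entry_for() {
  host=${1#*@}                      # keep only the part after the "@"
  case "$host" in
    localhost) ;;                   # already resolves locally
    *[!0-9.]*) printf '127.0.0.1 %s\n' "$host" ;;  # symbolic name: map it to loopback
    *) ;;                           # numeric address such as 127.0.0.1: nothing to add
  esac
}

hosts_entry_for yourapp@foo.bar.baz   # prints: 127.0.0.1 foo.bar.baz
hosts_entry_for yourapp@127.0.0.1     # prints nothing
```

The printed line would be appended to /etc/hosts on the local machine only, and removed again once the tunnel is closed.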


@sasajuric

Thanks.

But I am unable to connect still.

The weird thing is that I am unable to connect locally on the server either.

On Server, I got the node address and cookie by running:

./app_release/cflogs/bin/cflogs remote_console

> Node.self # elixirapp@127.0.0.1
> Node.get_cookie # test

On Server, now, I exit the console and then run:

iex --name debug@127.0.0.1 --cookie test
> Node.connect :"elixirapp@127.0.0.1" # false

This is strange. Am I missing something? Why is it returning false?

Edit:
Turned out I had different versions of elixir installed on the server. I fixed and ensured that only one version is present. I am able to connect to node on the same machine on the server now.

However, I am still unable to connect from local to prod. And, I have the same erlang version installed on both the machines.


Are you tunneling over SSH or…?

Because if not, you cannot access a 127.0.0.1 address remotely.

OK, so I finally got it running. Apart from fixing my initial mistake (I had two versions of Erlang running on the server), I did two more things this time. First, I configured the server to accept all incoming and outgoing connections on any port:

sudo iptables -A OUTPUT -j ACCEPT
sudo iptables -A INPUT -j ACCEPT

I don’t know if this really helped or not.

Next, I changed the cookie value from vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!([C~zOoqh]a^XH5}3 to test

And yes, if I change the cookie to a short, simple one, it works.

So when the cookie name was long (on the server) I ran this on local (after ssh tunneling):
iex --name debug@127.0.0.1 --cookie 'vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!(C~zOoqh]a^XH5}3' --hidden -e ":observer.start" # doesn't connect to elixirapp@127.0.0.1

I encapsulated the cookie name inside single quotes so that it is evaluated properly by zsh. I thought that this wrapping of the cookie inside the quotes could be the issue. But then on local I ran ps aux | grep iex and checked the cookie name. It returned:

usr/local/Cellar/erlang/20.3.2/lib/erlang/erts-9.3/bin/beam.smp -- -root /usr/local/Cellar/erlang/20.3.2/lib/erlang -progname erl -- -home /Users/aash -- -pa /usr/local/Cellar/elixir/1.6.4/bin/../lib/eex/ebin /usr/local/Cellar/elixir/1.6.4/bin/../lib/elixir/ebin /usr/local/Cellar/elixir/1.6.4/bin/../lib/ex_unit/ebin /usr/local/Cellar/elixir/1.6.4/bin/../lib/iex/ebin /usr/local/Cellar/elixir/1.6.4/bin/../lib/logger/ebin /usr/local/Cellar/elixir/1.6.4/bin/../lib/mix/ebin -elixir ansi_enabled true -noshell -user Elixir.IEx.CLI -name debug@127.0.0.1 -setcookie vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!(C~zOoqh]a^XH5}3 -hidden -extra --no-halt --erl -noshell -user Elixir.IEx.CLI +iex --name debug@127.0.0.1 --cookie vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!(C~zOoqh]a^XH5}3 --hidden -e :observer.start

I don’t think there is any problem over here.
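Picking the cookie out of a long ps line by eye is error-prone, so it may help to extract it mechanically. A minimal sketch, assuming the cookie contains no whitespace and follows a `-setcookie` token as in the output above; `extract_cookie` is an illustrative name.

```shell
# Print the token that follows -setcookie in a command line read on stdin.
# Stops at the first match; assumes the cookie itself contains no whitespace.
extract_cookie() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "-setcookie") { print $(i + 1); exit } }'
}

# Possible usage on a machine with a running node:
#   ps aux | grep '[b]eam.smp' | extract_cookie
```

The extracted value can then be compared character by character with what Node.get_cookie reports on the server.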

Maybe long and complex cookies cause problems when communicating over the network. I don’t know.

But I got it working finally by keeping the cookie name short. Thanks everyone for the help!


That sounds… buggy… I’m using a long secret myself, so… o.O

I just did this on my machine:

Terminal A:

iex --cookie 'vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!([C~zOoqh]a^XH5}3' --name a@127.0.0.1
Erlang/OTP 20 [erts-9.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]

Interactive Elixir (1.6.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(a@127.0.0.1)1> Node.get_cookie
:"vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!([C~zOoqh]a^XH5}3"

Terminal B:

iex --cookie 'vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!([C~zOoqh]a^XH5}3' --name b@127.0.0.1
Erlang/OTP 20 [erts-9.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]

Interactive Elixir (1.6.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(b@127.0.0.1)1> Node.connect :"a@127.0.0.1"
true
iex(b@127.0.0.1)2> Node.list
[:"a@127.0.0.1"]

So it seems to work with long keys as well, even the one that has been mentioned as absolutely not working in this thread.

This was on zsh as well.
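Since a long cookie connects fine, a character-level mismatch between the two sides is worth ruling out. As they appear in this thread, the cookie quoted as the server's value and the one on the earlier iex command line are not identical: the second lacks the `[` before `C~`. Whether that reflects what was actually typed or is only a slip in the post is not known. A small sketch for comparing two values (`cookie_report` is a made-up helper name):

```shell
# Report whether two cookie strings are identical, with lengths when they are not.
cookie_report() {
  if [ "$1" = "$2" ]; then
    echo "match"
  else
    echo "differ (lengths ${#1} vs ${#2})"
  fi
}

# Values copied from this thread: the server-side cookie as first quoted,
# and the cookie as it appears on the local iex command line.
server_cookie='vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!([C~zOoqh]a^XH5}3'
local_cookie='vk1285h/I)a6Q>T=;yY!hLDNjkW5?xz:nunHon4Jz(};KK!(C~zOoqh]a^XH5}3'
cookie_report "$server_cookie" "$local_cookie"
```

Single quotes keep the shell from interpreting any of the special characters, so the comparison sees exactly what was pasted.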

Oh, that’s weird. I will check again if there is something else that I did wrong.