Implementing the distribution protocol in other languages

MassiveFermion · March 4, 2019, 9:57am

Erlang’s distribution protocol seems simple and straightforward, so it’s really tempting to try to implement it in other languages, which I think opens up many interesting possibilities. I know there are already a few implementations out there, but I want to understand it myself.
So I’m trying to read the specification and the code.
But I haven’t worked with TCP directly before, so I need to clear up some confusion.
Consider the ALIVE2_REQ request. The specification says that it has the following form:

Bytes	1	2	1	1	2	2	2	Nlen	2	Elen
Field	120	PortNo	NodeType	Protocol	HighestVersion	LowestVersion	Nlen	NodeName	Elen	Extra

Does it mean that I should just write these fields back to back with no delimiters between them?
So if I want a node with the name “node” that listens on port 9506, should I write the following to the TCP socket (after converting it to binary, of course)?
1209506770554node0

Is that just it? I ask because I used Node.js to do it, but I’m not sure it worked.
I opened iex with a short name, and then opened node and wrote the code below:

let sock=new net.Socket()
sock.connect(4369)
sock.write(Buffer.from('1209506770556client0'))

It returned true, but when I called Node.list in iex it returned an empty list, which means the node was not introduced.

Am I missing something here?
Thanks

NobbZ · March 4, 2019, 10:20am

No.

The table specifies bytes.

Therefore you need to send the equivalent of the following:

<<120 :: size(8), 9506 :: size(16), nodetype :: size(8), protocol :: size(8), hv :: size(16), lv :: size(16), 4 :: size(8), "node", 0>>

You need to fill in appropriate values for nodetype, protocol, hv, and lv. I was not sure where the values start and end in your example data.

MassiveFermion · March 4, 2019, 10:53am

As far as I know that’s a bitstring. How can I make something compatible with that in other languages? Is it an array?

NobbZ · March 4, 2019, 11:03am

It totally depends on the language you want to implement the protocol in.

Usually you write from a buffer of bytes to sockets.

To be able to tell you more, we need to know the language you have in mind.

MassiveFermion · March 4, 2019, 11:16am

For now I’m using Node.js.
But we’re using raw bytes over TCP here, which means it’s just about the correct representation and has nothing to do with the implementation details of Erlang data types. I’m confused.

NobbZ · March 4, 2019, 11:20am

Yeah, raw bytes are raw bytes. And as JS has not even a notion of integers, I have no clue how it would represent bytes…

As an array of bytes, though, it would look like [120, 37, 34, type, proto, hv_high, hv_low, lv_high, lv_low, 4, 110, 111, 100, 101, 0].

Edit: Maybe I got some byte orders wrong. I also just realized that the len should be 2 bytes rather than one, but I won’t fix that now. A better example can be seen in @hauleth’s C code.

hauleth · March 4, 2019, 11:21am

Example in C:

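#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Builds the ALIVE2_REQ body; the caller must free() the result.
// Note: on the wire, epmd expects a 2-byte big-endian length before this body.
char *alive2_req(size_t *len) {
    const char node_name[] = "node";
    uint16_t name_len = strlen(node_name);
    uint16_t extra_len = 0; // no extra data
    uint16_t port = 9506;

    *len = 13 + name_len + extra_len;
    char *buf = malloc(*len);
    if (buf == NULL) return NULL;

    buf[0] = 120; // ALIVE2_REQ
    buf[1] = (port & 0xff00) >> 8;
    buf[2] = port & 0x00ff; // Erlang protocol uses network order (aka big-endian)
    buf[3] = 72; // C-node
    buf[4] = 0; // TCP/IP
    buf[5] = 0; // highest version, high byte
    buf[6] = 5; // highest version, low byte
    buf[7] = 0; // lowest version, high byte
    buf[8] = 5; // lowest version, low byte
    buf[9] = (name_len & 0xff00) >> 8;
    buf[10] = name_len & 0x00ff;

    memcpy(buf + 11, node_name, name_len); // the name is sent without a NUL terminator

    buf[11 + name_len] = (extra_len & 0xff00) >> 8;
    buf[12 + name_len] = extra_len & 0x00ff;
    return buf;
}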

EDIT:

I haven’t tested it and wrote it from memory, but in general it should give you an idea of how this should be done.
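
For Node.js, roughly the same layout can be built with a Buffer. This is only a minimal sketch and has not been tested against a real epmd. buildAlive2Body is a made-up helper name, and the node type (72), protocol, and version values are simply copied from the C example.

```javascript
// Hypothetical helper: lays out the ALIVE2_REQ fields back to back in a Buffer.
function buildAlive2Body(nodeName, port) {
  const name = Buffer.from(nodeName, 'latin1');
  const extra = Buffer.alloc(0);              // no extra data
  const buf = Buffer.alloc(13 + name.length + extra.length);
  let off = 0;
  off = buf.writeUInt8(120, off);             // ALIVE2_REQ tag
  off = buf.writeUInt16BE(port, off);         // PortNo, big-endian
  off = buf.writeUInt8(72, off);              // NodeType (72, as in the C code)
  off = buf.writeUInt8(0, off);               // Protocol: TCP/IP
  off = buf.writeUInt16BE(5, off);            // HighestVersion
  off = buf.writeUInt16BE(5, off);            // LowestVersion
  off = buf.writeUInt16BE(name.length, off);  // Nlen
  off += name.copy(buf, off);                 // NodeName, no terminator
  off = buf.writeUInt16BE(extra.length, off); // Elen
  extra.copy(buf, off);                       // Extra (empty here)
  return buf;
}

console.log([...buildAlive2Body('node', 9506)]);
```

Note that the request logged by epmd in debug mode further down the thread starts with two extra bytes (00 15) before the 120, so this body on its own is probably not enough on the wire.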

MassiveFermion · March 4, 2019, 12:37pm

I tried this:

let sock=new net.Socket()
sock.connect(4369)
let msg=[120,(9506&0xff00)>>8,9506&0xff00,77,0,0,5,0,5,(4&0xff00)>>8,4&0x00ff,...Buffer.from('node'),(0&0xff00)>>8,0&0xff00]
sock.write(msg)

But it didn’t work either.
So I tried it in Elixir to see whether it works or not:

{:ok,sock}=:gen_tcp.connect('localhost',4369,[:binary])
:gen_tcp.send(sock,<<120 :: size(8), 9506 :: size(16), 77 :: size(8), 0 :: size(8), 5 :: size(16), 5 :: size(16), 4 :: size(8), "node", 0>>)

It returns :ok, but Node.list in the other iex is still empty. Also when I flush, I get {:tcp_closed,#Port<x.xx>}. So apparently there is more to it than the correct message representation!

dch · March 8, 2019, 3:37pm

So there are 2 things here. You think you are talking to an Erlang node, but you are actually talking to epmd.

What you actually want is this:

$ iex --name ding --erl  '-kernel inet_dist_listen_min 4370 inet_dist_listen_max 4370'

And then use port 4370 in your code above in place of the epmd port.

The long version follows for those who seek further enlightenment.

epmd

This is the per-host instance of a name lookup service. You can think of it as Zeroconf/Bonjour/mDNS for BEAM. It is what listens on port 4369 by default.

I suggest you use Wireshark to deconstruct these messages; they are not really secured, and the tool already has built-in epmd debugging: https://wiki.wireshark.org/EPMD

You can also run epmd in foreground debugging mode, like so:

$ epmd -d -d -d
epmd: Fri Mar  8 15:07:14 2019: epmd running - daemon = 0
epmd: Fri Mar  8 15:07:14 2019: try to initiate listening port 4369
epmd: Fri Mar  8 15:07:14 2019: there is already a epmd running at port 4369

Which on my system makes sense, there’s already a running epmd. You can start epmd, then with a single iex non-distributed node, you can manually connect:

iex(1)> :net_kernel.start([:somenode, :shortnames])
{:ok, #PID<0.109.0>}

And your debug epmd spits out this:

$ epmd -d -d -d
epmd: Fri Mar  8 15:10:56 2019: epmd running - daemon = 0
epmd: Fri Mar  8 15:10:56 2019: try to initiate listening port 4369
epmd: Fri Mar  8 15:10:56 2019: entering the main select() loop
epmd: Fri Mar  8 15:11:01 2019: time in seconds: 1552057861
epmd: Fri Mar  8 15:11:04 2019: Local peer connected
epmd: Fri Mar  8 15:11:04 2019: time in seconds: 1552057864
epmd: Fri Mar  8 15:11:04 2019: opening connection on file descriptor 6
epmd: Fri Mar  8 15:11:04 2019: time in seconds: 1552057864
epmd: Fri Mar  8 15:11:04 2019: got 23 bytes
***** 00000000  00 15 78 e9 30 4d 00 00  05 00 05 00 08 73 6f 6d  |..x.0M.......som|
***** 00000010  65 6e 6f 64 65 00 00                              |enode..|
epmd: Fri Mar  8 15:11:04 2019: time in seconds: 1552057864
epmd: Fri Mar  8 15:11:04 2019: ** got ALIVE2_REQ
epmd: Fri Mar  8 15:11:04 2019: time in seconds: 1552057864
epmd: Fri Mar  8 15:11:04 2019: registering 'somenode:2', port 59696
epmd: Fri Mar  8 15:11:04 2019: type 77 proto 0 highvsn 5 lowvsn 5
*****     active name     "somenode#2" at port 59696, fd = 6
*****     reg calculated count  : 1
*****     unreg counter         : 0
*****     unreg calculated count: 0
epmd: Fri Mar  8 15:11:04 2019: got 4 bytes
***** 00000000  79 00 00 02                                       |y...|
epmd: Fri Mar  8 15:11:04 2019: ** sent ALIVE2_RESP for "somenode"
epmd: Fri Mar  8 15:11:09 2019: time in seconds: 1552057869
epmd: Fri Mar  8 15:11:14 2019: time in seconds: 1552057874
epmd: Fri Mar  8 15:11:19 2019: time in seconds: 1552057879
epmd: Fri Mar  8 15:11:24 2019: time in seconds: 1552057884
epmd: Fri Mar  8 15:11:29 2019: time in seconds: 1552057889
epmd: Fri Mar  8 15:11:34 2019: time in seconds: 1552057894
epmd: Fri Mar  8 15:11:39 2019: time in seconds: 1552057899
epmd: Fri Mar  8 15:11:44 2019: time in seconds: 1552057904
epmd: Fri Mar  8 15:11:49 2019: time in seconds: 1552057909
epmd: Fri Mar  8 15:11:51 2019: time in seconds: 1552057911
epmd: Fri Mar  8 15:11:51 2019: unregistering 'somenode:2', port 59696
*****     reg calculated count  : 0
*****     old/unused name "somenode#2"
*****     unreg counter         : 1
*****     unreg calculated count: 1
epmd: Fri Mar  8 15:11:51 2019: closing connection on file descriptor 6
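
Reading the hex dump against the ALIVE2_REQ table: the first two bytes (00 15) are a length of 21, and the 21-byte request itself follows. Here is a small Node.js sketch that decodes the captured bytes. parseAlive2Req and parseAlive2Resp are hypothetical helper names, and the response layout (tag 121, result byte, 2-byte creation) is taken from the epmd specification as I understand it.

```javascript
// Bytes copied from the two hex dumps in the epmd debug output.
const request = Buffer.from(
  '001578e9304d00000500050008736f6d' + '656e6f64650000', 'hex');
const response = Buffer.from('79000002', 'hex');

// Walks the ALIVE2_REQ fields in order, including the 2-byte length prefix.
function parseAlive2Req(buf) {
  let off = 0;
  const length = buf.readUInt16BE(off); off += 2;    // size of what follows
  const tag = buf.readUInt8(off); off += 1;          // 120 = ALIVE2_REQ
  const port = buf.readUInt16BE(off); off += 2;
  const nodeType = buf.readUInt8(off); off += 1;
  const protocol = buf.readUInt8(off); off += 1;
  const highest = buf.readUInt16BE(off); off += 2;
  const lowest = buf.readUInt16BE(off); off += 2;
  const nlen = buf.readUInt16BE(off); off += 2;
  const name = buf.toString('latin1', off, off + nlen); off += nlen;
  const elen = buf.readUInt16BE(off); off += 2;
  const extra = buf.subarray(off, off + elen);
  return { length, tag, port, nodeType, protocol, highest, lowest, name, extra };
}

function parseAlive2Resp(buf) {
  return {
    tag: buf.readUInt8(0),         // 121 = ALIVE2_RESP
    result: buf.readUInt8(1),      // 0 means the registration succeeded
    creation: buf.readUInt16BE(2),
  };
}

console.log(parseAlive2Req(request));
console.log(parseAlive2Resp(response));
```

The decoded values match what epmd logged (port 59696, type 77, proto 0, versions 5 and 5, name "somenode"). The log also shows the name being unregistered when the connection goes away, so a client presumably has to keep the socket open for as long as it wants to stay registered.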

There are multiple implementations of epmd: the upstream source, a couple in Go and Rust, and also my favourite, this spoofing one in Erlang: https://github.com/msantos/spoofed

The other thing is the Erlang distribution protocol itself…

To establish a connection between 2 nodes, you need to know the other node’s address. You can obtain this via epmd lookup, or if you are sneaky, you can pre-calculate these and then you don’t need epmd at all. See https://www.erlang-solutions.com/blog/erlang-and-elixir-distribution-without-epmd.html for more details. The first paragraph saves me writing most of this, so go read it.

distribution

As you read the post, now spin up 2 nodes.

$ iex --name ding --erl '-kernel inet_dist_listen_min 4370 inet_dist_listen_max 4370'
Erlang/OTP 21 [erts-10.2.4] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe] [dtrace]
Interactive Elixir (1.8.1) - press Ctrl+C to exit (type h() ENTER for help)

$  iex --name dong --erl '-kernel inet_dist_listen_min 4371 inet_dist_listen_max 4371'
Erlang/OTP 21 [erts-10.2.4] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe] [dtrace]

Interactive Elixir (1.8.1) - press Ctrl+C to exit (type h() ENTER for help)

Now in one node you can run Node.ping :"dong@... and then run Node.list() in the other. The reason this is useful is that you can also use Wireshark on port 4370 to see the traffic (exercise left to the reader). If your system is multi-homed, you may need to use inet_dist_use_interface {127,0,0,1} to restrict that to a specific network interface.

MassiveFermion · March 9, 2019, 4:51am

Wireshark doesn’t capture anything when I send messages between Erlang nodes!
I tried monitoring port 4370 with your configuration for the node. Then I tried 4369 for two ordinary nodes. Then I started monitoring with no packet filter but with the display filter set to erldp. But Wireshark didn’t capture any packets at all.

tty · March 9, 2019, 9:56am

When a BEAM node starts up in distributed mode, it checks whether epmd is running and starts it if it has to. It then registers the port it is listening on with epmd. Unless specified on the command line, the port is randomly assigned by the OS. epmd stores a list of nodename/port mappings, similar to DNS.

epmd is a registered IANA service (see /etc/services) and uses port 4369.

On the same server, two or more distributed BEAM nodes are registered with the same local epmd. When Node A wants to communicate with Node B, it queries epmd for the port Node B is listening on (e.g. PortB). With this, Node A communicates directly with Node B via PortB.

If Node A is communicating with Node C on a different server, Node A queries the epmd on that remote server via port 4369 for the port Node C is listening on (e.g. PortC); the local epmd is not involved in this lookup. With this, Node A communicates directly with Node C via PortC.
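
The lookup step described above can be sketched in Node.js. This assumes the PORT_PLEASE2_REQ (tag 122) and PORT2_RESP (tag 119) layouts from the epmd specification. buildPortPlease2Req, parsePort2Resp, and lookupPort are hypothetical helper names, and the network part (lookupPort) is untested.

```javascript
const net = require('net');

// Request: 2-byte length, tag 122, then the node name without the host part.
function buildPortPlease2Req(nodeName) {
  const name = Buffer.from(nodeName, 'latin1');
  const buf = Buffer.alloc(3 + name.length);
  buf.writeUInt16BE(1 + name.length, 0); // length of tag + name
  buf.writeUInt8(122, 2);                // PORT_PLEASE2_REQ
  name.copy(buf, 3);
  return buf;
}

// Response: tag 119, a result byte, and on success (result 0) the same
// fields the node registered with.
function parsePort2Resp(buf) {
  if (buf.length < 2 || buf.readUInt8(0) !== 119) {
    throw new Error('not a PORT2_RESP');
  }
  const result = buf.readUInt8(1);
  if (result !== 0) return { result };   // name not registered
  const nlen = buf.readUInt16BE(10);
  return {
    result,
    port: buf.readUInt16BE(2),
    nodeType: buf.readUInt8(4),
    protocol: buf.readUInt8(5),
    highest: buf.readUInt16BE(6),
    lowest: buf.readUInt16BE(8),
    name: buf.toString('latin1', 12, 12 + nlen),
  };
}

// Untested sketch: ask the epmd on `host` for the port of `nodeName`.
// Assumes epmd closes the connection after sending its reply.
function lookupPort(nodeName, host = 'localhost') {
  return new Promise((resolve, reject) => {
    const chunks = [];
    const sock = net.connect(4369, host, () => sock.write(buildPortPlease2Req(nodeName)));
    sock.on('data', (chunk) => chunks.push(chunk));
    sock.on('end', () => {
      try { resolve(parsePort2Resp(Buffer.concat(chunks))); } catch (e) { reject(e); }
    });
    sock.on('error', reject);
  });
}
```

With a node registered as "somenode", something like lookupPort('somenode').then(console.log) should print the port that node is listening on, which is then the port to connect to for the distribution handshake.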