Waiting after starting a C-Node

mmyers · May 28, 2020, 8:42pm

I’m exploring different ways of interfacing with C code, and in testing C-Nodes, my ExUnit tests were failing randomly. I start the C-Node process using Port.open. And I found that if I call Process.sleep(50) after opening the Port, then all the tests succeed.

Is this a good way to do it?
Is there a better way to wait for the C-Node to be ready to accept a connection?

ityonemo · May 29, 2020, 1:51am

Use :net_kernel.monitor_nodes/2, then launch the C-Node, then block on a receive until a nodeup message is delivered to your test; then continue with your tests.

mmyers · May 29, 2020, 3:23pm

I’m not sure :net_kernel.monitor_nodes will give me what I’m looking for. I want to know when it is safe to connect to the C-Node. And according to the Erlang docs:

"A nodeup message is delivered to all subscribing processes when a new node is connected"

Maybe just a retry with a timeout or limit on the number of retries will work.
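
A bounded retry along those lines might look like the sketch below. This is only an illustration, not code from my project: the module name RetryConnect, the default attempt count, and the delay are all made up, and the connect function is passed in so the loop can be tried without a real C-Node.

```elixir
defmodule RetryConnect do
  # Hypothetical helper (not from this thread): retry a connection a
  # limited number of times instead of sleeping for one fixed interval.
  # The defaults (20 attempts, 25 ms apart) are arbitrary assumptions.
  def connect(node, attempts \\ 20, delay_ms \\ 25, connect_fun \\ &Node.connect/1)

  # Out of attempts: give up.
  def connect(_node, 0, _delay_ms, _connect_fun), do: {:error, :timeout}

  def connect(node, attempts, delay_ms, connect_fun) when attempts > 0 do
    case connect_fun.(node) do
      true ->
        :ok

      # Node.connect/1 returns false on failure, or :ignored when the
      # local node is not alive; both are treated as "try again" here.
      _ ->
        Process.sleep(delay_ms)
        connect(node, attempts - 1, delay_ms, connect_fun)
    end
  end
end
```

In a test this would be called right after Port.open, for example RetryConnect.connect(:"cnode@localhost"), where the node name is a placeholder for whatever name the C-Node publishes.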

benwilson512 · May 29, 2020, 5:50pm

How would the C node be connected, but not ready to be connected to?

mmyers · May 29, 2020, 5:57pm

"How would the C node be connected, but not ready to be connected to?"

Exactly. The C-Node is not connected initially, which is why net_kernel.monitor_nodes won’t work.
There is a delay between starting the C-Node process and the C-Node having an open socket that is ready to receive a connection from an Elixir/Erlang node. That delay is what I want to account for with something better than a fixed sleep time.

benwilson512 · May 29, 2020, 5:59pm

Which is what the messaging solution provides. You call :net_kernel.monitor_nodes, and then use receive do with a pattern to wait for the nodeup message.

mmyers · May 29, 2020, 11:32pm

Thanks for your feedback. I have a solution.

The problem I was seeing in my tests occurred when trying to send a message before the C node was listening.

    Elixir                        C node        
-------------      ------------------------------------
  Port.open()  --->   OS process starts running C node
                                    |
  Send message ---> (X) not ready, Epmd connection fails
                                    |
                     (ei_listen, ei_publish, ei_accept)
  Send message --->    Epmd connection to C node (OK)

Comments from @benwilson512 and @ityonemo made me think, “What if the C node initiated the connection back to the Elixir node, instead of the other way around?” And maybe that’s how they were thinking it worked (or should have worked) in the first place. The initial code was just sample C node code found online, where the C node published itself to Epmd and listened, waiting for Elixir nodes to connect, as diagrammed above.

So I changed the C node to initiate the node connection, rather than just waiting for a connection:

    Elixir                     C node
--------------      --------------------------------
Port.open()    ---> OS process starts running C node
                                 |
connected (OK) <---        ei_connect()

Once I remembered that C nodes are hidden and used :net_kernel.monitor_nodes(true, node_type: :hidden), I did get a :nodeup message!
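
Roughly, the waiting side can be sketched as below. The module name CNodeWait, the function names, and the 5 second default timeout are placeholders I picked for illustration. The sketch relies on the documented behaviour that monitor_nodes/2 with an option list delivers three-element {:nodeup, node, info} messages.

```elixir
defmodule CNodeWait do
  # Hypothetical helper (names are illustrative, not from the thread).

  # Subscribe before starting the C node so the :nodeup cannot be missed.
  # C nodes are hidden nodes, hence node_type: :hidden.
  def subscribe do
    :net_kernel.monitor_nodes(true, node_type: :hidden)
  end

  # Block until the given node reports :nodeup, or give up after timeout_ms.
  # With monitor_nodes/2 the message has three elements: {:nodeup, node, info}.
  def await_nodeup(node, timeout_ms \\ 5_000) do
    receive do
      {:nodeup, ^node, _info} -> :ok
    after
      timeout_ms -> {:error, :timeout}
    end
  end
end
```

In a test setup the order would be CNodeWait.subscribe(), then Port.open to start the C node, then CNodeWait.await_nodeup(:"cnode@localhost"), where the node name stands in for whatever name the C node passes to ei_connect.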

This also simplified the code on the C node side: I can just call ei_connect, and don’t have to use ei_listen, ei_publish and ei_accept.

ityonemo · May 30, 2020, 12:18am

Shoot! I should have been more specific and reminded you that C nodes are hidden.