No longer able to connect local Livebook to remote Fly.io node

Hi friends,

This topic has come up a handful of times before on the forum, but I wanted to solicit some advice on how to debug an attached node connection issue I’ve started to experience with Livebook.

Context & setup

This setup was previously working. In fact, I posted on this forum to share how I got it set up and the script I was using to start the Livebook server with the correct flags. Sometime in the last couple of weeks, this setup has stopped working. I’ll describe exactly what I’m seeing a bit further down.

To get it out of the way, here are the versions of everything I’m using:

  • Latest Livebook - 0.8.1 (though I also tried 0.7.2 and 0.7.1)
  • Latest Elixir/OTP - 1.14.3 / 25.2.1 (using same versions both locally and on server)

Other bits and bobs:

  • The connection goes through a WireGuard tunnel between me and the server, as described in this article on Fly.io.
  • The node name and cookie are being set as expected.
  • Livebook is being started with -proto_dist inet6_tcp.
  • I am able to connect to the node through iex - more on that below.

Description of issue

After launching the Livebook server, opening a notebook, and attempting to connect to the attached node, Livebook hangs until the terminal eventually reports this error:

12:35:20.007 [error] GenServer #PID<0.592.0> terminating
** (CaseClauseError) no case clause matching: {:badrpc, :nodedown}
    (livebook 0.8.1) lib/livebook/runtime/erl_dist.ex:74: anonymous fn/3 in Livebook.Runtime.ErlDist.load_required_modules/1
    (elixir 1.14.3) lib/enum.ex:2468: Enum."-reduce/3-lists^foldl/2-0-"/3
    (livebook 0.8.1) lib/livebook/runtime/erl_dist.ex:71: Livebook.Runtime.ErlDist.load_required_modules/1
    (livebook 0.8.1) lib/livebook/runtime/erl_dist.ex:60: Livebook.Runtime.ErlDist.initialize/2
    (livebook 0.8.1) lib/livebook/runtime/attached.ex:45: Livebook.Runtime.Attached.connect/1
    (livebook 0.8.1) lib/livebook/session.ex:1519: Livebook.Session.handle_action/2
    (elixir 1.14.3) lib/enum.ex:2468: Enum."-reduce/3-lists^foldl/2-0-"/3
    (livebook 0.8.1) lib/livebook/session.ex:878: Livebook.Session.handle_cast/2
Last message: {:"$gen_cast", {:queue_cell_evaluation, #PID<0.594.0>, "setup"}}

Critically, I am able to connect to the node manually through IEx. I can start a local iex using the following command, connect to the remote node, and run some code through Node.spawn(server, ...).

$ ERL_AFLAGS="-proto_dist inet6_tcp" iex --name local@127.0.0.1 --cookie my-shared-cookie
iex> server = :"my-server-node-name@some:ipv6:address"

iex> Node.connect(server)
true

# Test module that prints "hello"
iex> Node.spawn(server, Hello, :world, [])
hello
#PID<...>

I’m not really sure where to go from here. Any suggestions would be greatly appreciated!

To close the loop on this:

It had nothing to do with Livebook! An upgrade in WSL2 caused the MTU of the eth0 network interface to be set to 1280, which was apparently too small once WireGuard's per-packet overhead was added on top. Normal network requests were fine, but requests "wrapped" by WireGuard would fail unless they fit in a single packet.
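
For anyone who lands here with the same symptom, the arithmetic behind this can be sketched roughly as below. The overhead figures are the commonly cited WireGuard numbers (16-byte data header, 16-byte authentication tag, 8-byte UDP header, plus a 20- or 40-byte outer IP header), not something measured on this setup, and the function names are made up for illustration.

```shell
#!/bin/sh
# Per-packet overhead WireGuard adds, in bytes, for a given outer IP version.
# 32 = WireGuard data header (16) + authentication tag (16); 8 = UDP header.
wg_overhead() {
  case "$1" in
    4) echo $((20 + 8 + 32)) ;;   # outer IPv4 header is 20 bytes
    6) echo $((40 + 8 + 32)) ;;   # outer IPv6 header is 40 bytes
    *) echo "usage: wg_overhead 4|6" >&2; return 1 ;;
  esac
}

# Largest inner (tunnelled) packet that fits in one outer packet
# on a link whose MTU is $1, with outer IP version $2.
max_inner_packet() {
  overhead=$(wg_overhead "$2") || return 1
  echo $(($1 - overhead))
}

max_inner_packet 1500 4   # prints 1440
max_inner_packet 1280 4   # prints 1220
```

With a link MTU of 1280, at most 1220 bytes are left for the tunnelled packet. Since the traffic inside the tunnel here is IPv6, which expects every link to carry packets of at least 1280 bytes, that would plausibly explain why small exchanges (like the IEx test) worked while larger transfers (like Livebook loading its modules onto the node) stalled.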

Fixed by the following:

$ sudo ip link set dev eth0 mtu 1500
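
A change made with ip link set is not persistent, so it will likely need reapplying after WSL restarts. A minimal sketch of a check that could run at shell startup is below. It reads the MTU from the standard sysfs path; the SYSFS_NET variable and both function names are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# Print the MTU of interface $1, read from sysfs.
# SYSFS_NET is a hypothetical override of the sysfs directory (useful for testing).
iface_mtu() {
  cat "${SYSFS_NET:-/sys/class/net}/$1/mtu"
}

# Report whether interface $1 has an MTU of at least $2 bytes.
# Returns 0 if it does, 1 if it is smaller, 2 if the MTU could not be read.
check_mtu() {
  if current=$(iface_mtu "$1" 2>/dev/null); then
    if [ "$current" -ge "$2" ]; then
      echo "$1: MTU $current is fine"
    else
      echo "$1: MTU $current is below $2"
      return 1
    fi
  else
    echo "$1: could not read MTU" >&2
    return 2
  fi
}
```

It could then be used as check_mtu eth0 1500, followed by the ip link command above only when it reports a smaller value.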