I am currently trying to send a network connection (a socket) to another distributed node, but when I do, the socket closes.
The socket is created via :gen_tcp.
I know this may be a hard problem to solve; I just figured it was a common task and expected an easy solution.
```elixir
@spec delegate_socket(socket :: :gen_tcp.socket()) :: nil
defp delegate_socket(socket) do
  delegate_node = Pool.get_node()
  IO.puts "Sending socket to delegate node #{delegate_node}"
  IO.puts "#{Node.ping delegate_node}"

  pid = Node.spawn(delegate_node, fn ->
    IO.puts "running on node"

    receive do
      socket ->
        Server.Tcp.Handler.handle(socket)
    end
  end)

  send pid, socket
  # :ok = :gen_tcp.controlling_process socket, pid
end
```
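As an aside, the commented-out `:gen_tcp.controlling_process/2` line is on the right track for handing a socket to another *local* process: the acceptor must transfer ownership before the worker uses the socket, and the socket is closed when its owning process dies, which may be part of why it closes here. A minimal sketch of the same-node handoff (module and function names are mine; it assumes the listen socket was opened with `active: false`):

```elixir
defmodule AcceptorSketch do
  # Hands each accepted socket to a fresh worker process on the SAME node.
  def accept_loop(listen_socket) do
    {:ok, socket} = :gen_tcp.accept(listen_socket)

    pid = spawn(fn ->
      receive do
        {:handoff, s} -> handle(s)
      end
    end)

    # Transfer ownership first, so the socket survives the acceptor moving
    # on; this only works for a pid on the local node.
    :ok = :gen_tcp.controlling_process(socket, pid)
    send(pid, {:handoff, socket})

    accept_loop(listen_socket)
  end

  # Toy handler: echo one message, then close.
  defp handle(socket) do
    {:ok, data} = :gen_tcp.recv(socket, 0)
    :gen_tcp.send(socket, data)
    :gen_tcp.close(socket)
  end
end
```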
The handle function:
```elixir
defmodule Server.Tcp.Handler do
  def handle(socket) do
    IO.puts "got in loop"

    socket
    |> read_line()
    |> write_line(socket)

    handle socket
  end

  defp read_line(socket) do
    {:ok, data} = :gen_tcp.recv(socket, 0)
    data
  end

  defp write_line(data, socket) do
    :gen_tcp.send(socket, data)
  end
end
```
If that’s just impossible, which is very reasonable, I would like to know a workaround, even if it could be very inefficient.
So how would an expert Elixir/Erlang programmer proceed here?
I can’t really imagine that someone would plug an Nginx or Traefik in front of their Elixir app to do the load balancing.
I am not sure how gen_tcp works under the hood, but if that code works, I would bet that the connection is still maintained by the original node where it was opened. So basically all in/out messages would have to pass through that first node.
:gen_tcp can use a port under the hood, which you can’t send to another node.
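This is easy to check: with the default inet backend, the socket value really is an Erlang port, and a port identifier is only usable on the node that created it:

```elixir
# With the default inet backend, a :gen_tcp socket is an Erlang port.
{:ok, listen} = :gen_tcp.listen(0, [])
true = is_port(listen)

# node/1 reports the node that owns the port; port identifiers only make
# sense on that node, so the socket cannot be handed to another node.
IO.inspect(node(listen))
```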
Take a look at Cowboy for a good example of using an acceptor pool, but note that it runs on a single node. If you wanted to do the heavy work on another node, you could use a loop handler that starts the work and waits for a reply.
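The "start the work elsewhere, wait for a reply" shape of a Cowboy 2.x loop handler looks roughly like this. This is only a sketch: `:"worker@host"` and `HeavyWork.run/0` are placeholders, and `:cowboy_req.reply/4` comes from the cowboy dependency:

```elixir
defmodule LoopHandlerSketch do
  # Cowboy 2.x loop handler: init/2 kicks off the work and returns
  # {:cowboy_loop, ...}; Cowboy then keeps this process alive until
  # info/3 stops it.
  def init(req, state) do
    me = self()
    # Placeholder: run the heavy work on another node and message back.
    Node.spawn(:"worker@host", fn -> send(me, {:result, HeavyWork.run()}) end)
    {:cowboy_loop, req, state}
  end

  # The result arrives as an ordinary message; reply and stop.
  def info({:result, body}, req, state) do
    req = :cowboy_req.reply(200, %{}, body, req)
    {:stop, req, state}
  end

  # Ignore anything else and keep looping.
  def info(_other, req, state), do: {:ok, req, state}
end
```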
Why? Those are both useful tools, and unless you have a specific need they don’t address you’re not going to gain much by trying to re-implement them by hand in the BEAM.
I don’t think that you can send a socket, because it is backed by an underlying OS socket, which cannot be sent over. But:
If you control the protocol, you could implement a redirect, or just “hang up” from nodes that are too busy and wait for the client to reconnect.
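Sketching the “hang up and redirect” idea with a hypothetical line-based protocol (the `REDIRECT` message, the load threshold, and all names are made up; the client has to know to reconnect to the advertised address):

```elixir
defmodule RedirectSketch do
  # If this node considers itself too busy, tell the client where to go
  # and hang up; otherwise keep the connection and serve it here.
  @threshold 0.8

  def maybe_redirect(socket, load, {host, port}) when load > @threshold do
    :ok = :gen_tcp.send(socket, "REDIRECT #{host}:#{port}\r\n")
    :ok = :gen_tcp.close(socket)
    :redirected
  end

  def maybe_redirect(_socket, _load, _target), do: :serve
end
```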
You usually receive data to do something with it. Reading a socket, doing basic framing/cleanup, and sending the result elsewhere for processing is very lightweight. So you may not really need to load-balance the accepting node at all, because the acceptor can be that cheap; the heavy work (databases, transcoding, whatever) goes to dedicated compute nodes.
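A sketch of that split: the process that owns the socket only reads and frames, and ships each frame to a worker pid, which can live on any connected node, since plain terms and pids (unlike sockets) travel between nodes just fine. All names are mine, and it assumes the socket was opened with `packet: :line` and `active: false`:

```elixir
defmodule OffloadSketch do
  # Pump loop on the node that owns the socket: read one framed line,
  # hand it to `worker` (a pid, possibly on another node), write the
  # reply back, repeat until the peer closes.
  def pump(socket, worker) do
    case :gen_tcp.recv(socket, 0) do
      {:ok, line} ->
        send(worker, {:frame, self(), line})

        receive do
          {:reply, out} -> :gen_tcp.send(socket, out)
        end

        pump(socket, worker)

      {:error, :closed} ->
        :ok
    end
  end
end
```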