So I continued to use the library.
TL;DR: it crashes almost everywhere if the network is lost (because of Wind's lack of error handling).
Overall it seems to work fine, except when the network is lost.
I spelunked a bit and arrived at this understanding:
The underlying Mint library raises, but the Wind wrapper does not handle it properly.
In Wind's client.ex (module Wind.Client), handle_info/2 receives a damaged state (disconnected or closed) and fails to handle the error on this line:
{:ok, conn, websocket, data} = Wind.decode(conn, ref, websocket, message)
(It was quite hard for me to debug: at first I couldn't find a simple, working way to decompile Elixir .beam files to find the exact failing line, with all the macros and __using__ all over the place.)
If we go to the decode/4 function of the Wind module, we can see another case of missing error handling:
with {:ok, conn, [{:data, ^ref, data}]} <- Mint.WebSocket.stream(conn, message), ... do ...
(there is no else clause, and this is actually the first place that fails)
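A minimal sketch of what such an else clause could look like. Everything here is hypothetical: FakeStream stands in for Mint.WebSocket.stream/2 so the snippet runs without a network, and DecodeSketch is not Wind's actual code. It assumes the stream call returns {:error, conn, reason, responses} on failure and :unknown for unrelated messages, which is how I understand Mint's documentation.

```elixir
defmodule FakeStream do
  # Hypothetical stand-in for Mint.WebSocket.stream/2, so the sketch runs offline.
  def stream(conn, {:tcp, _socket, data}), do: {:ok, conn, [{:data, conn.ref, data}]}
  def stream(conn, {:tcp_closed, _socket}), do: {:error, %{conn | open?: false}, :closed, []}
  def stream(_conn, _other), do: :unknown
end

defmodule DecodeSketch do
  # Returns a tagged tuple for every outcome instead of crashing on a failed match.
  def decode(conn, ref, message) do
    with {:ok, conn, [{:data, ^ref, data}]} <- FakeStream.stream(conn, message) do
      {:ok, conn, data}
    else
      # The connection reported an error (for example, the socket was closed).
      {:error, conn, reason, _responses} -> {:error, conn, reason}
      # The message did not belong to this connection.
      :unknown -> {:error, conn, :unknown_message}
      # The call succeeded but the responses had an unexpected shape.
      {:ok, conn, other} -> {:error, conn, {:unexpected_responses, other}}
    end
  end
end

conn = %{ref: make_ref(), open?: true}
IO.inspect(DecodeSketch.decode(conn, conn.ref, {:tcp_closed, :socket}))
```

With this shape, the caller (handle_info/2 in this case) can match on {:error, conn, reason} and decide what to do, rather than dying on a MatchError.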
Besides, when this function crashes (Wind.decode/4 first, but if we fix it, Wind.Client.handle_info/2 will also crash), the associated GenServer restarts and attempts to open a new connection through handle_continue/2 in the Wind.Client module.
Of course the network is still down, so this fails too (the infamous :nxdomain), and we end up returning {:stop, {:error, conn, reason}, state} to the GenServer.
Failing could be fine (I would personally have preferred retries spaced further and further apart, trying again after n*i seconds with i growing over time), but the problem is that when the GenServer stops, it crashes the whole application.
Maybe you could do something here to avoid this?
I’m not familiar with all of GenServer's communication and event handling (maybe using Process.send_after(self(), ... is a way), so I could not propose a PR in a reasonable time.
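For what it's worth, here is a rough sketch of the retry idea built on Process.send_after/3. It is not Wind's code: ReconnectSketch, the :connect_fun option, and the base_ms and max_ms values are all made up for illustration, and the connect function is injected so the snippet runs without a real socket. The delay grows linearly with the attempt number (base_ms * attempt, capped at max_ms), and the GenServer stays alive instead of returning {:stop, ...}.

```elixir
defmodule ReconnectSketch do
  use GenServer

  # opts (all hypothetical):
  #   :connect_fun - zero-arity function returning {:ok, conn} | {:error, reason}
  #   :base_ms     - delay unit in milliseconds
  #   :max_ms      - upper bound on the delay
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  def status(pid), do: GenServer.call(pid, :status)

  # Pure helper: delay before the given attempt, growing linearly and capped.
  def delay(attempt, base_ms, max_ms), do: min(base_ms * attempt, max_ms)

  @impl true
  def init(opts) do
    state = %{
      connect_fun: Keyword.fetch!(opts, :connect_fun),
      base_ms: Keyword.get(opts, :base_ms, 1_000),
      max_ms: Keyword.get(opts, :max_ms, 30_000),
      attempt: 0,
      conn: nil
    }

    {:ok, state, {:continue, :connect}}
  end

  @impl true
  def handle_continue(:connect, state), do: {:noreply, try_connect(state)}

  @impl true
  def handle_info(:reconnect, state), do: {:noreply, try_connect(state)}

  @impl true
  def handle_call(:status, _from, state) do
    {:reply, if(state.conn, do: :connected, else: {:retrying, state.attempt}), state}
  end

  defp try_connect(state) do
    case state.connect_fun.() do
      {:ok, conn} ->
        # Connected: reset the attempt counter.
        %{state | conn: conn, attempt: 0}

      {:error, _reason} ->
        attempt = state.attempt + 1
        # Schedule another try instead of stopping the GenServer.
        Process.send_after(self(), :reconnect, delay(attempt, state.base_ms, state.max_ms))
        %{state | conn: nil, attempt: attempt}
    end
  end
end

# Simulate a network that is down: every connection attempt fails.
{:ok, pid} = ReconnectSketch.start_link(connect_fun: fn -> {:error, :nxdomain} end, base_ms: 10)
IO.inspect(ReconnectSketch.status(pid))
```

Because the process never stops on a failed connection, nothing has to restart it while the network is down. It simply keeps retrying at a slower and slower pace until the connect function succeeds.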
The bug is easy to reproduce: just switch off your Wi-Fi, or close your laptop for a minute.