TCP-based library design guidelines/best practices?

So for my current project I have created a couple of small TCP-based libraries: one for controlling a specific brand of IP-based powerbars, and one implementing a very small subset of the PJLink projector control protocol.

Both of those libraries are built around a TCP connection. A socket is opened, statuses are queried, commands are sent and so on. Eventually the connection is closed, either by the remote end (timeout) or explicitly by the client.

I see a couple of overarching issues that are common with this kind of situation:

  • The library depends explicitly on the :gen_tcp module. This makes testing hard. OTOH, I don’t want to mock out all the arcane invocations of :gen_tcp.connect/3, because some of the options are protocol-specific and some, like the timeouts, are best left to the caller. So some kind of wrapper for this kind of IO seems to be in order.
  • Sometimes the user of the library would want to do a one-off command (fire and forget), sometimes they would like to keep the connection open for a batch of query/command cycles. This means that the underlying TCP connection must remain open and be passed to the caller inside some struct or some opaque object.
  • Sometimes the calls can be synchronous, sometimes the caller would need to make them async. Do I build this into the library or provide a sync version and let the caller make it async according to their needs?
  • Some parameters like connection timeouts etc must be configurable. How do I manage this without making my function signatures accept a billion keyword arguments?
  • What if the underlying TCP connection is closed and the caller tries to send a command? Should we try to reconnect or let the :gen_tcp error/exception bubble up?
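
To make the first and fourth points concrete, here is a minimal sketch of one possible wrapper. Everything in it is hypothetical and made up for illustration (the `Transport` behaviour, the `Powerbar` module, the option names and default values); only the `:gen_tcp` calls are real. The idea is that the protocol-specific socket options stay inside the library, caller-tunable values travel in a single keyword list merged over defaults, and tests swap in a fake transport.

```elixir
defmodule Transport do
  @moduledoc "Hypothetical behaviour hiding :gen_tcp behind four callbacks."
  @callback connect(host :: term(), port :: :inet.port_number(), opts :: keyword()) ::
              {:ok, term()} | {:error, term()}
  @callback send_data(conn :: term(), data :: iodata()) :: :ok | {:error, term()}
  @callback recv_data(conn :: term(), timeout()) :: {:ok, binary()} | {:error, term()}
  @callback close(conn :: term()) :: :ok
end

defmodule Transport.Tcp do
  @behaviour Transport
  # Protocol-specific socket options live here, not in the caller's code.
  @socket_opts [:binary, active: false]

  @impl true
  def connect(host, port, opts) do
    :gen_tcp.connect(host, port, @socket_opts, Keyword.get(opts, :connect_timeout, 5_000))
  end

  @impl true
  def send_data(sock, data), do: :gen_tcp.send(sock, data)

  @impl true
  def recv_data(sock, timeout), do: :gen_tcp.recv(sock, 0, timeout)

  @impl true
  def close(sock), do: :gen_tcp.close(sock)
end

defmodule FakeTransport do
  @behaviour Transport
  # Test double: no socket at all, every command gets the same canned reply.
  @impl true
  def connect(_host, _port, _opts), do: {:ok, :fake}

  @impl true
  def send_data(:fake, _data), do: :ok

  @impl true
  def recv_data(:fake, _timeout), do: {:ok, "OK\r"}

  @impl true
  def close(:fake), do: :ok
end

defmodule Powerbar do
  # Assumed defaults; callers override only what they care about.
  @defaults [transport: Transport.Tcp, connect_timeout: 5_000, recv_timeout: 2_000]
  defstruct [:transport, :conn, :opts]

  def connect(host, port, opts \\ []) do
    opts = Keyword.merge(@defaults, opts)
    transport = Keyword.fetch!(opts, :transport)

    with {:ok, conn} <- transport.connect(host, port, opts) do
      {:ok, %__MODULE__{transport: transport, conn: conn, opts: opts}}
    end
  end

  # Synchronous request/reply; errors from the transport are returned as-is.
  def command(%__MODULE__{transport: t, conn: conn, opts: opts}, data) do
    with :ok <- t.send_data(conn, data) do
      t.recv_data(conn, Keyword.fetch!(opts, :recv_timeout))
    end
  end

  def close(%__MODULE__{transport: t, conn: conn}), do: t.close(conn)
end
```

With this shape a test would call `Powerbar.connect(~c"ignored", 0, transport: FakeTransport)`, while real code calls `Powerbar.connect(host, port, recv_timeout: 500)`. The struct is the opaque handle from the second point, and wrapping `command/2` in a `Task` is one way to leave the sync/async choice to the caller.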

I’ve seen various approaches to this, sometimes a full-blown OTP application is started to encapsulate all this state and be able to monitor connection behaviours, and the opaque object you receive when connecting is just a PID that you pass to the various functions.

I just wonder if there are some idiomatic Elixir guidelines for this sort of thing. I’ve read @michalmuskala’s very informative articles about error handling and configuration, and I wonder if there are similar articles out there that discuss library design issues.

Thanks!

When making something interact over a custom TCP protocol like that, I often write a GenServer that emulates the remote device first; you can then test against that. :slight_smile:
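
A rough sketch of what such an emulator could look like, assuming a simple protocol where each request gets one canned reply. `FakeDevice`, the replies map and the `"ERR\r"` fallback are all made up for illustration, and it assumes each small request arrives in a single `recv`, which is usually true on loopback but not guaranteed by TCP.

```elixir
defmodule FakeDevice do
  @moduledoc "Hypothetical device emulator: replies to known requests from a map."
  use GenServer

  def start_link(replies), do: GenServer.start_link(__MODULE__, replies)
  def port(pid), do: GenServer.call(pid, :port)

  @impl true
  def init(replies) do
    # Port 0 asks the OS for any free port, so several tests can run at once.
    {:ok, listen} = :gen_tcp.listen(0, [:binary, active: false, reuseaddr: true])
    {:ok, port} = :inet.port(listen)
    # Accept in a linked process so the GenServer itself stays responsive.
    spawn_link(fn -> accept_loop(listen, replies) end)
    {:ok, %{listen: listen, port: port}}
  end

  @impl true
  def handle_call(:port, _from, state), do: {:reply, state.port, state}

  # Serves one client at a time, which is enough for most tests.
  defp accept_loop(listen, replies) do
    case :gen_tcp.accept(listen) do
      {:ok, sock} ->
        serve(sock, replies)
        accept_loop(listen, replies)

      {:error, _reason} ->
        :ok
    end
  end

  defp serve(sock, replies) do
    case :gen_tcp.recv(sock, 0) do
      {:ok, request} ->
        :gen_tcp.send(sock, Map.get(replies, request, "ERR\r"))
        serve(sock, replies)

      {:error, _closed} ->
        :gen_tcp.close(sock)
    end
  end
end
```

In a test you would start it with something like `FakeDevice.start_link(%{"%1POWR ?\r" => "%1POWR=1\r"})`, ask it for its port with `FakeDevice.port/1`, and point the library under test at `{127, 0, 0, 1}` on that port.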

However, you probably want a TCP replayer. I thought there was one for the BEAM somewhere…

I would consider the connection library - https://github.com/fishcakez/connection. It powers db_connection and ultimately most of the ecto adapters out there. It is built around the idea that a connection process doesn’t need to be always connected, and that being disconnected is a “normal” and acceptable state for a connection process.
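
To illustrate just the idea (this is not the connection library's API, only a plain GenServer sketch with made-up names like `LazyConn` and an arbitrary fixed retry interval): the process starts without a socket, answers `{:error, :disconnected}` while it has none, and schedules its own reconnect attempts.

```elixir
defmodule LazyConn do
  @moduledoc "Hypothetical connection process where 'disconnected' is an ordinary state."
  use GenServer
  @retry_ms 1_000

  def start_link(host, port), do: GenServer.start_link(__MODULE__, {host, port})
  def request(pid, data), do: GenServer.call(pid, {:request, data})

  @impl true
  def init({host, port}) do
    # Never block or crash in init/1; the first connect attempt happens right after.
    {:ok, %{host: host, port: port, sock: nil}, {:continue, :connect}}
  end

  @impl true
  def handle_continue(:connect, state), do: {:noreply, try_connect(state)}

  @impl true
  def handle_info(:reconnect, %{sock: nil} = state), do: {:noreply, try_connect(state)}
  def handle_info(:reconnect, state), do: {:noreply, state}

  @impl true
  def handle_call({:request, _data}, _from, %{sock: nil} = state) do
    # The caller gets a plain error tuple and decides whether to retry.
    {:reply, {:error, :disconnected}, state}
  end

  def handle_call({:request, data}, _from, %{sock: sock} = state) do
    with :ok <- :gen_tcp.send(sock, data),
         {:ok, reply} <- :gen_tcp.recv(sock, 0, 2_000) do
      {:reply, {:ok, reply}, state}
    else
      {:error, reason} ->
        # Any I/O error drops the socket and falls back to the disconnected state.
        :gen_tcp.close(sock)
        Process.send_after(self(), :reconnect, @retry_ms)
        {:reply, {:error, reason}, %{state | sock: nil}}
    end
  end

  defp try_connect(state) do
    case :gen_tcp.connect(state.host, state.port, [:binary, active: false], 1_000) do
      {:ok, sock} ->
        %{state | sock: sock}

      {:error, _reason} ->
        Process.send_after(self(), :reconnect, @retry_ms)
        state
    end
  end
end
```

The connection library packages this pattern up with proper backoff handling, so the sketch is mainly useful for seeing why the reconnect question from the original post becomes a state of the process rather than an exception.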

The more I think about it, the more I can discern a few distinct elements that might be useful to be separated:

  1. The wire protocol - taking care of converting raw bytes to and from native elixir types.
  2. The pipe - send and receive data through this.
  3. The state machine - A is sent, reply A’ is expected. When B is received, we go into state S, when C is received, go into state S’ and so on.

I’m not 100% certain if a clean separation can be had between all 3, but I’m fairly certain that the wire protocol can just be a bunch of functions (no state), the pipe could be any GenServer or equivalent with connect/disconnect/send/recv functions, and the meat of the state machine could be implemented in gen_statem?
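
As a sketch of the "bunch of functions" part, here is roughly what a stateless wire module could look like for a simplified slice of PJLink class 1, where commands look like `%1POWR ?` and replies like `%1POWR=OK`, each terminated by a carriage return. The module name and the tuple shapes are my own invention, and authentication and most of the spec are left out.

```elixir
defmodule PJLink.Wire do
  @moduledoc "Stateless encode/decode for a simplified PJLink class 1 subset (illustrative)."

  # A "?" parameter turns the command into a query.
  def encode({:query, cmd}), do: "%1#{cmd} ?\r"
  def encode({:set, cmd, param}), do: "%1#{cmd} #{param}\r"

  # Replies echo the 4-letter command, then "=", then OK, ERRn or a value.
  def decode(<<"%1", cmd::binary-size(4), "=", rest::binary>>) do
    case String.trim_trailing(rest, "\r") do
      "OK" -> {:ok, {cmd, :ok}}
      "ERR" <> code -> {:error, {:device, cmd, code}}
      value -> {:ok, {cmd, value}}
    end
  end

  # Anything else is handed back untouched so the caller can log it.
  def decode(other), do: {:error, {:malformed, other}}
end
```

Because nothing here touches a socket or a process, it can be unit-tested with plain binaries, for example `PJLink.Wire.decode("%1POWR=ERR3\r")` gives `{:error, {:device, "POWR", "3"}}`.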

Again coming from Python land, Twisted has this nice separation - you create Protocol subclasses that are something like the state machine, and they don’t know anything about what pipe is used to send/receive the data (but they do access the pipe to send, and have their callbacks called when data arrives). It allows the creation of portable protocols that can be used over TCP, stdio, whatever.
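
A very rough Elixir take on that shape, using `:gen_statem` from OTP: the state machine is handed a function for writing bytes and is fed incoming bytes through a cast, so it never sees the socket. `ProtocolFsm`, `feed/2` and the two states are hypothetical names, and the sketch assumes every `feed/2` call delivers exactly one complete reply.

```elixir
defmodule ProtocolFsm do
  @moduledoc "Hypothetical request/reply state machine that never touches a socket."
  @behaviour :gen_statem

  # send_fun is whatever writes bytes to the pipe (TCP, stdio, a test process).
  def start_link(send_fun), do: :gen_statem.start_link(__MODULE__, send_fun, [])
  def request(pid, bytes), do: :gen_statem.call(pid, {:request, bytes})
  # The owner of the pipe calls this whenever bytes arrive.
  def feed(pid, bytes), do: :gen_statem.cast(pid, {:data, bytes})

  @impl true
  def callback_mode, do: :state_functions

  @impl true
  def init(send_fun), do: {:ok, :idle, %{send: send_fun, from: nil}}

  # State :idle, A is sent and reply A' is now expected.
  def idle({:call, from}, {:request, bytes}, data) do
    data.send.(bytes)
    {:next_state, :awaiting_reply, %{data | from: from}}
  end

  # Unsolicited data while idle is dropped in this sketch.
  def idle(:cast, {:data, _unsolicited}, data), do: {:keep_state, data}

  # State :awaiting_reply, the next chunk of data answers the pending caller.
  def awaiting_reply(:cast, {:data, bytes}, %{from: from} = data) do
    {:next_state, :idle, %{data | from: nil}, [{:reply, from, {:ok, bytes}}]}
  end

  # A second request while one is in flight waits until we are idle again.
  def awaiting_reply({:call, _from}, {:request, _bytes}, _data) do
    {:keep_state_and_data, [:postpone]}
  end
end
```

In a test the "pipe" can simply be the test process: pass `fn bytes -> send(test_pid, {:sent, bytes}) end` as `send_fun`, call `request/2` from a `Task`, and answer with `ProtocolFsm.feed(fsm, reply)`. Over TCP the owner of an active socket would forward `{:tcp, socket, bytes}` messages to `feed/2` instead.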
