Practical difference between protocols and function pattern matching?

In my current project (my first Elixir project) I have different messages, each represented by its own struct, which defines the unique view of that message.

I then created a module to act as a handler for those messages, so I can call MyModule.handle(message, state). The actual implementation for the handle function uses pattern matching in the function definition so each function only handles the message type it knows (e.g. def handle(message = %MessageType1{}, state), do: ...).

It then occurred to me that this is the same functionality offered by protocols, and got me wondering if I should be using protocols for this scenario rather than function pattern matching.

The only advantage I could see with protocols is that external systems using a library that defines a protocol can extend it to their own custom structs. On the other hand, function pattern matching allows you to code fall-through cases.
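For reference, a minimal sketch of the pattern-matching approach with a fall-through clause. The struct names (`Ping`, `SetChunkSize`), the return tuples, and the state shape are placeholders assumed for illustration, not taken from my actual project:

```elixir
# Hypothetical message structs; names and fields are placeholders.
defmodule Ping do
  defstruct [:timestamp]
end

defmodule SetChunkSize do
  defstruct [:size]
end

defmodule Handler do
  # One clause per message type the handler knows about.
  def handle(%Ping{timestamp: ts}, state) do
    {:reply, {:pong, ts}, state}
  end

  def handle(%SetChunkSize{size: size}, state) do
    {:noreply, Map.put(state, :chunk_size, size)}
  end

  # Fall-through clause: any other message is ignored instead of
  # raising a FunctionClauseError.
  def handle(_unknown, state) do
    {:ignored, state}
  end
end
```

The last clause is the part a protocol cannot express as directly: it matches anything, including values that are not structs at all.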

Are there any other practical reasons why I should choose one or the other?

4 Likes

Protocols are sort of an extension to pattern-matching:

  • Normal pattern-matching only allows the person writing ModuleX to add function clauses to a certain function.
  • Protocols allow both the person writing ModuleX and anyone using ModuleX as a dependency to add function clauses to one (or multiple) of ModuleX’s functions.

Which protocol-function-clause is used depends on the type of the first argument passed to the function, so in that way this extra freedom is somewhat bounded (but I have yet to find a case where this does not provide enough flexibility).
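As a rough sketch of that extension point (the protocol name `Describable` and the struct `UserApp.Invoice` are made up for illustration), a library could define the protocol and one implementation, and a user of the library could add another for their own struct:

```elixir
# Hypothetical protocol that a library might define.
defprotocol Describable do
  @doc "Returns a short human-readable description of the value."
  def describe(value)
end

# Implementation shipped by the library for a built-in type.
defimpl Describable, for: Integer do
  def describe(n), do: "integer #{n}"
end

# A struct defined by a user of the library...
defmodule UserApp.Invoice do
  defstruct [:number]
end

# ...who adds a "clause" for it without touching the library code.
defimpl Describable, for: UserApp.Invoice do
  def describe(%{number: n}), do: "invoice no. #{n}"
end
```

Calling `Describable.describe/1` picks the implementation from the type of its first argument, which is the bound mentioned above.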

A third thing you might want to compare with pattern-matching/Protocols is Behaviours: Rather than swapping what kind of data-type is passed to a function, you are swapping what module a certain function is called on. Behaviours are often used on modules that define a number of complex functions (often on a module that is run as a separate process/GenServer). These basically list a number of functions (including their typespecs) that a module implementing the behaviour has to define. If a module declares a Behaviour but does not define all of its callbacks, a compile-time warning is shown.
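A small sketch of what that looks like, assuming a hypothetical `MessageHandler` behaviour and an `Ack` struct invented for this example:

```elixir
# Hypothetical behaviour: lists the callbacks (with typespecs)
# that an implementing module has to define.
defmodule MessageHandler do
  @callback handle(message :: struct(), state :: map()) :: {:ok, map()}
end

defmodule Ack do
  @behaviour MessageHandler
  defstruct [:sequence]

  # @impl marks this function as a callback implementation; leaving
  # handle/2 out of this module would produce a compile-time warning.
  @impl MessageHandler
  def handle(%__MODULE__{sequence: seq}, state) do
    {:ok, Map.put(state, :last_ack, seq)}
  end
end
```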

2 Likes

As a rough rule of thumb, pattern matching in function heads is going to be faster than Protocols.

Protocols allow you to do some very nifty things, but unless you’re exposing an interface to the larger world, they probably shouldn’t be the first thing you reach for. If you want to apply the same “function” to many different data types, then Protocols work really well. In particular, if you need to do things to nested data types, you can use recursion and protocols to do powerful transformations with minimal code.
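To illustrate the nested-data point, here is a toy sketch. The protocol name `LeafCount` is invented for this example, and it only covers lists, maps, and a few scalar types:

```elixir
# Hypothetical protocol: counts the leaf values in nested data.
defprotocol LeafCount do
  @doc "Counts leaf values in a possibly nested structure."
  def leaves(data)
end

# Containers recurse back through the protocol for each element.
defimpl LeafCount, for: List do
  def leaves(list), do: list |> Enum.map(&LeafCount.leaves/1) |> Enum.sum()
end

defimpl LeafCount, for: Map do
  def leaves(map) do
    map |> Map.values() |> Enum.map(&LeafCount.leaves/1) |> Enum.sum()
  end
end

# Scalars are the base case of the recursion.
defimpl LeafCount, for: [Integer, BitString, Atom] do
  def leaves(_scalar), do: 1
end
```

Each implementation only knows about its own type; the recursion through `LeafCount.leaves/1` handles arbitrary nesting.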

7 Likes

A third thing you might want to compare with pattern-matching/Protocols is Behaviours: Rather than swapping what kind of data-type is passed to a function, you are swapping what module a certain function is called on.

I’m pretty surprised I didn’t think to use behaviours for this, actually. I am already using behaviours for serializing and deserializing messages, so it makes perfect sense that all message-specific handling logic would go with the message itself, instead of creating a module whose sole purpose is to try to handle every single message type. I think I wasn’t thinking about behaviours because of TDD, as it was encouraging me to just write a bunch of tests against a single module’s function and check the result.

Behaviours are also interesting because in theory I should be able to do:

struct_module = Map.get(message, :__struct__)
struct_module.handle(message, state)

That also gives me the benefit of code locality and discoverability that I am wary of with protocols.
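A minimal sketch of that dispatch idea, with hypothetical names (`Dispatch`, `Dispatch.Handler`, `Dispatch.Connect`) and an assumed `{:ok, state}` return shape. Matching on `%module{}` binds the struct's module, which is the same value stored under `:__struct__`:

```elixir
# Hypothetical behaviour that every message module implements.
defmodule Dispatch.Handler do
  @callback handle(message :: struct(), state :: map()) :: {:ok, map()}
end

# The handling logic lives next to the struct it belongs to.
defmodule Dispatch.Connect do
  @behaviour Dispatch.Handler
  defstruct [:app]

  @impl true
  def handle(%__MODULE__{app: app}, state), do: {:ok, Map.put(state, :app, app)}
end

defmodule Dispatch do
  # %module{} binds the struct's module name, so the call is routed
  # to whichever module defines the message.
  def dispatch(%module{} = message, state) do
    module.handle(message, state)
  end
end
```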

I’ll have to think about this some more.

3 Likes

It does make perfect sense if you’re doing OOP. But you’re not. Be careful. Honestly, I don’t know your use case well enough, but I’m initially skeptical if the multiple struct types are really necessary or if they are OOP thinking too.

For your first Elixir project, my advice is to keep it straightforward. Get everything working before you try on Protocols or Behaviours.

2 Likes

I’m writing an RTMP server that needs to handle all the different RTMP message and command types that come down the pipe. Each message has a different overall format (and subformats in some cases), and completely different data, and what happens (and what I respond with) is different depending on both the message that I receive and the current state of the server handling that specific connection.

To make the process testable, the general flow is:

Received packet -> extract header and packet data -> parse data into message based on header -> process message.
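Roughly, each step is its own function so it can be tested in isolation. The sketch below uses a made-up wire format (1-byte type id, 2-byte length, then the payload) and invented message structs; it is not the real RTMP layout:

```elixir
# Hypothetical message structs for a made-up wire format.
defmodule Wire.Ping do
  defstruct [:timestamp]
end

defmodule Wire.Text do
  defstruct [:body]
end

defmodule Wire do
  # Step 1: extract header and packet data.
  def extract(<<type_id::8, len::16, payload::binary-size(len), rest::binary>>) do
    {:ok, %{type_id: type_id}, payload, rest}
  end

  def extract(_incomplete), do: :incomplete

  # Step 2: parse the data into a message based on the header.
  def parse(%{type_id: 1}, <<ts::32>>), do: {:ok, %Wire.Ping{timestamp: ts}}
  def parse(%{type_id: 2}, body), do: {:ok, %Wire.Text{body: body}}
  def parse(_header, _payload), do: {:error, :unknown_message}

  # Step 3: process the message; returns {responses, new_state}.
  def process(%Wire.Ping{timestamp: ts}, state), do: {[{:pong, ts}], state}

  def process(%Wire.Text{body: body}, state) do
    {[], Map.update(state, :log, [body], &[body | &1])}
  end

  # The whole flow; any step that fails short-circuits the `with`.
  def receive_packet(packet, state) do
    with {:ok, header, payload, _rest} <- extract(packet),
         {:ok, message} <- parse(header, payload) do
      process(message, state)
    end
  end
end
```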

Structs seemed like the best way to describe what data is contained by an individual message (and what’s not contained), and provide optimal pattern matching (especially for serializing messages to send over the wire, since the format of each message in binary form is different).
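For the serializing side, a sketch of what I mean, again with a made-up format (1-byte type id, 2-byte length, payload) and hypothetical structs rather than real RTMP messages:

```elixir
# Hypothetical outgoing message structs.
defmodule Out.Ping do
  defstruct [:timestamp]
end

defmodule Out.Text do
  defstruct [:body]
end

defmodule Out do
  # Each clause knows the binary layout of exactly one message type.
  def serialize(%Out.Ping{timestamp: ts}), do: frame(1, <<ts::32>>)
  def serialize(%Out.Text{body: body}) when is_binary(body), do: frame(2, body)

  # Shared framing: type id, payload length, payload.
  defp frame(type_id, payload) do
    <<type_id::8, byte_size(payload)::16, payload::binary>>
  end
end
```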

Not sure if that helps explain correctly why I’m using structs or if I’m just missing something that’s more “Elixir-like” (most of my experience is in C#, with minimal F# and Rust playing).

2 Likes

A networking protocol is classic Erlang/BEAM territory. In Erlang that would be done with gen_fsm and records. Structs are definitely the more Elixir way to do it. It might be worthwhile for you to look at gen_fsm. I know there are some efforts underway to write a new implementation in OTP, mostly because the existing one is skewed toward networking protocols, but that’s not really a problem in your case.

Helper functions in the module that defines the struct are great in general, but “handle” seems a bit too coarse-grained. That probably belongs in the gen* server that manages the specific connection. Yeah, back to the original question, that’ll mean a lot of pattern-matching function heads, but you really do want low processing overhead in RTMP, so that’s my recommendation.
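As a rough sketch of that shape, here is a plain GenServer (not gen_fsm) that keeps per-connection state and matches on both the message and the current phase. The module names, the `:handshake`/`:ready` phases, and the `:handshake_done` message are all invented for illustration:

```elixir
# Hypothetical message struct.
defmodule Conn.Ping do
  defstruct [:timestamp]
end

defmodule Conn.Server do
  use GenServer

  # Client API (hypothetical helper names).
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def deliver(pid, message), do: GenServer.call(pid, {:message, message})

  @impl true
  def init(:ok), do: {:ok, %{phase: :handshake}}

  # The same message gets a different reply depending on the phase.
  @impl true
  def handle_call({:message, %Conn.Ping{}}, _from, %{phase: :handshake} = state) do
    {:reply, {:error, :not_ready}, state}
  end

  def handle_call({:message, %Conn.Ping{timestamp: ts}}, _from, %{phase: :ready} = state) do
    {:reply, {:pong, ts}, state}
  end

  def handle_call({:message, :handshake_done}, _from, state) do
    {:reply, :ok, %{state | phase: :ready}}
  end

  # Fall-through so an unexpected message does not crash the process.
  def handle_call({:message, _other}, _from, state) do
    {:reply, {:error, :unsupported}, state}
  end
end
```

All the clauses sit in the process that owns the connection, so the state-dependent rules are visible in one place.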

2 Likes

Yep, that’s a huge reason why I chose the BEAM for this project :).

The Learn You Some Erlang book was pretty confusing about gen_fsm, so I’ll have to revisit it, but I guess that’s true. This part of the system really is a big FSM, so I should probably understand the gen_fsm stuff properly.

Thanks!

1 Like