I am late to the party, but I still remember how I struggled to grok the difference, so I’ll add my 3 cents.
Superficially, the two concepts are pretty similar: both let you define a set of functions that can be attached to a module or type. However, the devil is in the details.
In programming, it is often true that the more restrictive a thing is, the more powerful it becomes, because it can provide more guarantees.
Protocols are the more restrictive and therefore more powerful of the two.
The restriction is that every function in a protocol must take the same data type as its first argument.
The superpower is that you can define the implementation separately from both the protocol definition and the type definition.
E.g. an authentication library can define the %Account{} struct, Jason defines the Jason.Encoder protocol, and you, in your application, can define an implementation of Jason.Encoder for the %Account{} struct without touching the source code of either of them.
This trick is heavily used to decouple stuff.
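Here is a minimal sketch of that three-way split. It does not use Jason itself: SketchJson.Encoder, SketchAuth.Account and SketchApp are hypothetical stand-ins for the JSON library, the authentication library and your application, and the hand-rolled string building skips escaping, so treat it as an illustration only.

```elixir
# Hypothetical stand-in for the protocol a JSON library would ship.
defprotocol SketchJson.Encoder do
  @doc "Encodes a value as a JSON string."
  def encode(value)
end

# Hypothetical stand-in for the struct an authentication library would ship.
defmodule SketchAuth.Account do
  defstruct [:id, :email]
end

# Application code: the implementation lives here, so neither of the two
# "libraries" above has to change or know about the other.
defimpl SketchJson.Encoder, for: SketchAuth.Account do
  def encode(%SketchAuth.Account{id: id, email: email}) do
    # Naive encoding for illustration only (no escaping of the email).
    ~s({"id":#{id},"email":"#{email}"})
  end
end

defmodule SketchApp do
  # Dispatch goes through the protocol and is based on the struct type.
  def account_json(id, email) do
    SketchJson.Encoder.encode(%SketchAuth.Account{id: id, email: email})
  end
end
```

Note that the caller only ever names the protocol. Which implementation runs is decided by the type of the first argument.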
Phoenix does not know anything about Ecto and can be used without it.
Ecto does not know anything about Phoenix and can be used without it.
However, Phoenix defines a couple of protocols, such as Phoenix.HTML.FormData, which is implemented for Ecto.Changeset in a totally separate repository, phoenix_ecto. This makes Phoenix and Ecto work wonderfully together without coupling them.
Behaviours are the less powerful, but more generic of the two.
They are defined on modules instead of data types. We could imagine a Jason.Encoder behaviour defining a to_json function. We could implement it in a module that defines our own struct, but we could not implement it from the outside for the %Account{} struct from the authentication library. This is the power loss in comparison to protocols.
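To make the contrast concrete, here is a rough sketch of that imagined behaviour. SketchJson.EncoderBehaviour and SketchApp.User are hypothetical names, and the encoding is again naive. The point is that the implementation has to sit inside the module that adopts the behaviour, so you need to own that module's source.

```elixir
# Hypothetical behaviour: a contract that a module can promise to fulfil.
defmodule SketchJson.EncoderBehaviour do
  @callback to_json(struct()) :: String.t()
end

# Works for a struct we own, because we can edit its module.
defmodule SketchApp.User do
  @behaviour SketchJson.EncoderBehaviour

  defstruct [:name]

  @impl true
  def to_json(%__MODULE__{name: name}) do
    # Naive encoding for illustration only (no escaping).
    ~s({"name":"#{name}"})
  end
end
```

There is no dispatch on the data here: the caller has to name the module, e.g. SketchApp.User.to_json(user). A struct from a third-party library cannot be given a to_json this way without changing that library.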
However, we can define the functions in a behaviour any way we like. We are not restricted to functions that all take the same data type as their first argument. This lets us build machinery that performs some common operations and differs only in the details. E.g. GenServer always does the same juggling with messages and timeouts, while the actual business logic, the interesting part, is defined in callbacks.
Behaviours are often used for explicitly mocking modules:
- Define a behaviour
- Make production code implement it
- Make mock code also implement it
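The three steps above could look roughly like this. All module names are hypothetical, the "production" module only marks where a real SMTP call would go, and the implementation is passed as an argument to keep the sketch self-contained (in a real application it would more likely be picked from configuration).

```elixir
# Step 1: define a behaviour.
defmodule SketchApp.Mailer do
  @callback deliver(to :: String.t(), body :: String.t()) :: :ok | {:error, term()}
end

# Step 2: production code implements it.
defmodule SketchApp.Mailer.Smtp do
  @behaviour SketchApp.Mailer

  @impl true
  def deliver(_to, _body) do
    # A real SMTP call would go here; omitted in this sketch.
    :ok
  end
end

# Step 3: mock code implements it too.
defmodule SketchApp.Mailer.Mock do
  @behaviour SketchApp.Mailer

  @impl true
  def deliver(to, body) do
    # Report the call back to the calling (test) process instead of sending mail.
    send(self(), {:delivered, to, body})
    :ok
  end
end

# Code under test depends only on the behaviour's contract.
defmodule SketchApp.Signup do
  def register(email, mailer \\ SketchApp.Mailer.Smtp) do
    mailer.deliver(email, "Welcome!")
  end
end
```

In a test you would call SketchApp.Signup.register(email, SketchApp.Mailer.Mock) and then check that the {:delivered, ...} message arrived.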
Another example could be a generic HTTP client that performs calls, retries, and handles failures in a common way, while callback modules provide the request data and the response parsing.
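A minimal sketch of such a client, under a few assumptions: SketchHttp.Endpoint, SketchHttp.Client and SketchApp.UserEndpoint are hypothetical names, and the actual network call is replaced by a transport function passed in by the caller, so no real HTTP library is involved.

```elixir
# Hypothetical behaviour: the parts that differ per endpoint.
defmodule SketchHttp.Endpoint do
  @callback build_request(args :: term()) :: term()
  @callback parse_response(raw :: term()) :: {:ok, term()} | {:error, term()}
end

# Generic machinery: calling, retrying and failure handling are written once.
defmodule SketchHttp.Client do
  # `transport` stands in for the real HTTP call; it takes a request and
  # returns {:ok, raw} or {:error, reason}.
  def call(endpoint, args, transport, retries \\ 2) do
    request = endpoint.build_request(args)
    attempt(endpoint, request, transport, retries)
  end

  defp attempt(endpoint, request, transport, retries_left) do
    case transport.(request) do
      {:ok, raw} ->
        endpoint.parse_response(raw)

      {:error, _reason} when retries_left > 0 ->
        attempt(endpoint, request, transport, retries_left - 1)

      {:error, reason} ->
        {:error, reason}
    end
  end
end

# A callback module: only the endpoint-specific details live here.
defmodule SketchApp.UserEndpoint do
  @behaviour SketchHttp.Endpoint

  @impl true
  def build_request(id), do: {:get, "/users/#{id}"}

  @impl true
  def parse_response(raw), do: {:ok, String.upcase(raw)}
end
```

The two callbacks have nothing in common in their arguments, which is exactly what a protocol would not allow.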
Summary
Use protocols when:
- you have a set of functions working on a particular data type
- you want to allow adding more data types
Use behaviours when:
- you have a set of functions (this time they don’t have to have anything in common)
- you have some generic logic that calls those functions.