Qqwy
Putting non-protocol functions in a protocol?
As you know, Elixir protocols dispatch to the particular implementation based on the first argument passed to their functions.
For many use-cases this is fine.
Sometimes, however, it is not.
One such situation is when we want to add a function that creates a datatype in a generic way. At this time we probably know the module name of the implementation we’d like to use, but do not have a struct of that type.
This leaves a couple of different possibilities. I am hoping for some feedback on which technique you’d prefer:
1. Have a separate Behaviour module. Modules whose structs implement the protocol should implement this behaviour as well.
Advantage:
- Does not need any ‘hacks’.
Disadvantage:
- It is easy to miss/forget implementing the behaviour
- It might be confusing that there is both a protocol and a behaviour that together specify the interface that needs to be implemented.
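A minimal sketch of what option 1 could look like. All names here (`Container`, `Container.Behaviour`, `ListBacked`) are hypothetical and only serve as illustration; they are not taken from any library:

```elixir
# Hypothetical behaviour: holds the constructor, which cannot be
# dispatched through a protocol because no struct exists yet.
defmodule Container.Behaviour do
  @callback empty(opts :: keyword()) :: struct()
end

# Hypothetical protocol: everything that operates on an existing struct.
defprotocol Container do
  def size(container)
end

defmodule ListBacked do
  @behaviour Container.Behaviour
  defstruct items: []

  # The behaviour half of the interface.
  @impl Container.Behaviour
  def empty(_opts), do: %__MODULE__{}

  # The protocol half of the interface (`for:` defaults to ListBacked here).
  defimpl Container do
    def size(%{items: items}), do: length(items)
  end
end
```

Nothing ties the two halves together: a module could implement `Container` and silently skip `Container.Behaviour`, which is the first disadvantage listed above.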
2. Add a function to the protocol which does not actually expect the struct as first argument.
Advantage:
- Only a single module which defines the interface.
Disadvantage: Seems a bit like a ‘hack’:
- It requires manual protocol dispatch, which is hackish: since we do not have a struct of the protocol yet (but only a module name), we cannot rely on ProtocolName.impl_for(datatype). Manually concatenating module names currently works, but seems like relying on an implementation detail.
- It might mess with protocol consolidation.
- Elixir and/or tools like Dialyzer or Credo might produce warnings.
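To make option 2 concrete, here is a rough sketch of the manual dispatch. The names `Seq`, `ListSeq` and `SeqFactory` are made up for this example. It assumes that the implementation module is named by concatenating the protocol name and the struct module name, which is exactly the implementation detail mentioned above:

```elixir
defprotocol Seq do
  # A regular protocol function, dispatched on its first argument.
  def size(seq)

  # Not a "real" protocol function: `opts` is not a Seq struct, so calling
  # Seq.empty(opts) directly would try to dispatch on the options list.
  def empty(opts)
end

defmodule ListSeq do
  defstruct items: []
end

defimpl Seq, for: ListSeq do
  def size(%ListSeq{items: items}), do: length(items)
  def empty(_opts), do: %ListSeq{}
end

defmodule SeqFactory do
  # Manual dispatch: build the implementation module name ourselves
  # (Seq + ListSeq gives Seq.ListSeq) and call it directly.
  def empty(module, opts \\ []) do
    impl = Module.concat(Seq, module)
    impl.empty(opts)
  end
end
```

With this in place, `SeqFactory.empty(ListSeq)` returns an empty `%ListSeq{}` without ever going through the protocol's own dispatch.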
3. Using a library-provided ‘extended protocol’
One example of a library providing extended protocols would be protocol_ex.
Advantage:
- it might be possible to implement this pattern directly.
Disadvantage:
- It might be overkill
- Increased developer complexity: It’s a new library that developers need to understand.
- Circumventing normal protocols will mean that improvements to normal protocols (like e.g. better consolidation) cannot be used.
4. ???
Maybe there are other possibilities as well?
If you need more context: this recently came up in PR #32 of the Arrays library.
Most Liked
eksperimental
First of all, I would like to mention that defining callbacks with the @callback directive inside a protocol is discouraged.
While it can still be used, ExDoc will not list these callbacks in the protocol documentation.
See this issue for reference.
The approach that I ended up taking was to define a submodule called Behaviour, where you can place all your @callbacks. There is a caveat, though: the macros that define functions/macros are not available inside it, because of the way the protocol disables them in the parent module. One way to solve this is to create the module outside the protocol definition (defmodule MyProtocol.Behaviour).
But for clarity I preferred to keep it as a submodule.
You can see the implementation of this here:
Please let me know what you think about this approach.
Qqwy
It seems to me like we are treading in unexplored waters, and I am eager to learn what people from the Elixir core team have to say about this situation.
Qqwy
Let me give some extra context. There definitely are situations in which I’d go for @LostKobrakai's approach, but it cannot be used here.
Arrays and some similar libraries (e.g. okasaki, sets, prioqueue) have a unified interface module (in this case Arrays) which contains some generic code. For some functions this generic code calls a particular protocol implementation.
This pattern is common elsewhere in Elixir. For instance we have an Enum module which contains generic code that internally uses the Enumerable protocol implementations.
The idea is that user code should (only) use the unified interface, and that they can specify in configuration (either in config.exs or by passing explicit options to Arrays.empty/1 or Arrays.new/2) which array implementation they want to use.
This works great, except when actually creating the initial structs. Here we cannot dispatch to the protocol implementation because we do not have a struct yet. We only have the module name.
And that is where the conundrum lies.
What do we do for this situation?
Adding a function (also called empty) to the module that contains the defstruct, as @al2o3cr suggested, is indeed the current approach.
To make it slightly more clear that we need this module to implement this particular function we use a behaviour.
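Roughly, the shape of this is as follows. This is a simplified sketch and not the actual code of Arrays: the module names (`Coll`, `Coll.Protocol`, `Coll.Behaviour`, `Coll.ListImpl`), the `:implementation` option and the `:coll` / `:default_implementation` configuration keys are all made up for this example:

```elixir
defmodule Coll.Behaviour do
  # The constructor lives in a behaviour, since no struct exists yet.
  @callback empty(opts :: keyword()) :: struct()
end

defprotocol Coll.Protocol do
  def append(coll, item)
  def size(coll)
end

defmodule Coll.ListImpl do
  @behaviour Coll.Behaviour
  defstruct items: []

  @impl Coll.Behaviour
  def empty(_opts), do: %__MODULE__{}
end

defimpl Coll.Protocol, for: Coll.ListImpl do
  def append(coll, item), do: %{coll | items: coll.items ++ [item]}
  def size(coll), do: length(coll.items)
end

# The unified interface that user code is supposed to call.
defmodule Coll do
  @default_impl Coll.ListImpl

  # Picks the implementation module from the options, falling back to
  # application configuration, and calls the behaviour callback on it.
  def empty(opts \\ []) do
    impl =
      Keyword.get_lazy(opts, :implementation, fn ->
        Application.get_env(:coll, :default_implementation, @default_impl)
      end)

    impl.empty(opts)
  end

  # From here on we have a struct, so normal protocol dispatch works.
  def new(enumerable, opts \\ []) do
    Enum.reduce(enumerable, empty(opts), &Coll.Protocol.append(&2, &1))
  end

  defdelegate size(coll), to: Coll.Protocol
end
```

Only `Coll.empty/1` has to know about module names; every other function receives a struct and can go through the protocol.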
However, this means we have both a behaviour and a protocol, with the pros and cons outlined in the first post above.
We’re looking for ways to make the interface of the library as a whole better.
(For both users and implementers of new backends, e.g. array backends.)








