Hi,
I’m currently working on a final project/thesis for my CompSci course at the University of Cambridge. I have used Elixir for about a year now, and I have decided to port the Google Dataflow/Apache Beam SDK and local runner to Elixir.
The reference implementations are written in Java and Python. While I can write well-structured functional code from scratch, porting something that uses so many OOP features into idiomatic Elixir is a challenge (and one which I intend to treat in my thesis), especially since a lot of the domain-specific concepts have so far only been realised in this OOP universe.
I am starting by porting many of the classes used into modules, and replacing inheritance with behaviours (which is already resulting in cleaner code IMO), planning to possibly refactor some parts into a more idiomatic structure at a later point. The sticking point for me is the case where instead of just static functionality overrides, the functionality is parametrised in Python/Java by using instances of objects. For example, elements may be assigned to windows of time, and while an IntervalWindow defines some functionality which conforms to that expected by a Window, it needs different parameters from, say, a GlobalWindow (which needs none).
The default answer for this may be “use protocols”, and indeed Flow uses structs for this purpose (but not protocols, as the windowing functions there do not have to be user-extensible). Indeed for many entities in my framework structs with protocols are the clear solution.
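To make the "structs with protocols" option concrete, here is a minimal sketch. The protocol function max_timestamp/1 and the struct fields are hypothetical names chosen for illustration; they are not taken from Beam or Flow.

```elixir
# Hypothetical protocol: anything that behaves like a window.
defprotocol Window do
  @doc "Returns the largest timestamp that still belongs to the window."
  def max_timestamp(window)
end

defmodule IntervalWindow do
  # Assumed fields: a half-open interval [start, stop) of integer timestamps.
  @enforce_keys [:start, :stop]
  defstruct [:start, :stop]
end

defmodule GlobalWindow do
  # Needs no parameters: a single window covering all time.
  defstruct []
end

defimpl Window, for: IntervalWindow do
  # The interval is half-open, so the last included timestamp is stop - 1.
  def max_timestamp(%IntervalWindow{stop: stop}), do: stop - 1
end

defimpl Window, for: GlobalWindow do
  def max_timestamp(%GlobalWindow{}), do: :infinity
end
```

Here the per-instance parameters live in the struct, and the protocol dispatches on the struct's type, so Window.max_timestamp/1 works for both kinds of window without the caller knowing which one it has.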
However, for certain “behaviours” or strategies, apart from the parametrisation requirement, behaviours seem to be the right solution: a strategy for doing something seems to me like a behaviour, not a data structure. In many cases, any extensions will only need to implement a small subset of the available functionality, a perfect use case of behaviours along with default implementations and defoverridable. Protocols seem to me to imply much looser semantics and restrictions, along the lines of “do this thing, but I don’t care how you do it”, versus the behaviours’ idea of “give me this small custom function which I will plug into my larger system”.
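As a rough illustration of a behaviour with default implementations, the sketch below uses a hypothetical WindowFn behaviour with two made-up callbacks. One of them gets a default that using modules may override via defoverridable.

```elixir
defmodule WindowFn do
  # Hypothetical behaviour; the callback names are illustrative only.
  @callback assign(timestamp :: integer) :: term
  @callback mergeable?() :: boolean

  defmacro __using__(_opts) do
    quote do
      @behaviour WindowFn

      # Default implementation, used unless the using module overrides it.
      @impl WindowFn
      def mergeable?, do: false

      defoverridable mergeable?: 0
    end
  end
end

defmodule FixedWindows do
  use WindowFn

  # Only the required callback is implemented; mergeable?/0 keeps its default.
  @impl WindowFn
  def assign(timestamp), do: div(timestamp, 60)
end

defmodule Sessions do
  use WindowFn

  @impl WindowFn
  def assign(timestamp), do: timestamp

  # Overrides the default injected by `use WindowFn`.
  @impl WindowFn
  def mergeable?, do: true
end
```

Note that nothing here is parametrised per instance: the window size of 60 is hard-coded in FixedWindows, which is exactly the limitation described above.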
Thinking about this then, my first idea was to simply represent these “parametrised behaviours” as tuples of {Module, data} where data is arbitrary and simply passed along to all of the behaviour functions. But immediately afterwards, I realised this is just Erlang’s tuple modules, one of the things explicitly solved by protocols!
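For reference, the {Module, data} idea could look roughly like the sketch below. All module and function names are hypothetical, and the dispatch is done by an explicit helper rather than by any special call syntax.

```elixir
defmodule FixedWindowFn do
  # Hypothetical strategy: `size` is the per-instance parameter.
  def assign(size, timestamp) do
    start = timestamp - rem(timestamp, size)
    {start, start + size}
  end
end

defmodule GlobalWindowFn do
  # Needs no parameters, so the data slot is ignored.
  def assign(_no_params, _timestamp), do: :global
end

defmodule Dispatch do
  # Calls assign/2 on the stored module, passing the stored data first.
  def assign({module, data}, timestamp), do: module.assign(data, timestamp)
end
```

A strategy is then just a value such as {FixedWindowFn, 60} or {GlobalWindowFn, nil}, which Dispatch.assign/2 unpacks on every call.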
What I am currently doing is indeed using protocols “under the hood”; however, I am also using a behaviour to enforce a stricter contract on the using modules. In my __using__ macro, I make the implementing (using) module implement the required behaviour, and define an automatic implementation for the protocol which calls into these behaviour functions (possibly with default implementations).
This way the user is encouraged to think of the module as a behaviour, a pluggable strategy if you will, which can also get a struct of itself as a parameter to its callbacks. Of course there is nothing stopping more advanced users from just providing their own implementation of the protocol, but the goal here is to make it as easy as possible for users to plug their custom functionality into the system.
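A stripped-down sketch of this combination might look like the following. It is only an approximation of the approach described, not my actual code: the names WindowStrategy, WindowStrategyBehaviour and FixedStrategy, and both callbacks, are invented for the example.

```elixir
# Hypothetical protocol that the rest of the system calls into.
defprotocol WindowStrategy do
  def assign(strategy, timestamp)
  def mergeable?(strategy)
end

defmodule WindowStrategyBehaviour do
  # Hypothetical behaviour that user modules see and implement.
  @callback assign(strategy :: struct, timestamp :: integer) :: term
  @callback mergeable?(strategy :: struct) :: boolean

  defmacro __using__(_opts) do
    quote do
      @behaviour WindowStrategyBehaviour

      # Default implementation, overridable by the using module.
      @impl WindowStrategyBehaviour
      def mergeable?(_strategy), do: false

      defoverridable mergeable?: 1

      # Generated protocol implementation for the using module's struct.
      # Inside defimpl, @for is the module the implementation is for.
      defimpl WindowStrategy do
        def assign(strategy, timestamp), do: @for.assign(strategy, timestamp)
        def mergeable?(strategy), do: @for.mergeable?(strategy)
      end
    end
  end
end

defmodule FixedStrategy do
  # The struct carries the per-instance parameters.
  defstruct size: 60
  use WindowStrategyBehaviour

  @impl WindowStrategyBehaviour
  def assign(%__MODULE__{size: size}, timestamp) do
    start = timestamp - rem(timestamp, size)
    {start, start + size}
  end
end
```

The user only writes FixedStrategy, while the system calls WindowStrategy.assign(%FixedStrategy{size: 60}, 75) and never needs to know which module sits behind the struct.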
My question is: is this a code smell? Too much “magic”? Should I just explicitly use protocols and make users explicitly call into the default implementations if they want them? Again, usually there will be 5–6 functions in such a module and a user may override 1 or 2 of them.
I’d appreciate any advice or insight people may have on this.