Why is Access a behaviour instead of a Protocol? When to use a Protocol instead of a Behaviour?

Tags: protocols, behaviours

#1

I am having some trouble understanding the difference between Behaviours and Protocols, mostly because the built-in Access functionality in Elixir is a Behaviour instead of a Protocol.

The odd thing is that everything a Protocol does can seemingly be done using a Behaviour. Can someone shed light on what is going on?

Protocol:

  1. Someone defines a protocol using defprotocol, which has a name and one or more functions that need to be implemented (with the given arity). The first argument passed to these function implementations will always be the ‘thing’ that the Protocol is implemented for.
  2. In another module, someone defines an implementation for this protocol using defimpl.

Dispatching from the function call to the protocol implementations is automatic.

It is not possible to call the implementations directly on the final module, as they are secretly defined in another place that you cannot access yourself. Therefore, it also does not make sense to document the implemented functions.
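
The two steps above can be sketched roughly as follows. This is only a minimal illustration: the protocol name Describable and its describe/1 function are made up for this example and do not come from any library.

```elixir
# Hypothetical protocol, for illustration only.
defprotocol Describable do
  @doc "Returns a short human-readable description of `term`."
  def describe(term)
end

# Implementations for two built-in types.
defimpl Describable, for: Integer do
  def describe(n), do: "the integer #{n}"
end

defimpl Describable, for: List do
  def describe(list), do: "a list of #{length(list)} element(s)"
end

# Dispatch picks the implementation from the type of the first argument.
IO.puts(Describable.describe(42))
IO.puts(Describable.describe([1, 2, 3]))

# defimpl generates a module named after the protocol and the type,
# so the implementation can also be reached directly.
IO.puts(Describable.Integer.describe(42))
```

Note that the last call shows the implementation living in a generated module (here Describable.Integer) rather than in the module of the data type itself.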

Behaviours

  1. Someone defines a behaviour by creating a module containing one or more @callback declarations, each of which takes a spec. It seems that behaviours try to validate not only the arity of the callback implementations, but also e.g. the formats of the input/output data.
  2. To implement a behaviour, someone has to add use ModuleWithTheCallbackStatements to some module, and then simply define the callbacks of that behaviour in that module as functions.

Dispatching happens manually inside the module defining the behaviour (for structs, e.g. by using the module name stored in the struct's __struct__ field). The callbacks may (using defoverridable) or may not be defined as functions in the module defining the Behaviour. This makes it a lot easier to treat certain function inputs as ‘special’.

The implemented functions of a behaviour are simply defined and fully accessible and callable as normal functions. As such, they ought to be documented.
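
As a rough sketch of the description above: the module names Greeter, Greeter.English and Greeter.Dutch are hypothetical. The sketch declares the behaviour with the @behaviour attribute; `use` only works when the behaviour module additionally defines a __using__/1 macro, which is a convention rather than a requirement.

```elixir
# Hypothetical behaviour, for illustration only.
defmodule Greeter do
  @callback greet(name :: String.t()) :: String.t()

  # Manual dispatch: the caller passes the implementing module explicitly.
  def greet_with(impl, name) when is_atom(impl), do: impl.greet(name)
end

defmodule Greeter.English do
  @behaviour Greeter

  @impl Greeter
  def greet(name), do: "Hello, #{name}!"
end

defmodule Greeter.Dutch do
  @behaviour Greeter

  @impl Greeter
  def greet(name), do: "Hallo, #{name}!"
end

# Dispatch through the behaviour module...
IO.puts(Greeter.greet_with(Greeter.Dutch, "José"))
# ...or call the implementation directly as a normal function.
IO.puts(Greeter.English.greet("José"))
```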


It seems to me that Behaviours are more flexible than Protocols. When is it a good idea to implement a Protocol rather than a Behaviour? Are there differences I’ve missed?


#2

Access was a protocol initially, but I think it was changed to a behaviour because of performance issues around protocol dispatch. It was simply too slow for something used as frequently as access.


#3

The @callback attribute doesn’t do as much as you might think; at best it just generates some compile-time warnings. You can actually use behaviours just fine without ever using the module that defines them.

A behaviour is just a promise that your module implements a function with given inputs.

A Protocol is a means for defining functions that potentially work for many different data types.

The way I think about it is that behaviours are for when you want to have a function for a single set of args that potentially does different things. Protocols are for when you want to have a function that does the same thing for many different kinds of args.

You are correct that if you do enough hard work, you can probably implement a Protocol as a set of behaviours, but using Protocol does all the heavy lifting for you. This does not come for free though.

From my own work with Elixir I have some examples:

  • Write a function to find the documentation for a function in a Module in either Erlang or Elixir.

I wrote this as a behaviour. The arguments were exactly the same, but the underlying implementation was quite different. The idea was to have a list of Modules that implemented this behaviour and to call the function in each of the modules to find the documentation.
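
The original code is not shown in this thread, so the following is only a guess at the general shape. All module names are hypothetical, and the lookups are stubbed with hard-coded strings instead of reading real documentation.

```elixir
# Hypothetical behaviour: every backend takes the same arguments.
defmodule DocFinder do
  @callback find_doc(module(), atom(), arity()) :: {:ok, String.t()} | :error

  # Ask each backend in turn and return the first hit, or :error if none match.
  def find_doc(backends, mod, fun, arity) do
    Enum.find_value(backends, :error, fn backend ->
      case backend.find_doc(mod, fun, arity) do
        {:ok, _doc} = found -> found
        :error -> nil
      end
    end)
  end
end

defmodule DocFinder.ElixirDocs do
  @behaviour DocFinder

  # Stubbed data; a real backend would look the docs up.
  @impl DocFinder
  def find_doc(Enum, :map, 2), do: {:ok, "stub: Elixir doc for Enum.map/2"}
  def find_doc(_mod, _fun, _arity), do: :error
end

defmodule DocFinder.ErlangDocs do
  @behaviour DocFinder

  # Stubbed data; a real backend would look the docs up.
  @impl DocFinder
  def find_doc(:lists, :reverse, 1), do: {:ok, "stub: Erlang doc for :lists.reverse/1"}
  def find_doc(_mod, _fun, _arity), do: :error
end

backends = [DocFinder.ElixirDocs, DocFinder.ErlangDocs]
IO.inspect(DocFinder.find_doc(backends, :lists, :reverse, 1))
```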

  • Create transformations of data structures without a priori knowledge of the data structure.

This is where Protocols shine because you can call the function defined in the Protocol recursively. Inspect is the example code to look at to really understand the power of Protocols.
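
A small sketch of that recursive style, much simpler than Inspect: the protocol DeepUpcase is invented for this example. It upcases every string found inside nested lists, maps and tuples and leaves everything else alone.

```elixir
# Hypothetical protocol, for illustration only.
defprotocol DeepUpcase do
  # Types without an explicit implementation use the Any implementation.
  @fallback_to_any true
  def upcase(term)
end

defimpl DeepUpcase, for: BitString do
  def upcase(s) when is_binary(s), do: String.upcase(s)
  # Bitstrings that are not binaries are returned unchanged.
  def upcase(bits), do: bits
end

defimpl DeepUpcase, for: List do
  # Recurse through the protocol, so each element picks its own implementation.
  def upcase(list), do: Enum.map(list, &DeepUpcase.upcase/1)
end

defimpl DeepUpcase, for: Map do
  # Only values are transformed; keys are kept as they are.
  def upcase(map), do: Map.new(map, fn {k, v} -> {k, DeepUpcase.upcase(v)} end)
end

defimpl DeepUpcase, for: Tuple do
  def upcase(tuple) do
    tuple |> Tuple.to_list() |> Enum.map(&DeepUpcase.upcase/1) |> List.to_tuple()
  end
end

defimpl DeepUpcase, for: Any do
  def upcase(other), do: other
end

IO.inspect(DeepUpcase.upcase(%{name: "ann", tags: ["a", {:ok, "b"}], age: 3}))
```

The caller never needs to know the shape of the data up front; each level of nesting is handled by whichever implementation matches it.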


#4

Thank you for your replies, @michalmuskala and @bbense. :slight_smile:

One other difference (maybe the important difference) that comes to mind now is that a Protocol doesn’t care where its implementations are defined.

  • Person A could create a Protocol,
  • Person B could create a data structure doing something.
  • Finally, Person C could define an implementation of Protocol A for Struct B.

With behaviours, you’re limited to implementing the Behaviour inside the module that defines the structure.
Maybe this is exactly why Access is defined as a Behaviour, by the way, to prevent people from adding it to external data structures at a later time.
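
The three-person scenario can be sketched like this. Sizeable and Basket are hypothetical names; in practice the three pieces would live in three separate packages or files.

```elixir
# "Person A": a library that defines the protocol.
defprotocol Sizeable do
  def size(term)
end

# "Person B": an unrelated library that defines a struct.
defmodule Basket do
  defstruct items: []
end

# "Person C": application code that ties the two together
# without modifying either of the modules above.
defimpl Sizeable, for: Basket do
  def size(%Basket{items: items}), do: length(items)
end

IO.inspect(Sizeable.size(%Basket{items: [:apple, :pear]}))
```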

@michalmuskala: Now that Elixir always consolidates protocols during standard compilation, these performance issues have gone away, right?


#5

I am especially curious. I would love to have Access implemented to call methods on Erlang style tuple modules:

iex> defmodule MyModule do
...>   def a_method(i, tup), do: {i, tup}
...>   def fetch(tup, key), do: {key, tup}
...> end

iex> thing = {MyModule, :some, :more, :data}

iex> # This next line works right now because the underlying Erlang system does it
iex> thing.a_method(42)
{42, {MyModule, :some, :more, :data}}

iex> # This next line I would love to have work like this (for my Array module for :array compatibility without needing a struct), but right now Access throws a fit
iex> thing[42]
{42, {MyModule, :some, :more, :data}}

iex> # Although, much as I would love the above for consistency with Erlang, I would actually 'expect' this, but it does not happen either, Access throws a fit regardless
iex> thing[2]
:more
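
For comparison, here is a minimal sketch of the struct-based route, which is the one the post would like to avoid. It works because Access finds the implementing module through the struct's module name, and plain tuples carry no such information. The Pair module is hypothetical, and note that fetch/2 has to return {:ok, value} or :error rather than an arbitrary term.

```elixir
# Hypothetical struct implementing the Access behaviour, for illustration only.
defmodule Pair do
  @behaviour Access
  defstruct left: nil, right: nil

  @impl Access
  def fetch(%Pair{} = pair, key) when key in [:left, :right], do: Map.fetch(pair, key)
  def fetch(%Pair{}, _key), do: :error

  # Keys other than :left and :right are not handled by the clauses below.
  @impl Access
  def get_and_update(%Pair{} = pair, key, fun) when key in [:left, :right] do
    case fun.(Map.get(pair, key)) do
      {get, update} -> {get, %{pair | key => update}}
      # "Popping" a struct field resets it to nil instead of removing it.
      :pop -> {Map.get(pair, key), %{pair | key => nil}}
    end
  end

  @impl Access
  def pop(%Pair{} = pair, key) when key in [:left, :right] do
    {Map.get(pair, key), %{pair | key => nil}}
  end
end

pair = %Pair{left: 1, right: 2}
IO.inspect(pair[:left])
IO.inspect(put_in(pair[:right], 3))
```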

#6

I’m not sure I’m qualified enough to answer that.


#7

@michalmuskala: Now that Elixir always consolidates protocols during standard compilation, these performance issues have gone away, right?

Is there an “official” answer to this question? I think it is quite a valid one.

I found this thread while trying to figure out the definitive differences between Behaviours and Protocols. Looking at the implementation of Access, it seems to be a perfect fit for a Protocol, because it is all about polymorphism (see e.g. https://stackoverflow.com/questions/26215206/difference-between-protocol-behaviour-in-elixir).

I guess one could create an Access implementation as a Protocol, do measurements and see if anything has changed.


#8

I asked @josevalim about it in the IRC group today directly. His reply:

(10:19:50 PM) josevalim: you only consolidate inside a project and after compilation
(10:20:03 PM) josevalim: this means all the access during compilation and in scripts still won’t be fast enough, which is a no-go
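
One way to observe this is with Protocol.consolidated?/1. The sketch below assumes it is run as a plain script (for example with `elixir probe.exs`, a made-up file name) and not as part of a compiled Mix project; the protocol Probe is hypothetical.

```elixir
# Hypothetical protocol defined inside a script, for illustration only.
defprotocol Probe do
  def probe(term)
end

defimpl Probe, for: Integer do
  def probe(n), do: {:integer, n}
end

# Dispatch still works without consolidation, it just takes the slower path.
IO.inspect(Probe.probe(1))

# A protocol defined in a script has not gone through consolidation,
# so this is expected to print false.
IO.inspect(Protocol.consolidated?(Probe))
```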

which is a very clear answer. Thank you very much, José! :heart_eyes:


#9

Thanks a lot to both of you for sharing this!