Record Protocols

OvermindDL1 · July 6, 2017, 6:09pm

Mostly bikeshedding, but it may lead to a PR, who knows, so this is to get input from @josevalim and other primary Elixir devs as well as everyone else.

I’m curious, in the protocol definition you have this to handle structs:

https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/protocol.ex#L459-L461

defp extract_matching_by_attribute(paths, prefix, callback) do
  for path <- paths,

And this to handle the ‘primitive types’:

https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/protocol.ex#L464-L473

      do: mod
end

defp list_dir(path) when is_list(path) do
  case :file.list_dir(path) do
    {:ok, files} -> files
    _ -> []
  end
end

And more to handle the catch all and so forth.

What about adding, right above or below the struct definition, a match clause with a guard like (tagged_tuple) when is_tuple(tagged_tuple) and is_atom(elem(tagged_tuple, 0)) and elem(tagged_tuple, 0) in [@generated_record_list]? Here @generated_record_list would be generated into the module at consolidation time from the tagged tuples/records that have a defimpl defined (perhaps with a syntax like for: Record(:my_record) or so?). At non-consolidation time (slow lookup) it would just test whether the submodule exists. Just as an implementation generates a module that is a submodule of the protocol module, the record one could generate something like :"Elixir.MyProtocol.Records.my_record", which follows the style of the others as well. (Honestly I’d prefer structs to live in a subnamespace such as Structs instead of at the ‘top level’, but the current layout makes sense as-is. This at least keeps Records out of the ‘structs’ namespace, although I would not see an issue in mixing their namespaces in any case.)

Much of the structs_impl code could even be reused with ease (verbatim, as a quick look appears to confirm).

I would have a use for this feature, and since it is a whitelist you don’t really have to worry about ordinary tuples no longer being handled as normal. Would a PR be accepted?
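The guard described above can be sketched outside of the real Protocol machinery. This is only an illustration under stated assumptions: RecordDispatchSketch, its hard-coded @generated_record_list, and the MyProtocol.Records namespace are hypothetical stand-ins for what consolidation would generate, not existing Elixir APIs.

```elixir
defmodule RecordDispatchSketch do
  # Hypothetical whitelist; in the proposal this would be generated
  # at consolidation time from the records that have an implementation.
  @generated_record_list [:my_record, MyModule.MyRecord]

  # Only whitelisted tagged tuples dispatch; the implementation module
  # name is built under an assumed MyProtocol.Records namespace.
  def impl_for(tagged_tuple)
      when is_tuple(tagged_tuple) and is_atom(elem(tagged_tuple, 0)) and
             elem(tagged_tuple, 0) in @generated_record_list do
    Module.concat([MyProtocol, Records, elem(tagged_tuple, 0)])
  end

  # Everything else (other tuples, the empty tuple, non-tuples) falls through.
  def impl_for(_other), do: nil
end
```

A failing guard expression (for example elem/2 on an empty tuple) simply makes the clause not match, so non-whitelisted tuples reach the fallback clause.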

michalmuskala · July 6, 2017, 6:23pm

The problem with a protocol for tagged tuples is that they are not unique. AFAIK Elixir had this before structs were introduced and it was problematic.
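A small sketch of the uniqueness problem, assuming two unrelated hypothetical libraries (LibA and LibB are made-up names) that both happen to define a record tagged :point with Record.defrecord:

```elixir
defmodule LibA do
  require Record
  # Hypothetical library A: a 2D point.
  Record.defrecord(:point, x: 0, y: 0)

  def origin, do: point()
end

defmodule LibB do
  require Record
  # Hypothetical library B: a geographic coordinate with the same tag.
  Record.defrecord(:point, lat: 0, lng: 0)

  def origin, do: point()
end
```

Both LibA.origin() and LibB.origin() are the plain tuple {:point, 0, 0}, so nothing in the value itself says which library's implementation should handle it.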

OvermindDL1 · July 6, 2017, 6:26pm

That is precisely why I was proposing a whitelist for them. Plus I’d not use naked atoms with them (except for Erlang libraries); I’d use an atom like MyModule.MyRecord. It could even be enforced that it only works with at least one level of ‘namespacing’ before the record ‘name’; I’d be fine with that (though allowing naked Erlang-style atoms would be useful for Erlang work, perhaps with a caveat or a special name other than just Record or so?).

OvermindDL1 · July 6, 2017, 6:37pm

For a tagged tuple, one could even specify a non-first element to match on as well, perhaps like for: TaggedTuple(MyModule.MyBlah, 3) for the 3rd indexed position.

One could also do a generic match, which would handle just about any situation, perhaps like for: Match({:my, :special, %{structure: value}} when is_integer(value))

Alternatively, if all the special-cased names like Integer were dropped and a normal match spec were used, like for: {:my, :special, %{structure: value}} when is_integer(value) or for: %MyStruct{}, then plain Elixir/Erlang-style matches could match anything on a protocol efficiently (this would break backwards compatibility). Someone could even special-case, say, an integer below 0. It would make basic lookup slower pre-consolidation (although there are ways to fix that by breaking the current style even more), but post-consolidation it would be far more powerful.
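As a rough sketch of what match-based dispatch amounts to once consolidated: ordinary function clauses, tried in order. The module name MatchDispatchSketch and the returned implementation names are hypothetical and only label which clause matched.

```elixir
defmodule MatchDispatchSketch do
  # A structural match with a guard, as in the for: example above.
  def impl_for({:my, :special, %{structure: value}}) when is_integer(value),
    do: :special_impl

  # Special-casing integers below 0 before the general integer clause.
  def impl_for(i) when is_integer(i) and i < 0, do: :negative_integer_impl
  def impl_for(i) when is_integer(i), do: :integer_impl

  # No clause matched: no implementation.
  def impl_for(_other), do: nil
end
```

Clause order matters here: moving the general integer clause above the negative one would make the negative case unreachable.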

josevalim · July 6, 2017, 7:17pm

Even if you say it applies only for MyModule.MyRecord, we would still need to convert the atom to a string and check if it starts with an “Elixir.” prefix. And sure, consolidation helps, but scripts, mix tasks, compilation, etc. do not rely on consolidation.

And tuples are just too common. Imagine inspecting a keyword list. Now for each tuple in the list, we have the additional cost of checking the first element of the tuple and, if it is an atom that starts with “Elixir.”, attempting a dispatch to an implementation that may not exist. The cost of these false positives is just too high.

We have made this mistake in the past, and we have no plans to repeat it.
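The per-tuple check described above can be sketched roughly as follows. PrefixCheckSketch is a hypothetical helper, not part of Elixir's Protocol module; it only shows the work that every tuple would have to go through before any dispatch is attempted.

```elixir
defmodule PrefixCheckSketch do
  # Would this tuple even be a candidate for record dispatch?
  # Each candidate costs an atom-to-string conversion and a prefix check.
  def candidate?(tuple) when is_tuple(tuple) and tuple_size(tuple) > 0 do
    case elem(tuple, 0) do
      tag when is_atom(tag) ->
        String.starts_with?(Atom.to_string(tag), "Elixir.")

      _ ->
        false
    end
  end

  def candidate?(_other), do: false

  # Counts how many elements of a list would trigger a dispatch attempt.
  def count_candidates(list) when is_list(list) do
    Enum.count(list, &candidate?/1)
  end
end
```

For a keyword list such as [a: 1, b: 2] every element pays for the check and none is a candidate; a tuple like {MyModule.MyRecord, 1} passes the check and would then still need a lookup for an implementation module that may not exist.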

OvermindDL1 · July 6, 2017, 7:43pm

Any ideas on how to dispatch on a user-defined record through a protocol then?

Qqwy · July 6, 2017, 10:33pm

@OvermindDL1 What I did in FunLand was to create logic that partially overlaps with what Protocols do for you, but that is based on Behaviours. Amongst other things, it maps success tuples (that is, things of the form {:ok, val} | {:error, reason} | :error) to a module containing the implementation behaviour, similar to how the Protocol module maps [] to List, 1 to Integer, etc.
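A minimal sketch of that kind of mapping, not FunLand's actual code: SuccessTupleSketch, its function name, and the SuccessTupleImpl module name it returns are all hypothetical and only illustrate mapping value shapes to implementation modules.

```elixir
defmodule SuccessTupleSketch do
  # Success tuples all map to one (hypothetical) implementation module.
  def module_for({:ok, _val}), do: SuccessTupleImpl
  def module_for({:error, _reason}), do: SuccessTupleImpl
  def module_for(:error), do: SuccessTupleImpl

  # Built-in shapes map to their usual modules, as Protocol does.
  def module_for(list) when is_list(list), do: List
  def module_for(int) when is_integer(int), do: Integer

  # Anything else has no known implementation in this sketch.
  def module_for(_other), do: nil
end
```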

OvermindDL1 · July 6, 2017, 10:36pm

I had a few minutes so I whipped up a ProtocolEx library (unpublished; it is on Bitbucket currently, but I can move it to GitHub if anyone is curious).

Let’s define a new protocol:

  defprotocolEx Blah do
    def empty(a)
    def succ(a)
    def add(a, b)
  end

And let’s implement it for, oh, integers and a custom type (a tagged tuple that holds an integer):

  defimplEx Integer, i when is_integer(i), for: Blah do
    def empty(_), do: 0
    def succ(i), do: i+1
    def add(i, b), do: i+b
  end

  defimplEx TaggedTuple.Vwoop, {Vwoop, i} when is_integer(i), for: Blah do
    def empty(_), do: {Vwoop, 0}
    def succ({Vwoop, i}), do: {Vwoop, i+1}
    def add({Vwoop, i}, b), do: {Vwoop, i+b}
  end

Now let’s consolidate it. I’ve not made a compiler for it yet, so for now just call this, oh, anywhere; it will make sure the necessary other modules are compiled first and so forth before consolidating:

  ProtocolEx.resolveProtocolEx(Blah, [
    Integer,
    TaggedTuple.Vwoop,
  ])

Right now when something is defimplEx’d, like the Integer one, it just makes a Blah.Integer module. The name Integer is not special in any way; it can be any atom. I could have the resolver scan the BEAM files as a mix compiler plugin to find any modules named under the protocol module, and build up the list that way, but for now it is done manually so that everything gets automagically required in the right order.

You can of course call a specific implementation straight:

    assert 0 == Blah.Integer.empty(42)

Or do it through the extended protocol directly:

    assert 0 === Blah.empty(42)
    assert {Vwoop, 0} === Blah.empty({Vwoop, 42})

    assert 43 === Blah.succ(42)
    assert {Vwoop, 43} === Blah.succ({Vwoop, 42})

    assert 43 === Blah.add(42, 1)
    assert {Vwoop, 43} === Blah.add({Vwoop, 42}, 1)

The Blah module is basically compiling in to this (yes this was Macro.to_string’d, so this is what it is compiling to):

defmodule Blah do
  def(empty(i = a) when is_integer(i)) do
    Testering.Blah.Integer.empty(a)
  end
  def(empty({Vwoop, i} = a) when is_integer(i)) do
    Testering.Blah.TaggedTuple.Vwoop.empty(a)
  end
  def(empty(value)) do
    raise(%ProtocolEx.UnimplementedProtocolEx{name: :empty, arity: 1, value: value})
  end
  def(succ(i = a) when is_integer(i)) do
    Testering.Blah.Integer.succ(a)
  end
  def(succ({Vwoop, i} = a) when is_integer(i)) do
    Testering.Blah.TaggedTuple.Vwoop.succ(a)
  end
  def(succ(value)) do
    raise(%ProtocolEx.UnimplementedProtocolEx{name: :succ, arity: 1, value: value})
  end
  def(add(i = a, b) when is_integer(i)) do
    Testering.Blah.Integer.add(a, b)
  end
  def(add({Vwoop, i} = a, b) when is_integer(i)) do
    Testering.Blah.TaggedTuple.Vwoop.add(a, b)
  end
  def(add(value, _)) do
    raise(%ProtocolEx.UnimplementedProtocolEx{name: :add, arity: 2, value: value})
  end
  # Snip a horror of metadata stored in `__protocolEx__` and such things...
end

So you do not need to guard your callback functions as the guard on the defimplEx handles that for you (or do, whatever).

But yes, as seen you can match based on anything, so matching on a struct would be added to the prior example as:

  defmodule MyStruct do
    defstruct a: 42
  end

  defimplEx MineOlStruct, %MyStruct{}, for: Blah do
    def empty(_), do: %MyStruct{a: 0}
    def succ(s), do: %{s | a: s.a+1}
    def add(s, b), do: %{s | a: s.a+b}
  end

  ProtocolEx.resolveProtocolEx(Blah, [ # Matchers are processed in order as below
    Integer,
    TaggedTuple.Vwoop,
    MineOlStruct,
  ])

Which generates:

defmodule Blah do
  def(empty(i = a) when is_integer(i)) do
    Testering.Blah.Integer.empty(a)
  end
  def(empty({Vwoop, i} = a) when is_integer(i)) do
    Testering.Blah.TaggedTuple.Vwoop.empty(a)
  end
  def(empty(%MyStruct{} = a)) do
    Testering.Blah.MineOlStruct.empty(a)
  end
  def(empty(value)) do
    raise(%ProtocolEx.UnimplementedProtocolEx{name: :empty, arity: 1, value: value})
  end
  def(succ(i = a) when is_integer(i)) do
    Testering.Blah.Integer.succ(a)
  end
  def(succ({Vwoop, i} = a) when is_integer(i)) do
    Testering.Blah.TaggedTuple.Vwoop.succ(a)
  end
  def(succ(%MyStruct{} = a)) do
    Testering.Blah.MineOlStruct.succ(a)
  end
  def(succ(value)) do
    raise(%ProtocolEx.UnimplementedProtocolEx{name: :succ, arity: 1, value: value})
  end
  def(add(i = a, b) when is_integer(i)) do
    Testering.Blah.Integer.add(a, b)
  end
  def(add({Vwoop, i} = a, b) when is_integer(i)) do
    Testering.Blah.TaggedTuple.Vwoop.add(a, b)
  end
  def(add(%MyStruct{} = a, b)) do
    Testering.Blah.MineOlStruct.add(a, b)
  end
  def(add(value, _)) do
    raise(%ProtocolEx.UnimplementedProtocolEx{name: :add, arity: 2, value: value})
  end
  # Snip a horror of metadata stored in `__protocolEx__` and such things...
end

But yeah, without the compiler stage (plus I think I’d want to add priority setting too) it does require an extra call compared to normal protocols, but it allows building up a final complete module with ease. I’m even thinking of adding an extend Blah declaration so you can import the specifications from another protocol, so that implementers have to fulfill both; that would be easy to add.

Yeah this basically works like behaviours + a consolidation step for a baked set of behaviour implementations, which makes it very fast.

@Qqwy What are your thoughts on this setup?
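For comparison, the "behaviours plus a consolidation step" idea can be written out by hand without ProtocolEx. This is only a sketch under assumptions: BlahBehaviour, BlahSketch and its submodules are hypothetical names, and the dispatcher below is typed manually where ProtocolEx would generate it.

```elixir
defmodule BlahBehaviour do
  # The contract every implementation has to fulfill.
  @callback empty(term) :: term
  @callback succ(term) :: term
  @callback add(term, term) :: term
end

defmodule BlahSketch.Integer do
  @behaviour BlahBehaviour
  @impl true
  def empty(_), do: 0
  @impl true
  def succ(i), do: i + 1
  @impl true
  def add(i, b), do: i + b
end

defmodule BlahSketch.Vwoop do
  @behaviour BlahBehaviour
  @impl true
  def empty(_), do: {Vwoop, 0}
  @impl true
  def succ({Vwoop, i}), do: {Vwoop, i + 1}
  @impl true
  def add({Vwoop, i}, b), do: {Vwoop, i + b}
end

defmodule BlahSketch do
  # Hand-written stand-in for the consolidated module: one clause per
  # implementation, in priority order, then a catch-all that raises.
  def empty(i = a) when is_integer(i), do: BlahSketch.Integer.empty(a)
  def empty({Vwoop, i} = a) when is_integer(i), do: BlahSketch.Vwoop.empty(a)
  def empty(value), do: raise(ArgumentError, "no implementation for #{inspect(value)}")

  def succ(i = a) when is_integer(i), do: BlahSketch.Integer.succ(a)
  def succ({Vwoop, i} = a) when is_integer(i), do: BlahSketch.Vwoop.succ(a)
  def succ(value), do: raise(ArgumentError, "no implementation for #{inspect(value)}")

  def add(i = a, b) when is_integer(i), do: BlahSketch.Integer.add(a, b)
  def add({Vwoop, i} = a, b) when is_integer(i), do: BlahSketch.Vwoop.add(a, b)
  def add(value, _), do: raise(ArgumentError, "no implementation for #{inspect(value)}")
end
```

Because the dispatcher is just a set of function clauses in one module, the BEAM can compile the matches together, which is where the speed of the consolidated form comes from.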