🍵 Matcha - first-class match specifications for Elixir

DEVLOG.md 2023-10-20

This update discusses new syntactic support for nested matches I want to experiment with for Matcha!


I started developing Matcha.filter/1 because I needed it for something I wanted to build for SpawnFest. Sadly, while developing the unannounced library I wanted to release first and use in the competition, I’ve ran into a limitation of matchspec’s expressivity I’ve long aspired to overcome. I’ve suspected for a while that it’s solvable, but solving it properly in Matcha will take too much time away from the development of the depending library I’d planned on using in the competition—it’s a hard blocker. I may submit something else, though!

My consolation prize is that my remorse has fueled me to think on the problem of nested matches more, and recent changes to the Matcha compiler to support filters should give me what I need to implement it. Let’s dive into nested matches in matchspecs!


The power of patterns

Here at :tea: :tm: Matcha Incorporated :copyright:, we’re big fans of the match operator, =, which performs a pattern match.

When you’re first learning a BEAM VM language, it’s easy to think of it as just the variable binding operator. After all, these are the semantics of = most of us are familiar with coming from other languages, and the pattern variable on the left side will always match the right side, and bind the entirety of the right side to variable:

variable = {:some, %{complicated: [:data, "structure"]}}
variable
#=> {:some, %{complicated: [:data, "structure"]}}

Over time, we learn to appreciate that pattern matching can also perform destructuring as well as variable binding:

{:some, data} = variable
data
#=> %{complicated: [:data, "structure"]}

It provides a natural syntax for multiple assignment, even from deeply nested values:

{:some, %{complicated: [type, specifics]}} = variable
{type, specifics}
#=> {:data, "structure"}

But what’s really wild is that you can nest matches inside matches, to both bind variables at a shallower level of nesting, and match on data at a deeper level of nesting:

{:some, data = %{complicated: [:data, specifics]}} = variable
{data, specifics}
#=> {%{complicated: [:data, "structure"]}, "structure"}

This is often used in combination with guard expressions, so we can extract a set of data when its specifics satisfy a guard:

case variable do
  {:some, data = %{complicated: [:data, specifics]}} when is_binary(specifics)
    -> data
end
#=> %{complicated: [:data, "structure"]}

Matches inside matches inside matchspecs

Obviously, if you can do this in Elixir, I want to support it in Matcha. Extracting general data from an object in an :ets table where its specifics satisfy a certain guard is a common use-case. However, if you’ve ever spent any time studying the matchspec grammar, first off: I’m sorry.

Secondly, you might have realized that there is no direct analog of nested matching in them! Specifically, the anatomy of a MatchHeadPart does not allow you to both describe binding a term to a variable, and performing a destructuring operation on that term (that may permit a binding with a deeper nested term).

Put another way, you cannot both bind to and destructure a term in a matchspec!

To clarify, for some:

Matcha.spec do
  pattern -> ...
end

This pattern is representable:

{:some, data} -> ...

And this pattern is representable:

{:some, %{complicated: [:data, specifics]}} -> ...

But both at once are not:

{:some, data = %{complicated: [:data, specifics]}} -> ...

Today, Matcha reflects this reality:

Matcha.spec do
  {:some, data = %{complicated: [:data, specifics]}} when is_binary(specifics) -> data
end
#!> ** (Matcha.Rewrite.Error) found problems rewriting code into a match spec: when binding variables
#!> 
#!>  ({:some, data = %{complicated: [:data, specifics]}} when is_binary(specifics) -> data)
#!>    error: cannot match `data` to `%{complicated: [:data, specifics]}`: cannot use the match operator in match spec heads, except to re-assign variables to each other
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:472: Matcha.Rewrite.raise_match_in_match_error!/3
#!>     (elixir 1.15.6) lib/macro.ex:667: Macro.do_traverse/4
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:445: Matcha.Rewrite.do_rewrite_bindings/2
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:323: Matcha.Rewrite.rewrite_clause/2
#!>     (elixir 1.15.6) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:210: Matcha.Rewrite.spec/2
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:199: Matcha.Rewrite.build_spec/3
#!>     (matcha 0.1.10) expanding macro: Matcha.spec/1

Look to the Guards

Is there a way forward from this? Nope, not really.

At least, not until recently.

  • Since before I started working on Matcha, matchspec guards supported most, but not all, BIFs that are allowed in guards.
  • OTP 25 introduced support for using the BIFs :erlang.is_map_key/2 and :erlang.map_get/2 in matchspecs.
  • At my behest, in OTP 26 @jhogberg graciously added support for all the other missing guard BIFs to matchspecs, mostly critically including :erlang.tuple_size/1, and added tests to ensure that all guard-safe BIFs are also allowed in matchspec guards going forwards!

Fake it 'til you match it

With the added support for the guard :erlang.tuple_size/1 in matchspecs, we finally have all the tools we need to support both binding and destructuring on the same nested term. All it would take is re-implementing destructuring of composite terms into guard checks in the Matcha compiler!

Take, for example, the matchspec:

Matcha.spec do
  {:some, data = %{complicated: [:data, specifics]}}
    when is_binary(specifics) 
      -> data
end

We can’t literally describe the nested match data = %{complicated: [:data, specifics]} in today’s match specification grammar.

But what we can do is… grisly, but semantically equivalent:

Matcha.spec do
  {:some, data}
    when :erlang.is_map(data) and :erlang.is_map_key(:complicated, data)
     and :erlang.is_list(:erlang.map_get(:complicated, data))
     and :erlang.length(:erlang.map_get(:complicated, data)) == 2
     and :erlang.hd(:erlang.map_get(:complicated, data)) == :data
     and is_binary(:erlang.hd(:erlang.tail(:erlang.map_get(:complicated, data)))) 
      -> data
end

Rather than ask you to type all that out, it should be possible to develop a destructuring-to-guards transpiler in the Matcha compiler and do it for you! We finally have enough guards available in matchspecs that I think we can convert any arbitrary destructuring of composite terms in an Elixir match pattern into a mess of matchspec-supported :erlang guards:

On paper, this is very cool. Off paper, this is very cool but requires a whole heck of a lot of work to the compiler. So I’ll be playing with this premise more over the next few months, when I find time!

6 Likes