christhekeele

🍵 Matcha - first-class match specifications for Elixir

Officially announcing Matcha!


Matcha

Available on GitHub and hex.pm, Matcha is a library for composing and using match specifications in Elixir. It’s intended to help you perform really fast, selective :ets queries, and refine how you can study the function calls in your running programs.

Synopsis

Match specifications are a really powerful but difficult-to-use feature of the BEAM VM. I’ve wrapped them with Matcha to make them much more accessible to Elixir programmers, and am working hard to have great documentation on how to use them where they shine: fast, selective :ets queries and tracing function calls.

The Problem

The thing Matcha works to solve: match specifications are already kind of a chore to use from erlang code. This is compounded when trying to use them from Elixir, since they really just encode an informal erlang AST. Alchemists have to be familiar with erlang syntax and do a lot of mental context switching to tap into their benefits.
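To make the chore concrete, here is a minimal sketch of a hand-written match spec used from Elixir with plain :ets, no Matcha involved. The table name and its rows are made up for illustration.

```elixir
# A throwaway table of {name, age} tuples (hypothetical data).
table = :ets.new(:people, [:set])
:ets.insert(table, [{"Ada", 36}, {"Joe", 17}, {"Mia", 52}])

# Hand-written equivalent of erlang's `fun({Name, Age}) when Age >= 18 -> Name end`:
# the head binds $1 and $2, the guard is a prefix-notation tuple, the body returns $1.
adults_spec = [{{:"$1", :"$2"}, [{:>=, :"$2", 18}], [:"$1"]}]

# A :set table has no defined order, so sort for a stable result.
adults = table |> :ets.select(adults_spec) |> Enum.sort()
IO.inspect(adults)
#=> ["Ada", "Mia"]
```

Note how the variable names are gone, the guard reads inside-out, and nothing stops you from mistyping a `$N` placeholder.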

Additionally, the APIs and documentation around how to build and use them in :ets and tracing are very dense. Some great guides exist out there, but I want to marry an easier-to-use API with rich, first-class Elixir documentation and livebooks to make these topics more approachable.
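The tracing flavor differs slightly from the :ets one: the head matches a list of function arguments rather than a table object. As a rough sketch, :erlang.match_spec_test/3 lets you try a tracing spec against sample arguments without starting a tracer. The argument values below are invented for illustration.

```elixir
# Match calls to some 2-arity function whose first argument is an integer over 100.
# In tracing specs the head is a list of argument patterns.
trace_spec = [{[:"$1", :_], [{:is_integer, :"$1"}, {:>, :"$1", 100}], [true]}]

# The result is true when a trace message would be emitted, false otherwise.
{:ok, big_call, _flags, _warnings} =
  :erlang.match_spec_test([250, :ignored], trace_spec, :trace)

{:ok, small_call, _flags, _warnings} =
  :erlang.match_spec_test([5, :ignored], trace_spec, :trace)

IO.inspect({big_call, small_call})
```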

The Solution

I get into the why, how, and when of match specifications as they exist today in my ElixirConf 2022 talk. I also tease a library to help with them—and have been teasing it throughout its several years of intermittent development—and here we finally are.

Matcha employs Elixir’s macro system and compiler to convert familiar Elixir pattern matches into this oblique format, with rich compile-time validation, and exposes some nicer APIs for working with the resulting match specs.

The Future

In the holiday work lull, I’m able to find a little more time to pick up development again. My hope is to treat this thread as a bit of a devlog for now, and a changelog after a v0.2.0 release with a more stable API. Hopefully I can entice some of you to check things out, and even help me improve it—especially the documentation.


Follow this thread, or the forum’s #matcha topic, to stay abreast of further developments!

Most Liked Responses

christhekeele

DEVLOG.md 2023-03-24

It’s been a while! I thought I’d provide a few updates, and talk about the roadmap a little.

Updates available on the latest branch

I’ve not yet cut out v0.1.8, as I want to finish up two bugfixes first. However:

Features

  • Mirror OTP 25’s support for binary_part/2, binary_part/3, byte_size/1 in match specs
  • Got the OTP team to support ceil/1, floor/1, is_boolean/1, is_function/2, tuple_size/1 in match specs, landing in the upcoming OTP 26 release (:bangbang:)
    • already added support to Matcha for when OTP 26 is released
    • this should mean that Matcha supports all guard-safe functions in match specs written in both Elixir and Erlang!
      • except erlang’s is_record/2, which Elixir has a work-around for in OTP 26, and is a longstanding issue in erlang

Docs

  • Started livebook guides and cheatsheets for adopting Matcha
    • They are intended to be a “converting my project’s match specs to use Matcha” tutorial
    • They’re still in progress as I finalize the “high level” APIs, but early feedback is welcome
  • Added a CONTRIBUTORS.md
  • Many more functions documented

Fixes

  • Prevent Matcha from emitting warnings when not using :mnesia
  • Fix remaining issues compiling Kernel.and/2 and Kernel.or/2 when used in match spec bodies
  • Fix remaining issues compiling Kernel.is_exception/{1,2} and Kernel.is_struct/{1,2} when used in match spec bodies
  • This is… pretty much all of the known issues with the compiler resolved, except for the aforementioned remaining two I’m blocking a release on.
    • Matcha has a 1:2 code/test LoC ratio, and this is mostly centered around discovering edge cases in the compiler today, so I’m feeling pretty darn good about it!

Tests

  • Many more codepaths tested (mostly around edge cases in the compiler to discover resolved bugs)

Roadmap

v0.1.8

  • There are those two known edge-cases with spec compilation I intend to address before releasing the above progress.

v0.2.0

  • This release is when I’m declaring Matcha “ready to use”!
  • The main obstacle is fleshing out documentation. I’ve spent more time on guides than module/fn docs, and it shows.
    • one of Matcha’s most ambitious goals is explaining how/when to use match specs in an approachable fashion, so I’m happy to dwell on this.
  • I’ve also spent more time on the Elixir → ms compiler than higher-level APIs to use specs built by the compiler, so tests/documentation need to be fleshed out as these settle.

v0.3.0

  • I intend to rework Matcha’s tracing APIs to support even more use-cases in this release
  • I will end up ditching the :recon dependency for a custom implementation, for a few reasons
    • I’d like to keep Matcha dependency-less
    • :recon only supports tracing function calls safely; I’d like to support tracing send/receive events as well
    • I intend to apply :recon’s safety heuristics to these things other than function calls, so will ape a lot of the great work done there

v1.0.0

  • This is still the release where any breaking change will imply a major version bump
    • I only anticipate hesitating to publish this post v0.3.0 if the high-level trace and table APIs prove to need a little more work post-release
  • I want full documentation/test/typespec coverage before I make this release

That’s the hot tea :teacup_without_handle: on Matcha, thanks for reading!

christhekeele

DEVLOG.md 2023-10-20

This update discusses new syntactic support for nested matches I want to experiment with for Matcha!


I started developing Matcha.filter/1 because I needed it for something I wanted to build for SpawnFest. Sadly, while developing the unannounced library I wanted to release first and use in the competition, I’ve run into a limitation of matchspecs’ expressivity I’ve long aspired to overcome. I’ve suspected for a while that it’s solvable, but solving it properly in Matcha will take too much time away from the development of the depending library I’d planned on using in the competition—it’s a hard blocker. I may submit something else, though!

My consolation prize is that my remorse has fueled me to think on the problem of nested matches more, and recent changes to the Matcha compiler to support filters should give me what I need to implement it. Let’s dive into nested matches in matchspecs!


The power of patterns

Here at :tea: :tm: Matcha Incorporated :copyright:, we’re big fans of the match operator, =, which performs a pattern match.

When you’re first learning a BEAM VM language, it’s easy to think of it as just the variable binding operator. After all, these are the semantics of = most of us are familiar with coming from other languages, and the pattern variable on the left side will always match the right side, and bind the entirety of the right side to variable:

variable = {:some, %{complicated: [:data, "structure"]}}
variable
#=> {:some, %{complicated: [:data, "structure"]}}

Over time, we learn to appreciate that pattern matching can also perform destructuring as well as variable binding:

{:some, data} = variable
data
#=> %{complicated: [:data, "structure"]}

It provides a natural syntax for multiple assignment, even from deeply nested values:

{:some, %{complicated: [type, specifics]}} = variable
{type, specifics}
#=> {:data, "structure"}

But what’s really wild is that you can nest matches inside matches, to both bind variables at a shallower level of nesting, and match on data at a deeper level of nesting:

{:some, data = %{complicated: [:data, specifics]}} = variable
{data, specifics}
#=> {%{complicated: [:data, "structure"]}, "structure"}

This is often used in combination with guard expressions, so we can extract a set of data when its specifics satisfy a guard:

case variable do
  {:some, data = %{complicated: [:data, specifics]}} when is_binary(specifics)
    -> data
end
#=> %{complicated: [:data, "structure"]}

Matches inside matches inside matchspecs

Obviously, if you can do this in Elixir, I want to support it in Matcha. Extracting general data from an object in an :ets table where its specifics satisfy a certain guard is a common use-case. However, if you’ve ever spent any time studying the matchspec grammar, first off: I’m sorry.

Secondly, you might have realized that there is no direct analog of nested matching in them! Specifically, the anatomy of a MatchHeadPart does not allow you to both describe binding a term to a variable, and performing a destructuring operation on that term (that may permit a binding with a deeper nested term).

Put another way, you cannot both bind to and destructure a term in a matchspec!

To clarify, for some:

Matcha.spec do
  pattern -> ...
end

This pattern is representable:

{:some, data} -> ...

And this pattern is representable:

{:some, %{complicated: [:data, specifics]}} -> ...

But both at once are not:

{:some, data = %{complicated: [:data, specifics]}} -> ...
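For reference, here is roughly what the two representable patterns look like as raw match specs, checked with :ets.test_ms/2. This is a hand-written sketch, not necessarily what Matcha's compiler emits. In the head, each position is either a `$N` variable or a structure to destructure; there is no syntax for writing both in the same position.

```elixir
term = {:some, %{complicated: [:data, "structure"]}}

# `{:some, data} -> data`: bind the whole second element to $1.
bind_spec = [{{:some, :"$1"}, [], [:"$1"]}]
{:ok, data} = :ets.test_ms(term, bind_spec)

# `{:some, %{complicated: [:data, specifics]}} -> specifics`:
# destructure the second element, binding only the inner value to $1.
destructure_spec = [{{:some, %{complicated: [:data, :"$1"]}}, [], [:"$1"]}]
{:ok, specifics} = :ets.test_ms(term, destructure_spec)

IO.inspect({data, specifics})
```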

Today, Matcha reflects this reality:

Matcha.spec do
  {:some, data = %{complicated: [:data, specifics]}} when is_binary(specifics) -> data
end
#!> ** (Matcha.Rewrite.Error) found problems rewriting code into a match spec: when binding variables
#!> 
#!>  ({:some, data = %{complicated: [:data, specifics]}} when is_binary(specifics) -> data)
#!>    error: cannot match `data` to `%{complicated: [:data, specifics]}`: cannot use the match operator in match spec heads, except to re-assign variables to each other
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:472: Matcha.Rewrite.raise_match_in_match_error!/3
#!>     (elixir 1.15.6) lib/macro.ex:667: Macro.do_traverse/4
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:445: Matcha.Rewrite.do_rewrite_bindings/2
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:323: Matcha.Rewrite.rewrite_clause/2
#!>     (elixir 1.15.6) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:210: Matcha.Rewrite.spec/2
#!>     (matcha 0.1.10) lib/matcha/rewrite.ex:199: Matcha.Rewrite.build_spec/3
#!>     (matcha 0.1.10) expanding macro: Matcha.spec/1

Look to the Guards

Is there a way forward from this? Nope, not really.

At least, not until recently.

  • Since before I started working on Matcha, matchspec guards supported most, but not all, BIFs that are allowed in guards.
  • OTP 25 introduced support for using the BIFs :erlang.is_map_key/2 and :erlang.map_get/2 in matchspecs.
  • At my behest, in OTP 26 @jhogberg graciously added support for all the other missing guard BIFs to matchspecs, most critically including :erlang.tuple_size/1, and added tests to ensure that all guard-safe BIFs are also allowed in matchspec guards going forwards!

Fake it 'til you match it

With the added support for the guard :erlang.tuple_size/1 in matchspecs, we finally have all the tools we need to support both binding and destructuring on the same nested term. All it would take is re-implementing destructuring of composite terms into guard checks in the Matcha compiler!

Take, for example, the matchspec:

Matcha.spec do
  {:some, data = %{complicated: [:data, specifics]}}
    when is_binary(specifics) 
      -> data
end

We can’t literally describe the nested match data = %{complicated: [:data, specifics]} in today’s match specification grammar.

But what we can do is… grisly, but semantically equivalent:

Matcha.spec do
  {:some, data}
    when :erlang.is_map(data) and :erlang.is_map_key(:complicated, data)
     and :erlang.is_list(:erlang.map_get(:complicated, data))
     and :erlang.length(:erlang.map_get(:complicated, data)) == 2
     and :erlang.hd(:erlang.map_get(:complicated, data)) == :data
     and is_binary(:erlang.hd(:erlang.tl(:erlang.map_get(:complicated, data))))
      -> data
end

Rather than ask you to type all that out, it should be possible to develop a destructuring-to-guards transpiler in the Matcha compiler and do it for you! We finally have enough guards available in matchspecs that I think we can convert any arbitrary destructuring of composite terms in an Elixir match pattern into a mess of matchspec-supported :erlang guards.
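As a sketch of where such a transpiler could land, here is the same guard chain written by hand as a raw match spec and checked with :ets.test_ms/2. It assumes an OTP release recent enough to allow :erlang.is_map_key/2 and :erlang.map_get/2 in match specs, and the exact output shape is my guess rather than anything Matcha produces today.

```elixir
# Shorthand for the repeated `:erlang.map_get(:complicated, data)` call.
complicated = {:map_get, :complicated, :"$1"}

nested_spec = [
  {
    # Head: only bind `data`, no destructuring.
    {:some, :"$1"},
    # Guards: every entry must hold; together they emulate the nested pattern.
    [
      {:is_map, :"$1"},
      {:is_map_key, :complicated, :"$1"},
      {:is_list, complicated},
      {:==, {:length, complicated}, 2},
      {:==, {:hd, complicated}, :data},
      {:is_binary, {:hd, {:tl, complicated}}}
    ],
    # Body: return the bound `data`.
    [:"$1"]
  }
]

{:ok, hit} = :ets.test_ms({:some, %{complicated: [:data, "structure"]}}, nested_spec)
{:ok, miss} = :ets.test_ms({:some, %{complicated: [:data, 42]}}, nested_spec)
IO.inspect({hit, miss})
```

The first call returns the whole map, and the second returns false because the final is_binary check fails.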

On paper, this is very cool. Off paper, this is very cool but requires a whole heck of a lot of work to the compiler. So I’ll be playing with this premise more over the next few months, when I find time!

christhekeele

DEVLOG.md 2023-10-04

I thought I’d share a concept I’m finally playing with for Matcha: first-class filters!

For the sake of these snippets, assume we have:

krillin = %{name: "Krillin", age: 28, power: 1_770}
goku = %{name: "Son Goku", age: 27, power: 3_000_000}
saiyans = [krillin, goku]

:tea: Matcha Filters

This is actually one thing that prompted me to begin investigating building Matcha (:stopwatch: :eyes: over four years ago?!), something in-between a first-class match pattern and a full match specification.

Matcha Specs

For context, a Matcha.Spec is similar to a deferred case statement you can pass around as a variable, and match against when you want instead of immediately. It has native support in :ets via the :ets.select_* APIs, and Matcha makes it easy to use them against arbitrary in-memory data as well (without the nice performance you get in :ets applications, a little slower than an equivalent Enum.map and more limited).

As I dig into in my matchspec talk, a match spec is essentially a data structure that looks like this:

match_spec = [
  {pattern, guards, body},
  {pattern, guards, body},
  # ...
]

This mirrors an equivalent case statement, but without an immediate match target:

case target do
  pattern when guards -> body
  pattern when guards -> body
  # ...
end

You can execute a deferred match specification against an in-memory target like so:

Matcha.Spec.call(match_spec, target)

Most of the cool stuff with match specs comes from the fact that you can hand this match specification to :ets.select_* (with match_spec.source) and it will test every object in a table against your specification, and for any successful match, return a transformed result, much more efficiently than loading the entire table into a process’s memory and doing this all yourself with Enum.filter + Enum.map or for comprehensions.
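For a feel of what that hand-off looks like without Matcha, here is a minimal sketch using a raw match spec with :ets.select/2. The table layout ({id, map} objects) is an assumption made for this example.

```elixir
table = :ets.new(:saiyans, [:set])

:ets.insert(table, [
  {1, %{name: "Krillin", age: 28, power: 1_770}},
  {2, %{name: "Son Goku", age: 27, power: 3_000_000}}
])

# Roughly `{_id, %{name: name, power: power}} when power > 9_000 -> {name, power}`.
over_nine_thousand = [
  {
    {:_, %{name: :"$1", power: :"$2"}},
    [{:>, :"$2", 9_000}],
    # Double braces build a tuple in the body; single braces would be a function call.
    [{{:"$1", :"$2"}}]
  }
]

# Filtering and re-mapping both happen inside :ets; only results reach this process.
results = :ets.select(table, over_nine_thousand)
IO.inspect(results)
#=> [{"Son Goku", 3000000}]
```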

Matcha Patterns

Matcha also has support for a Matcha.Pattern construct. This acts like just a stand-alone pattern part of a match spec, and you can think of it as a deferred pattern match/destructuring. That is, if you have code like:

%{name: name, age: 27} = target

Then you will get a MatchError if target is not a map with the provided keys, or if target’s age does not match 27 exactly; otherwise it captures the name of only 27-year-olds in a variable called name. Matcha lets you build deferred matches like so:

match_pattern = Matcha.pattern(%{name: name, age: 27})

Matcha.Pattern.match?(match_pattern, krillin)
#=> false
Matcha.Pattern.match?(match_pattern, goku)
#=> true
Matcha.Pattern.matched_variables(match_pattern, goku)
#=> %{name: "Son Goku"}

Matcha.Pattern.matches(match_pattern, saiyans)
# => [%{name: "Son Goku", age: 27, power: 3000000}]
Matcha.Pattern.variable_matches(match_pattern, saiyans)
# => [%{name: "Son Goku"}]

There are, of course, better ways to do this in your Elixir programs with in-memory data. But :ets also lets you leverage match patterns against an entire table at once, returning only objects that match the pattern, via the :ets.match_* APIs (providing them the match_pattern.source). Matcha supports this :ets usecase with Matcha.Patterns.
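For comparison, this is a sketch of the equivalent raw match pattern handed straight to the :ets.match_* functions. Storing the maps under an integer id is an assumption of this example.

```elixir
table = :ets.new(:saiyans, [:set])

:ets.insert(table, [
  {1, %{name: "Krillin", age: 28, power: 1_770}},
  {2, %{name: "Son Goku", age: 27, power: 3_000_000}}
])

# Raw form of `{_id, %{name: name, age: 27}}`: `:_` ignores, `$1` binds.
pattern = {:_, %{name: :"$1", age: 27}}

# Whole objects that match the pattern.
objects = :ets.match_object(table, pattern)
# objects == [{2, %{age: 27, name: "Son Goku", power: 3000000}}]

# Only the bound variables, one list per matching object.
bindings = :ets.match(table, pattern)
# bindings == [["Son Goku"]]

IO.inspect({objects, bindings})
```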

Matcha Filters

Running with the above example, what if we wanted to only match people on inexact criteria? Say, people who had more than some exact quality?

Match patterns alone aren’t expressive enough to do this. Match specifications are, but they use a special syntax to do it, that we can’t really convert Elixir code into—one of the key goals of Matcha.

What we really want is to support a first-class deferred pattern when guards construct, and that’s exactly what “filters” are intended to be:

match_filter = Matcha.filter(%{name: name, power: power} when power > 9_000)

Matcha.Filter.match?(match_filter, krillin)
#=> false
Matcha.Filter.match?(match_filter, goku)
#=> true
Matcha.Filter.matched_variables(match_filter, goku)
#=> %{name: "Son Goku", power: 3000000}

Matcha.Filter.matches(match_filter, saiyans)
# => [%{name: "Son Goku", age: 27, power: 3000000}]
Matcha.Filter.variable_matches(match_filter, saiyans)
# => [%{name: "Son Goku", power: 3000000}]

The overarching goal is to:

  • Provide an API that allows using all features of :ets match specs, including :"$$" and :"$_", from syntactically valid Elixir code
  • Provide a new mode of querying :ets tables tersely when re-mapping matched objects is not a requirement, akin to a hypothetical set of :ets.filter_* functions
  • Further my Macro Crimes :tm: in my two personal projects where I convert Elixir code into both Elixir functions, and compatible :ets queries, where having first-class :ets-compatible function heads is a boon
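For readers who haven't met the two special variables mentioned in the first goal, here is a small sketch of them in raw match specs with plain :ets. The flat {name, age, power} tuple layout is chosen just for this example.

```elixir
table = :ets.new(:saiyans, [:set])
:ets.insert(table, [{"Krillin", 28, 1_770}, {"Son Goku", 27, 3_000_000}])

head = {:"$1", :"$2", :"$3"}
guard = [{:>, :"$3", 9_000}]

# :"$_" in the body returns the whole matched object, like a filter would.
whole_objects = :ets.select(table, [{head, guard, [:"$_"]}])
#=> [{"Son Goku", 27, 3000000}]

# :"$$" in the body returns every bound variable, in order, as a list.
all_bindings = :ets.select(table, [{head, guard, [:"$$"]}])
#=> [["Son Goku", 27, 3000000]]

IO.inspect({whole_objects, all_bindings})
```

Neither `$_` nor `$$` is a valid Elixir variable name, which is why they need some other spelling in an Elixir-facing API.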

Higher-level Query Support

The main reasons why I haven’t much popularized the Matcha.Pattern APIs and the higher-level Matcha.Table APIs are:

  • The Matcha.Filter support for guards is sufficiently more powerful I may deprecate Matcha.Pattern or downplay it but still support it for :ets.match_* equivalents
  • The Matcha.Table APIs may get more powerful variants that know how to navigate either patterns or filters agnostically, and I don’t want to commit to them just yet

As an example, in tandem with my Matcha.Filter experiments, I have the following code snippet working mostly as expected:

over_nine_thousand = Matcha.Table.query( 
  {_id, %{name: name, power: power}} when power > 9_000
) 
alias Matcha.Table.Query

for saiyan <- Query.where(table, over_nine_thousand) do
  IO.puts("Scanning #{saiyan.name}, #{saiyan.age} years old...")
end
for %{name: threat} <- Query.select(table, over_nine_thousand), threat == "Son Goku" do
  IO.puts("#{threat}'s power level is OVER 9_000!")
end

All still a rough work in progress, but interested in early feedback!

- Much :tea:, Chris
