I’ve been digging more into the Registry module and I’m working on the Registry.select/2 function in particular. To re-state the docs, the 2nd argument is a list of three-element tuples, in the shape of [{match_pattern, guards, body}].
The body is something like a template, so it makes sense that it is bound tightly with the match_pattern. So the only variables you have available are the ones that were assigned via a match. This is pretty similar to some parts of regular expressions.
For example, following the “SELECT ALL” example from the docs:
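A sketch of that docs example, assuming a registry named Registry.SelectAllTest (the name that appears in the error output further down) and two illustrative keys, "hello" and "world":

```elixir
# Start a unique-keys registry and register the current process under two keys.
{:ok, _} = Registry.start_link(keys: :unique, name: Registry.SelectAllTest)
{:ok, _} = Registry.register(Registry.SelectAllTest, "hello", :value)
{:ok, _} = Registry.register(Registry.SelectAllTest, "world", :value)

# match_pattern binds $1/$2/$3; no guards; the body returns a double-wrapped tuple.
select_all = [{{:"$1", :"$2", :"$3"}, [], [{{:"$1", :"$2", :"$3"}}]}]

entries = Registry.select(Registry.SelectAllTest, select_all) |> Enum.sort()
# entries is [{"hello", self(), :value}, {"world", self(), :value}]
```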
However, one thing confuses me: why do the tuples have an extra pair of curly braces? I would expect the body to look like this instead: [{:"$1", :"$2", :"$3"}] – just a tuple. But this results in an ArgumentError:
iex> Registry.select(Registry.SelectAllTest, [{{:"$1", :"$2", :"$3"}, [], [{:"$1", :"$2", :"$3"}]}])
** (ArgumentError) errors were found at the given arguments:
* 2nd argument: not a valid match specification
The extra set of braces is not required if we want to format the results as a map, for example:
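A minimal sketch of such a map-shaped body. The registry name Registry.SelectMapTest and the map keys :key, :pid and :value are made up for illustration:

```elixir
# Hypothetical registry name, used only for this example.
{:ok, _} = Registry.start_link(keys: :unique, name: Registry.SelectMapTest)
{:ok, _} = Registry.register(Registry.SelectMapTest, "hello", :value)

# The body is a plain map literal: maps cannot be mistaken for function calls,
# so no extra wrapping is needed.
as_maps = [{{:"$1", :"$2", :"$3"}, [], [%{key: :"$1", pid: :"$2", value: :"$3"}]}]

[%{key: "hello", pid: pid, value: :value}] =
  Registry.select(Registry.SelectMapTest, as_maps)

true = pid == self()
```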
The TL;DR is that in this context, tuples need to be “escaped” by being double-wrapped to mean “literal tuples” to avoid conflicts with “function call” syntax.
The full explanation dives into the nuances of the “match specification” grammar that this Registry feature relies on, which I’ve studied a lot.
Match Specification Grammar
The Registry.select/2 function accepts an Erlang match specification, which is effectively a special-purpose AST for representing Erlang code that can be executed very efficiently against things like ETS tables and tracing messages. This Erlang-ish AST is composed of just tuples, lists, and atoms.
In your match specifications, you can invoke the functions allowed in Erlang guards, in both the “guards” and the “bodies” of the spec. The grammar calls such a function call a “MatchCondition” when it is used as a guard, or a “ConditionExpression” when it is nested in a guard or used in a body. In both cases a literal tuple represents the call: the function name as an atom, followed by any arguments.
This means you can make a call to Kernel.is_integer/1 in your match specifications by writing {:is_integer, 2.1}.
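As a small sketch of a tuple-as-call in the guard position, the spec below can be checked with :ets.test_ms/2, which runs a match spec against a single tuple without needing a table. The sample entries and the atom standing in for a pid are made up for illustration:

```elixir
# Guard: only match entries whose value ($3) is an integer; body: return the key ($1).
spec = [{{:"$1", :"$2", :"$3"}, [{:is_integer, :"$3"}], [:"$1"]}]

# The value 42 passes the guard, so the body result (the key) comes back.
{:ok, "hello"} = :ets.test_ms({"hello", :fake_pid, 42}, spec)

# A non-integer value fails the guard; test_ms reports a non-match as false.
{:ok, false} = :ets.test_ms({"hello", :fake_pid, :not_an_integer}, spec)
```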
There is a similar syntax to make “fake” function calls in bodies, that drive special behaviour when using a match specification for tracing, that also uses tuples-as-calls:
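One such action, sketched under the assumption that {:return_trace} is a representative example: it is written as a one-element tuple in the body of a tracing spec. :erlang.match_spec_test/3 can check that the spec is accepted in the :trace context, where the term to match is an argument list:

```elixir
# Match any argument list, and ask the tracer to also emit a return trace.
trace_spec = [{:_, [], [{:return_trace}]}]

# In the :trace context a successful test returns a four-tuple:
# {:ok, result, flags, warnings}.
{:ok, _result, flags, _warnings} =
  :erlang.match_spec_test([:some_arg], trace_spec, :trace)
```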
This syntax conflicts with tuple literals, so literal tuples must be “escaped” where function calls are allowed (in “guards” and “bodies”) by double-wrapping them:
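A quick way to see the difference, sketched with :ets.test_ms/2 and a made-up entry (atoms stand in for the pid and value):

```elixir
entry = {"key", :fake_pid, :value}

# Double-wrapped: the inner tuple is read as a literal tuple to construct.
escaped = [{{:"$1", :"$2", :"$3"}, [], [{{:"$1", :"$2", :"$3"}}]}]
{:ok, {"key", :fake_pid, :value}} = :ets.test_ms(entry, escaped)

# Single-wrapped: the tuple is read as a call to a function named :"$1",
# so the spec is rejected and test_ms returns an error tuple.
unescaped = [{{:"$1", :"$2", :"$3"}, [], [{:"$1", :"$2", :"$3"}]}]
{:error, _reasons} = :ets.test_ms(entry, unescaped)
```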
So in a way, tuple literals in Elixir would also overlap with the Elixir AST’s three-tuple “local call” syntax, but the tokenizer “escapes” them for us before we get to the AST by wrapping them as a local call, to deal with them specially later, during AST expansion. Erlang matchspecs don’t really get the same tokenizing pass to save us from the ambiguity, though, since they are already in-memory AST data structures.
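The Elixir side of that comparison can be seen with quote. This is only a sketch, and foo is a hypothetical function name used purely for illustration:

```elixir
# A three-element tuple literal is represented as a call to :{} in the AST...
{:{}, _meta, [1, 2, 3]} = quote do: {1, 2, 3}

# ...so it cannot be confused with an actual local call, which has the same
# three-tuple shape but carries the function name in first position.
{:foo, _meta, [1, 2]} = quote do: foo(1, 2)

# Two-element tuples are the exception: they stand for themselves in the AST.
{1, 2} = quote do: {1, 2}
```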
Matcha lets you write your match specs as pure Elixir. It double-wraps tuples for you in the correct contexts, and does other busy-work like handling macro expansion, turning Elixir guards into Erlang guards, emitting correct Elixir compiler warnings for unused variables in your matchspec code, and all those nice things to make writing match specifications feel like writing first-class Elixir.
Because Registry's :ets table entries are always three-tuples ({key, pid, value}), this is equivalent to a “SELECT ALL” operation.
An even more intuitive approach to matching these entries would be to match on the expected three-tuple structure, but return the entire match. Matchspecs have a special shorthand :"$_" syntax for this, which Matcha is aware of:
spec = Matcha.spec do
  entry = {key, pid, value} -> entry
end
# warning: variable "key" is unused (if the variable is not meant to be used, prefix it with an underscore)
# warning: variable "pid" is unused (if the variable is not meant to be used, prefix it with an underscore)
# warning: variable "value" is unused (if the variable is not meant to be used, prefix it with an underscore)
#=> #Matcha.Spec<[{{:"$1", :"$2", :"$3"}, [], [:"$_"]}]>
These are nice Elixir warnings about our code we could resolve with underscored variable names, like:
spec = Matcha.spec do
  entry = {_key, _pid, _value} -> entry
end
#=> #Matcha.Spec<[{{:"$1", :"$2", :"$3"}, [], [:"$_"]}]>
I talk a bit about why matchspecs are the way they are, why :ets exists, why only some functions are allowed in guards, and how to tackle reading that giant blob of grammar, in this presentation. I explicitly call out Registry as the only place they are used in Elixir APIs today.