Hereby I’d like to announce a new library I’ve been working on over the last few months: XPeg.
XPeg is a pure Elixir pattern matching library. It provides macros that compile a PEG grammar into an Elixir function, which parses a string and captures selected parts of the input. PEGs are not unlike regular expressions, but they offer more power and flexibility and have fewer ambiguities. Wikipedia has more background on PEGs.
Some use cases where XPeg is useful are configuration or data file parsers, robust protocol implementations, input validation, and lexing of programming languages or domain-specific languages.
XPeg allows mixing the grammar definition with Elixir functions that act on the matched data, which makes for powerful and concise AST-generating parsers.
Example
Below is a simple grammar definition that parses a comma-separated list of key/value pairs into a list of tuples:
    # peg is a macro, so the module has to be required before it is used
    require Xpeg

    p = Xpeg.peg :dict do
      :dict <- :pair * star("," * :pair) * !1
      :pair <- :word * "=" * :number * fn [a, b | cs] -> [{b, a} | cs] end
      :word <- cap(+{'a'..'z'})
      :number <- cap(+{'0'..'9'}) * fn [v | cs] -> [String.to_integer(v) | cs] end
    end
This grammar consists of the following rules:
- The top level rule :dict matches one :pair, followed by zero or more instances of a "," followed by a :pair. The trailing !1 only succeeds at the end of the input, so the whole string has to match.
- The :pair rule matches a :word, followed by an "=" and a :number.
- The :word rule matches one or more characters from the set {'a'..'z'}.
- The :number rule matches one or more characters from the set {'0'..'9'}.
Some rules are followed by Elixir functions that convert or transform the captured data at parse time, producing the desired AST.
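To make the role of these functions more concrete, here is a small sketch in plain Elixir that does not use XPeg at all. The two functions are copied from the grammar. The capture lists fed into them are my assumption: judging by the patterns [v | cs] and [a, b | cs], captures seem to be kept in a list with the most recent one at the head.

```elixir
# Plain Elixir, no XPeg needed. The functions come from the grammar above;
# the capture lists are assumed, with the newest capture at the head.
to_number = fn [v | cs] -> [String.to_integer(v) | cs] end
to_pair = fn [a, b | cs] -> [{b, a} | cs] end

# Assumed state after :word and :number have matched "grass=4"
captures = ["4", "grass"]

# The :number function converts the newest capture to an integer
captures = to_number.(captures)
# captures is now [4, "grass"]

# The :pair function folds the two newest captures into one tuple
captures = to_pair.(captures)
# captures is now [{"grass", 4}]

IO.inspect(captures)
```

Each function only touches the head of the list and passes the rest (cs) through untouched, so captures from earlier pairs survive.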
The grammar can be matched against the subject string using the Xpeg.match() function:
Xpeg.match(p, "grass=4,horse=1,star=2")
resulting in the following output:
[{"star", 2}, {"horse", 1}, {"grass", 4}]
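Note that the pairs come out newest first, so the last pair in the input is at the head of the list. If input order or key lookup matters, the list can be post-processed with the standard library. The sketch below starts from the output shown above as a literal and assumes that list is what you have in hand; the variable names are made up for illustration.

```elixir
# The output shown above, written as a literal
result = [{"star", 2}, {"horse", 1}, {"grass", 4}]

# Restore the order in which the pairs appeared in the input
in_order = Enum.reverse(result)
# [{"grass", 4}, {"horse", 1}, {"star", 2}]

# Or build a map for lookup by key
lookup = Map.new(result)
# %{"grass" => 4, "horse" => 1, "star" => 2}

IO.inspect(in_order)
IO.inspect(Map.fetch!(lookup, "horse"))
```

The reversal could also be done inside the grammar, but doing it once on the final list keeps the grammar itself unchanged.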
Below are some links to more elaborate examples from the GitHub repository: