Hi all. I’m no stranger to parser combinators (see my long-abandoned pegjs for Erlang), but there’s an issue with NimbleParsec that I can’t seem to understand.
A very common thing to do in nearly every parser is to define skippable whitespace/blankspace (among many other common reusable combinators).
Example (from my own pegjs parser definition):
Blankspace
  = (WhiteSpace / LineTerminatorSequence / Comment)*
WhiteSpace
  = "\t"
  / "\v"
  / "\f"
  / " "
  / "\u00A0"
  / "\uFEFF"
  / Zs
// https://www.compart.com/en/unicode/category/Zs
Zs = [\u0020\u00A0\u1680\u2000-\u200A\u202F\u205F\u3000]
// LineTerminatorSequence and Comment omitted for brevity
And then this would be used, well, everywhere:
// A rule is identifier=value
// There can be any amount of whitespace in between
Rule
  = IdentifierName
    Blankspace
    (StringLiteral Skippable)?
    "="
    Blankspace
    Expression
    EOS
Now, the trouble starts when converting this to NimbleParsec.
The first part is easy:
zs = utf8_char([0x0020, 0x00A0, 0x1680, 0x2000..0x200A, 0x202F, 0x205F, 0x3000])
whitespace_character =
  choice([
    # tab, vertical tab, form feed, space
    ascii_char([?\t, ?\v, ?\f, 32]),
    utf8_char([0x00A0, 0xFEFF]),
    zs
  ])
blankspace = choice([whitespace_character, line_terminator_sequence]) |> repeat()
But then using it… how?
This will not work:
rule = repeat(ascii_char(not: 32)) |> blankspace
** (CompileError) undefined function blankspace/1
You can wrap it in an additional repeat or optional, but this is extremely redundant and code readability suffers:
## We have already defined blankspace as optional in its own definition
rule = repeat(ascii_char(not: 32)) |> optional(blankspace)
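As far as I can tell, the error comes from the pipe itself: `|> blankspace` is rewritten into the call `blankspace(...)`, and `blankspace` here is a variable holding a combinator, not a function. One possible way around it, as a minimal sketch, is NimbleParsec's `concat/2`, which joins two combinators without adding another `repeat` or `optional`. The module name `BlankspaceDemo`, the trimmed-down whitespace set, the `ignore/1` call, and the `Mix.install` line are assumptions made only to keep the example runnable as a standalone script.

```elixir
# Assumption: run as a standalone script (elixir demo.exs) with network access.
Mix.install([{:nimble_parsec, "~> 1.0"}])

defmodule BlankspaceDemo do
  import NimbleParsec

  # Trimmed-down whitespace set for the sketch (Zs and line terminators left out).
  whitespace_character = ascii_char([?\t, ?\v, ?\f, ?\s])

  # Zero or more whitespace characters; ignore/1 drops them from the result.
  blankspace = whitespace_character |> repeat() |> ignore()

  # concat/2 appends an already-built combinator to the one on its left.
  rule = repeat(ascii_char(not: 32)) |> concat(blankspace)

  defparsec(:rule, rule)
end

# The first element is :ok when the input matches; the second holds the parsed characters.
IO.inspect(BlankspaceDemo.rule("abc   "))
```

The design choice here is to keep `blankspace` as a plain value and let `concat/2` do the composition, so the "zero or more" logic stays in one place.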
I’ve tried to convert it to a function:
def zs do
  utf8_char([0x0020, 0x00A0, 0x1680, 0x2000..0x200A, 0x202F, 0x205F, 0x3000])
end

def whitespace_character do
  choice([
    # tab, vertical tab, form feed, space
    ascii_char([?\t, ?\v, ?\f, 32]),
    utf8_char([0x00A0, 0xFEFF]),
    zs()
  ])
  |> label("whitespace")
end

def blankspace do
  choice([whitespace_character()]) |> repeat()
end
rule = repeat(ascii_char(not: 32)) |> blankspace()
** (CompileError) undefined function blankspace/1
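Two things seem to be going on here. First, `blankspace` is defined with arity 0, while `|> blankspace()` calls `blankspace/1`. Second, a function defined with `def` cannot be invoked from the body of the same module while that module is still being compiled. A sketch of one arrangement that may sidestep both: put the helpers in a separate module and let each one take the combinator to extend as its first argument, defaulting to `empty()`. The module names `BlankspaceHelpers` and `RuleParser` are hypothetical, the whitespace set is shortened, and `Mix.install` is there only to make the script standalone.

```elixir
# Assumption: run as a standalone script (elixir demo.exs) with network access.
Mix.install([{:nimble_parsec, "~> 1.0"}])

defmodule BlankspaceHelpers do
  import NimbleParsec

  # Appends a single whitespace character to the given combinator.
  def whitespace_character(combinator \\ empty()) do
    label(
      combinator,
      choice([
        ascii_char([?\t, ?\v, ?\f, ?\s]),
        utf8_char([0x00A0, 0xFEFF])
      ]),
      "whitespace"
    )
  end

  # Appends zero or more whitespace characters to the given combinator,
  # so it can sit on the right-hand side of a pipe.
  def blankspace(combinator \\ empty()) do
    repeat(combinator, whitespace_character())
  end
end

defmodule RuleParser do
  import NimbleParsec
  # The helpers are compiled before this module, so they can be called here.
  import BlankspaceHelpers

  rule = repeat(ascii_char(not: 32)) |> blankspace()

  defparsec(:rule, rule)
end

IO.inspect(RuleParser.rule("abc \t "))
```

This mirrors how NimbleParsec's own combinators are shaped: the first argument is the combinator built so far, which is what makes them pipeable.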
I’ve tried converting rule to a function as well, but nothing works.
So now I’m scratching my head and hoping that the collective wisdom of the Elixir Forum will help me.