Best way to build a parser

In my (pretty limited) experience, I would go with:

  • leex and yecc if your text is rigidly structured and follows some sort of grammar (e.g. it’s a DSL, config file, source code, etc.)
  • NimbleParsec if your text is semi-structured (e.g. records where the schema wasn’t enforced and there are deviations) and you want to extract structured information from it

Basically, if your “rules” have a bunch of “that’s not always true” cases (e.g. “an email address usually follows the person’s name, but sometimes there’s a phone number instead”), I would go with NimbleParsec because it’s easier to manage the complexity (by combining sub-parsers).
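To make the "combining sub-parsers" idea concrete, here is a rough sketch of the name-then-email-or-phone case. The module name `ContactParser`, the character sets, and the 7-character minimum for phone numbers are all made up for illustration, and the `Mix.install` line assumes you run this as a standalone script (in a project you would add `nimble_parsec` to your deps instead).

```elixir
# Assumption: run as a script; inside a Mix project, list :nimble_parsec in deps instead.
Mix.install([{:nimble_parsec, "~> 1.0"}])

defmodule ContactParser do
  import NimbleParsec

  # Sub-parser: a single-word name made of ASCII letters (simplified on purpose).
  name =
    ascii_string([?a..?z, ?A..?Z], min: 1)
    |> unwrap_and_tag(:name)

  # Sub-parser: a very loose email shape, local@domain, joined back into one string.
  email =
    ascii_string([?a..?z, ?A..?Z, ?0..?9, ?., ?_], min: 1)
    |> string("@")
    |> ascii_string([?a..?z, ?A..?Z, ?0..?9, ?., ?-], min: 1)
    |> reduce({Enum, :join, []})
    |> unwrap_and_tag(:email)

  # Sub-parser: digits and dashes, at least 7 characters (arbitrary threshold).
  phone =
    ascii_string([?0..?9, ?-], min: 7)
    |> unwrap_and_tag(:phone)

  # The "usually an email, but sometimes a phone number" rule is one choice/2 call.
  defparsec :contact,
            name
            |> ignore(string(" "))
            |> choice([email, phone])
            |> eos()
end

# The first two elements of the result tuple carry the status and the parsed fields.
{:ok, fields, _rest, _context, _line, _offset} = ContactParser.contact("Bob 555-0100")
IO.inspect(fields)
```

Handling another deviation (say, a fax number) would mean writing one more small sub-parser and adding it to the `choice/2` list, which is where I find this approach easier to manage than a growing grammar.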

Of course, you can also handle the variety of corner cases in the grammar given to leex and yecc, but in my experience the grammar size explodes pretty quickly, which makes it hard to keep the whole thing in your head.

Also, if your case is simple, you could get away with using just Elixir’s pattern matching on strings (e.g. "foo " <> rest_of_string). Here’s a description of the idea: https://pragdave.me/blog/2014/02/12/pattern-matching-and-parsing.html
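As a minimal sketch of the pattern-matching approach (the module name `CommandParser` and the commands are hypothetical, not taken from the linked post), each function clause matches a literal prefix and binds whatever follows:

```elixir
defmodule CommandParser do
  # Each clause matches a fixed prefix; `rest` is bound to the remainder of the string.
  def parse("add " <> rest), do: {:add, String.trim(rest)}
  def parse("remove " <> rest), do: {:remove, String.trim(rest)}
  def parse("list"), do: :list

  # Fallback clause for any input the clauses above do not recognize.
  def parse(other), do: {:error, {:unrecognized, other}}
end

IO.inspect(CommandParser.parse("add milk"))
```

This only works when the part you match on is a known prefix of fixed length, so it stops being practical once the input needs lookahead or nesting.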
