Md - Stream-aware markdown parser with custom syntax setting

Md, a library to parse markdown and markdown-like syntaxes, has been released.

Fully customizable syntax, blazingly fast (5× faster than earmark), opinionated. It does not fully support CommonMark (and it never will), but it covers the needs of the average user and makes it easy to introduce custom markdown-like syntax.

Parser for Markdown Family


Nice. Does it generate an AST that is compatible with either earmark or floki?

It uses xml_builder for generation, hence it differs from earmark AST (I believe floki leverages the latter.)

OTOH, traversing the result and mapping it to earmark AST should be trivial.
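A minimal sketch of such a mapping, not taken from Md itself. It assumes the parsed result is a tree of `{tag, attrs, content}` triples (the shape xml_builder works with) and targets Earmark's `{tag, attrs, children, meta}` quadruples. The module `AstSketch` and the function `to_earmark/1` are hypothetical names used for illustration only.

```elixir
defmodule AstSketch do
  # Hypothetical helper: converts {tag, attrs, content} triples
  # (assumed input shape) into Earmark-style quadruples.

  # Text leaves are kept as they are.
  def to_earmark(text) when is_binary(text), do: text

  # A list of nodes is converted element by element.
  def to_earmark(nodes) when is_list(nodes), do: Enum.map(nodes, &to_earmark/1)

  # An element becomes {tag, attrs, children, meta} with string tag names,
  # a list of string pairs for attributes, and an empty meta map.
  def to_earmark({tag, attrs, content}) do
    children = content |> List.wrap() |> to_earmark()
    {to_string(tag), normalize_attrs(attrs), children, %{}}
  end

  # Attributes may be nil, a map, or a keyword list.
  defp normalize_attrs(nil), do: []

  defp normalize_attrs(attrs) do
    Enum.map(attrs, fn {key, value} -> {to_string(key), to_string(value)} end)
  end
end

AstSketch.to_earmark({:p, nil, ["Hello ", {:a, %{href: "https://example.com"}, "world"}]})
```

If the real output carries extra node shapes (comments, raw strings with metadata, and so on), each would need its own clause.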

I am using Earmark to parse user input then do a sanitizing run on the resulting AST to remove unwanted artifacts. It seems like with this library I may be able to define the subset that I allow precisely and get rid of the sanitizing run.

Yes, you can explicitly allow whatever you want to be parsed as markdown, including but not limited to the markdown syntax itself. E.g., magnet allows declaring a syntax for @handles and/or #hashtags.


I am reading your code, and it is very impressive. It is basically a framework for building ad-hoc parsers for any syntax. For example, you could have a syntax highlighter for any programming language. A server-side, Elixir-based syntax highlighter would be very sweet. I long for the day when I can remove prismjs from my JS bundle. Keep it up!


Thanks, but nope. For that, one should use nimble_parsec or any other parser combinator.

I am after another usage: custom simple markup parsers, easily covering needs like Slack in-place markup, user-defined markup, etc.

Using this as a language grammar parser is like parsing HTML with regular expressions :slight_smile:

prismjs is built on regular expressions, and it is good enough for syntax highlighting. Emacs’s syntax highlighters are also basically regular expressions.

There's already makeup (GitHub: elixir-makeup/makeup), a syntax highlighter for Elixir inspired by Pygments :slight_smile:


The problem is that with makeup I need to write some serious code to support a new language, while in prismjs or Emacs I only need to write one file of ~50 lines, in about an hour, to get a passable highlighter for a language.

Looks awesome! I enjoyed your blog post too :slight_smile:


Is there an easy way to pass along a custom struct through the parser? Md provides various hooks into the parsing flow; it would be nice to embed and mutate some custom state. For example, I may want to do a word count for certain elements.

I can achieve this goal by walking the AST after parsing, but that feels awkward.
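For reference, the post-parse walk could look roughly like the sketch below. It is plain Elixir and does not call Md; it assumes the AST is a tree of `{tag, attrs, content}` triples, and `WordCount.count/2` is a hypothetical name.

```elixir
defmodule WordCount do
  # Hypothetical helper: counts whitespace-separated words in all text
  # nested under elements with the given tag.
  def count(ast, tag), do: walk(ast, tag, false)

  # Text contributes words only when it sits inside a matching element.
  defp walk(text, _tag, inside?) when is_binary(text) do
    if inside?, do: text |> String.split() |> length(), else: 0
  end

  # Lists of nodes are summed up.
  defp walk(nodes, tag, inside?) when is_list(nodes) do
    nodes |> Enum.map(&walk(&1, tag, inside?)) |> Enum.sum()
  end

  # Descend into an element, remembering whether a matching tag was seen.
  defp walk({node_tag, _attrs, content}, tag, inside?) do
    walk(List.wrap(content), tag, inside? or node_tag == tag)
  end
end

ast = [
  {:h1, nil, "Hello big world"},
  {:p, nil, ["Some ", {:b, nil, "bold text"}, " here"]}
]

WordCount.count(ast, :p)
```

This works, but it is a second pass over a tree the parser has already visited once, which is the awkward part.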

You can pass a State to Md.parse/2 as the second parameter. Starting with v0.7.0, it has a payload key.

All the listener functions can return a tuple {:update, new_state} to update the state.
