ModestEx - Pipeable transformations on html strings (with CSS selectors)

Hello!

I just published my first draft of ModestEx, an Elixir/Erlang binding to lexborisov’s Modest library.

Modest is a fast HTML renderer implemented as a pure C99 library with no outside dependencies.

ModestEx exposes features for pipeable transformations on HTML strings with CSS selectors, such as find(), prepend(), append() and replace().

iex> ModestEx.find("<p><a>Hello</a> World</p>", "p a")
{:ok, "<a>Hello</a>"}

iex> ModestEx.serialize("<div>Hello<span>World")
{:ok, "<html><head></head><body><div>Hello<span>World</span></div></body></html>"}

The binding is implemented as a C-Node, following the excellent example in @Overbryd's package nodex. If you want to learn how to set up bindings to C/C++, you should definitely check it out.
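For readers who have not used a C-Node before: from the Elixir side it looks like any other node, and you talk to it by sending a message to a `{name, node}` tuple and waiting for the reply. The snippet below is only a minimal sketch of that request/reply shape. The module name, the message format and the `:fake_worker` name are made up for illustration and are not how nodex or ModestEx structure their messages. A local Elixir process stands in for the C side so the snippet runs on its own.

```elixir
defmodule CNodeCallSketch do
  # Send a request to a registered name on a node and wait for the reply.
  # With a real C-Node, `node_name` would be the node name of the C program.
  # The {from, ref, request} message shape is an assumption for this sketch.
  def call(name, node_name, request, timeout \\ 1_000) do
    ref = make_ref()
    send({name, node_name}, {self(), ref, request})

    receive do
      {^ref, reply} -> {:ok, reply}
    after
      timeout -> {:error, :timeout}
    end
  end
end

# Stand-in for the C worker: a local process that echoes one request back.
worker =
  spawn(fn ->
    receive do
      {from, ref, request} -> send(from, {ref, {:echo, request}})
    end
  end)

Process.register(worker, :fake_worker)

result = CNodeCallSketch.call(:fake_worker, node(), {:find, "<p><a>Hello</a></p>", "p a"})
IO.inspect(result)
```

The unique reference lets the caller match only the reply that belongs to its own request, and the timeout keeps the caller from blocking forever if the worker is not running.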

Before that, I experimented with a lot of other HTML parser libraries like gumbo-parser, gumbo-query and GQ. I even implemented a binding package called gumbo_query_ex.
However, Modest is currently the most active and most promising of them.

This project is under development!
Stay tuned for more features like ModestEx.remove, ModestEx.prepend and ModestEx.append.

Tell me what you think :relaxed:

Best, F34nk


Ah, it looks like a C-based variant of Meeseeks, except it can also create HTML, quite nice! :slight_smile:


Modest is the full project that Alexander Borisov’s (lexborisov) myhtml is a part of; myhtml also has an Elixir wrapper. Modest is closer to being a C variant of… maybe Servo? It’s larger in scope than just parsing HTML.

Everything I’ve seen suggests that Modest (and probably ModestEx) will be fast, with low resource usage. That’s pretty great.


Thanks guys!

I have a use case where a request reads HTML from a database, and before rendering I need to make some changes to the HTML string. In my case the HTML is quite large and the changes are quite extensive.

The idea for ModestEx is to implement a set of features that just do transformations on an HTML string. Each transformation will be implemented in C.

Something like:

result = ModestEx.find("<p><a>Hello</a> World</p>", "p a")
|> ModestEx.attribute("href", "https://elixir-lang.org")

will return:

{:ok, "<a href=\"https://elixir-lang.org\">Hello</a>"}

… ready to render in a template.

Or you could also serialize it:

ModestEx.serialize(result)

and return:

{:ok, "<html><head></head><body><a href=\"https://elixir-lang.org\">Hello</a></body></html>"}

Which is already a (more or less) valid page!
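Since each call returns an `{:ok, html}` tuple, one way to chain several steps from plain Elixir is a `with` block that unwraps each result and stops at the first error. The sketch below only shows that pattern. `find_stub/2` and `set_href_stub/2` are hypothetical stand-ins built on plain string functions, not ModestEx calls, and the `{:error, reason}` shape is an assumption.

```elixir
defmodule PipelineSketch do
  # Hypothetical stand-in for a "find" step: returns the fragment if present.
  def find_stub(html, fragment) do
    if String.contains?(html, fragment) do
      {:ok, fragment}
    else
      {:error, :not_found}
    end
  end

  # Hypothetical stand-in for an "attribute" step on a bare <a> fragment.
  def set_href_stub("<a>" <> rest, url), do: {:ok, ~s(<a href="#{url}">) <> rest}
  def set_href_stub(_other, _url), do: {:error, :not_an_anchor}

  # Run both steps in order, stopping at the first {:error, reason}.
  def transform(html, fragment, url) do
    with {:ok, found} <- find_stub(html, fragment),
         {:ok, updated} <- set_href_stub(found, url) do
      {:ok, updated}
    end
  end
end

IO.inspect(PipelineSketch.transform("<p><a>Hello</a> World</p>", "<a>Hello</a>", "https://elixir-lang.org"))
IO.inspect(PipelineSketch.transform("<p>World</p>", "<a>Hello</a>", "https://elixir-lang.org"))
```

When a step fails, `with` returns the non-matching value unchanged, so the caller sees the original `{:error, reason}` without any extra handling.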

Of course, if you need further decoding, htmlex, floki or Meeseeks are great!
I see ModestEx as a useful addition to the landscape of HTML tools in Elixir.


For sure, I have been very hesitant to try adding transformations to Meeseeks, so I’m glad somebody’s doing it. :slight_smile:


Oooo, that’s fascinating!

Yeah this looks very useful! :slight_smile:

Heh, yeah it’s quite a thing to tackle. ^.^


Hey @mischov @OvermindDL1

I just published ModestEx v0.0.2-dev.

Thanks again for your input. It’s a lot clearer now what the main strength of the library actually is!

I added two new features, ModestEx.get_attribute and ModestEx.set_attribute.

And you can actually pipe them together:

iex> ModestEx.find("<p><a>Hello</a><a>World</a></p>", "p a") |> 
...> ModestEx.set_attribute("href", ["https://elixir-lang.org", "https://google.de"])
["<html><head></head><body><a href=\"https://elixir-lang.org\">Hello</a></body></html>", "<html><head></head><body><a href=\"https://google.de\">World</a></body></html>"]

Wohoo! This is great news :slight_smile:

Will we get Elixir-based end-to-end headless browser testing soon? :smiley:

Thanks for the reference and I am delighted to see a binding to Modest.

I have a few use cases where I will get back to it for sure.


Hey guys,

The first major release is coming soon, and I hope to publish it before ElixirConf EU in April.

I also implemented a new CSS selector for :contains(text) in Modest PR#42.

I’ll keep you updated!


Release v1.0.0

This release is stable.

def deps do
  [
    {:modest_ex, "~> 1.0.0"}
  ]
end

A total of 16 features are implemented. See the complete feature list.

A total of 38 selector patterns are implemented (including the custom selector :contains(text)).
See the complete list of supported CSS selectors.

The package includes all binding code under the folder target/modest_worker.
All Modest-related features are implemented in a single C library called modest_html.

This way, all features are tested in a C environment using CMake/CTest, with memory tracking enabled through a library called dmt.

Please feel invited to check it out :relaxed:

Best, F34nk
