I’m using the Floki HTML parser as part of an application. I use it for rewriting custom attributes on <a>
tags in HTML fragments. For that purpose it works great.
I recently needed to support something like Phoenix function components in my app, and I borrowed the <.tag_name>
format.
As long as I only supported self-contained “component tags”, all was well.
However, I later modified my parser to support container component tags <.tag_name>...</.tag_name>
and ended up with a problem that I have traced back to the Floki pre-processing stage.
In short, Floki (or MochiWeb) is eating the closing </.tag_name>
so that it isn’t in the output when it hits my parser (which then barfs on the lack of a closing tag).
Using Floki on a self-contained “component tag” is fine: <li><.foo />Bar</li>
iex(7)> Floki.parse_fragment("<li><.foo />Bar</li>")
{:ok, [{"li", [], ["<.foo />Bar"]}]}
But on a container tag it does not work: <li><.foo>Bar</.foo>Baz</li>
iex(8)> Floki.parse_fragment("<li><.foo>Bar</.foo>Baz</li>")
{:ok, [{"li", [], ["<.foo>Bar", "Baz"]}]}
iex(9)>
The closing </.foo>
is not in the output.
I realise that <.
is not legal HTML, so I have every sympathy with Floki for not doing what I want. But it seems pretty close, and I wonder whether it could be made to work. Unfortunately, it looks like Floki delegates parsing to MochiWeb, which is a mass of (to me) incomprehensible Erlang code.
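One possible workaround, as a minimal sketch that has not been checked against Floki: rewrite the <.name> and </.name> tags into legal custom-element names before calling Floki, then map them back in the component parser. The module name `ComponentTags`, the function `encode/1`, and the `x-component-` prefix are all hypothetical, and whether Floki then keeps the closing tag is an assumption that would need testing.

```elixir
defmodule ComponentTags do
  # Hypothetical helper, not part of Floki or MochiWeb.
  # Rewrites "<.name" and "</.name" into "<x-component-name" and
  # "</x-component-name" so the fragment contains only legal tag names.
  def encode(html) when is_binary(html) do
    # Group 1 captures an optional "/" (closing tag), group 2 the tag name.
    Regex.replace(~r{<(/?)\.([a-zA-Z_][a-zA-Z0-9_]*)}, html, fn _match, slash, name ->
      "<#{slash}x-component-#{name}"
    end)
  end
end

# The container example from above, rewritten before it reaches the parser:
IO.puts(ComponentTags.encode("<li><.foo>Bar</.foo>Baz</li>"))
# prints: <li><x-component-foo>Bar</x-component-foo>Baz</li>
```

The rewritten string could then be passed to Floki.parse_fragment/1, with the component parser matching on the `x-component-` prefix instead of the leading dot. This is a plain text substitution, so it would also rewrite a literal "<.name" that appears inside text or attribute values.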
Anyone have any hope for me?