I have been looking at the available Elixir/Erlang libraries for working with XML and cannot seem to find something that can do everything Nokogiri (Ruby) does.
I have narrowed it down to sweet_xml for parsing XML and erlsom for validating XML. I could use xml_builder, or even erlsom, to build XML. I have not found a library other than erlsom that does all three: parsing, building, and validating. I am also an Elixir newbie, so I could be missing something.
What would you all suggest? Using a combination of libraries as mentioned above or is there something else out there I’m missing?
I had a similar issue. I need to parse complicated XML into a hierarchy of structs (to provide a good abstraction and functions on them), and I also need to update the XML (which really means creating a new copy with the changes applied).
I’m using sweet_xml to parse into that hierarchy, but I think I will have to deal with the underlying Erlang library anyway. That would probably be the xmerl format rather than erlsom, since xmerl appears to be a built-in Erlang application and is probably accepted by more libraries.
But yes, I miss Nokogiri-style DOM manipulation and easy output (to_xml).
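For anyone weighing the xmerl route, here is a minimal sketch of parsing, querying, and building XML with nothing but xmerl, which ships with Erlang/OTP. The sample document and variable names are made up for illustration, and this is not a Nokogiri equivalent, just the basic round trip. In a Mix project you would likely also need :xmerl listed under extra_applications.

```elixir
# Hypothetical sample document; xmerl works on charlists, not binaries.
xml = ~c(<order id="1"><item>apple</item><item>pear</item></order>)

# :xmerl_scan.string/1 returns {root_element_record, unparsed_rest}.
{doc, _rest} = :xmerl_scan.string(xml)

# XPath query. Text nodes come back as xmlText records, which are
# plain tuples of the form {:xmlText, parents, pos, language, value, type}.
items =
  ~c"//item/text()"
  |> :xmerl_xpath.string(doc)
  |> Enum.map(fn {:xmlText, _parents, _pos, _lang, value, _type} ->
    List.to_string(value)
  end)

# Building XML from xmerl's "simple form": {tag, attributes, children}.
simple = {:order, [id: ~c"2"], [{:item, [], [~c"plum"]}]}

# export_simple/2 returns an iolist; :xmerl_xml is the built-in XML callback module.
built =
  [simple]
  |> :xmerl.export_simple(:xmerl_xml)
  |> IO.iodata_to_binary()

IO.inspect(items)
IO.puts(built)
```

The records are matched as raw tuples here to keep the sketch short; in real code, pulling the definitions out of xmerl.hrl with Record.extract/2 would be less brittle. Validation is not covered here.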
If you don’t need validation and the XML can all fit in memory, then the Meeseeks library has a FANTASTIC query interface for XML that puts everything else I’ve used so far to shame.
Yes, Meeseeks is very nice! I recently used it for a small scraper and it was faster and easier to use than anything else I looked at.
Note that it’s fast because it uses a NIF to wrap the Rust library html5ever. I installed Rust (using asdf), added Meeseeks to my deps and it just worked. But it’s good to be aware of the extra dependency.
This is also the case with Scrape, and it actually suffers from some compilation issues because it has to compile Rust in a rather brittle way. The only way I could make it work was to go into the deps manually and run a cargo command.
So… it’s been some time since the last post here. I have some XML to process in both directions, with XSD support, and I have a problem with external resources. The XSD uses type definitions referenced by external URL (https:// …). When I try to process the XSD (using :xmerl_xsd.process_schema/1), I get an :enoent error related to those external resources.
Is there a way to make xmerl[_xsd] fetch them as needed?
Thank you, although I am wondering whether that’s feasible. I mean w/o reimplementing the thing. Is there a place where I could simply provide the fun?