Saxy: A fast, easy-to-use and XML 1.0 compliant SAX parser in Elixir

Yo Elixir fellows!

I have just released Saxy. As the name implies, it is a new SAX parser for XML.

What is so great about Saxy?

TL;DR: It is fast, easy to use, stream-friendly, and compliant with XML 1.0.

Saxy is quite fast. Benchmark results show that it can parse up to 15.37 MB of binary per second. Compared with similar libraries, Saxy is usually 1.4 times faster than Erlsom and 4.46 times faster than xmerl. The gap is most noticeable on large, deeply nested XML, where Saxy is 4.35 times faster than Erlsom.

Saxy is easy to use. Unlike xmerl and Erlsom, which emit character lists, Saxy emits binaries, so parsed data is already an ordinary Elixir string. Like Erlsom, Saxy provides a function to export XML documents into the “simple form” format.
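To give a feel for the simple form, here is a minimal sketch. It assumes a Saxy 1.x release pulled in with `Mix.install/1` and the `Saxy.SimpleForm.parse_string/1` function from that release line; the version requirement and the sample XML are my own choices for illustration, and the API may differ in other versions.

```elixir
# Assumption: a Saxy 1.x release; adjust the requirement to the version you use.
Mix.install([{:saxy, "~> 1.0"}])

# Sample document made up for this example.
xml = ~s(<user id="42"><name>Jane</name></user>)

# Simple form represents each element as a {tag, attributes, children} tuple.
{:ok, {tag, attributes, children}} = Saxy.SimpleForm.parse_string(xml)

IO.inspect(tag, label: "tag")               # element name, as a binary
IO.inspect(attributes, label: "attributes") # list of {name, value} binaries
IO.inspect(children, label: "children")     # nested tuples and text binaries
```

Because names and values come back as binaries, they can go straight into `String` functions or pattern matches without a `to_string/1` conversion.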

Saxy supports streaming parsing in a native Elixir way. It accepts a File.Stream or a Stream as input, which means you are in full control of how the file or binary chunks are streamed.
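A rough sketch of streaming parsing, assuming a Saxy 1.x release with `Saxy.parse_stream/3` and the `Saxy.Handler` behaviour. The `ElementCounter` module and the hand-split chunks are hypothetical, written only for this example; with a real file you would pass the result of `File.stream!` instead.

```elixir
# Assumption: a Saxy 1.x release; adjust the requirement to the version you use.
Mix.install([{:saxy, "~> 1.0"}])

defmodule ElementCounter do
  # Hypothetical handler for illustration: counts start tags by element name.
  @behaviour Saxy.Handler

  @impl true
  def handle_event(:start_element, {name, _attributes}, counts) do
    {:ok, Map.update(counts, name, 1, &(&1 + 1))}
  end

  # Every other event passes the state through unchanged.
  def handle_event(_event, _data, counts), do: {:ok, counts}
end

# Chunks are deliberately split in the middle of a tag to mimic
# arbitrary chunk boundaries when reading from a file or socket.
chunks = Stream.map(["<items><item>a</it", "em><item>b</item></items>"], & &1)

{:ok, counts} = Saxy.parse_stream(chunks, ElementCounter, %{})
IO.inspect(counts, label: "element counts")
```

The handler only ever sees complete events, so the chunk size is purely a memory and throughput decision on the caller's side.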

Saxy is XML 1.0 compliant. However, Document Type Definitions (DTDs) and external entities are not supported at the moment.

I also published a blog post about how the library was built.

Feedback, suggestions, and PRs are welcome :bowing_man:! Thank you!

References:

  1. Benchmark suite
  2. XML 1.0
  3. Erlsom