FastRSS
Parse RSS feeds very quickly:
- This is a Rust NIF built using Rustler
- Uses the rss Rust crate to do the actual RSS parsing
Speed
Currently this is much faster than most of the pure Elixir/Erlang packages that I tested.
In benchmarks, FastRSS is between 6.12x and 50.09x faster than the next fastest package tested (feeder_ex).
Compared to the slowest Elixir options tested (feed_raptor, elixir_feed_parser), FastRSS was in some benchmarks 259.91x faster and used 5,412,308.17x less memory (0.00156 MB vs 8423.70 MB).
See full benchmarks
Usage
There is only one function: it takes an RSS string and returns {:ok, map()},
where the map has string keys.
iex(1)> {:ok, map_of_rss} = FastRSS.parse("...rss_feed_string...")
iex(2)> Map.keys(map_of_rss)
["categories", "cloud", "copyright", "description", "docs", "dublin_core_ext",
"extensions", "generator", "image", "items", "itunes_ext", "language",
"last_build_date", "link", "managing_editor", "namespaces", "pub_date",
"rating", "skip_days", "skip_hours", "syndication_ext", "text_input", "title",
"ttl", "webmaster"]
The docs can be found at https://hexdocs.pm/fast_rss.
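Because the result is a plain map with string keys, it can be consumed with ordinary Map and Enum functions. The following is a minimal sketch, not part of FastRSS: the FeedSummary module is hypothetical, it assumes each entry under "items" is itself a map with a "title" key (mirroring the item fields of the rss crate), and it assumes a failed parse comes back as an {:error, reason} tuple, which the text above does not state.

```elixir
defmodule FeedSummary do
  # Hypothetical helper for illustration only; FastRSS does not ship this module.

  # Takes the tuple returned by the parser and pulls out the channel title
  # and the title of every item.
  def summarize({:ok, feed}) when is_map(feed) do
    item_titles =
      feed
      # "items" is one of the top-level keys listed above; default to [] if absent.
      |> Map.get("items", [])
      # Assumption: each item is a map with a "title" key (nil when missing).
      |> Enum.map(&Map.get(&1, "title"))

    {:ok, %{title: Map.get(feed, "title"), item_titles: item_titles}}
  end

  # Assumption: parse failures are reported as {:error, reason}; pass them through.
  def summarize({:error, reason}), do: {:error, reason}
end

# Possible usage, assuming fast_rss is a dependency and feed.xml exists:
#   "feed.xml" |> File.read!() |> FastRSS.parse() |> FeedSummary.summarize()
```

Keeping the helper separate from the parse call means it can be exercised with hand-written maps, without loading the NIF.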
Supported Feeds
Reading from the following RSS versions and extensions is supported:
- RSS 0.90
- RSS 0.91
- RSS 0.92
- RSS 1.0
- RSS 2.0
- iTunes
- Dublin Core
Links
GitHub: https://github.com/avencera/fast_rss
Hex: https://hex.pm/packages/fast_rss
HexDocs: https://hexdocs.pm/fast_rss/FastRSS.html
Benchmarks: https://github.com/avencera/fast_rss#benchmark
Why?
I needed to parse some podcast RSS feeds from iTunes. At first I tried elixir_feed_parser, but I noticed it was a bit slow on some of the larger feeds. Recently, I have also been enjoying working with Rust. I remembered that Rustler was a thing, and I always thought it was interesting, but I never had a chance to use it.
I thought trying to make a Rust NIF to parse RSS feeds would be a fun learning exercise. It turned out not to be too much effort (thanks @hansihe and @scrogson). The hardest part was dealing with some annoying problems deploying on Alpine.
I wasn’t planning on releasing this as a Hex package until I did some benchmarks. The first version was pretty dumb: I would pass the parsed XML data from Rust as stringified JSON and decode it on the Elixir side using Jason, so I wasn’t expecting much in terms of performance. But I was surprised to see it being 16x to 42x faster. That’s when I decided to release it as a Hex package.
Since then I’ve made it a bit smarter (I encode the Rust struct directly into an Elixir map), and I added some other packages to the benchmarks. I’m sure it can still be made much smarter.
Of all the other packages I tested, FeederEx was the fastest pure Elixir/Erlang package. But FastRSS is still 6.12x to 50.09x faster.