We have just released NimbleCSV, a small and fast CSV parsing library for Elixir. It allows developers to define their own parsers so we can rely on binary patterns for efficiency. It also supports dumping and data streaming. We hope it will be an excellent companion alongside the efforts we have put into GenStage (and GenStage.Flow).
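For readers who want to try it, here is a minimal sketch of defining a parser and using it for parsing, streaming and dumping. It assumes Elixir 1.12+ (for `Mix.install/1`) and the `nimble_csv` package from Hex; the module name `MyParser` and the sample data are made up for illustration.

```elixir
# Assumption: run as a script with Elixir 1.12+ so Mix.install/1 is available.
Mix.install([{:nimble_csv, "~> 1.0"}])

# Define a tab-separated parser; MyParser is an illustrative name.
NimbleCSV.define(MyParser, separator: "\t", escape: "\"")

# Parsing a literal string. The header row is skipped by default.
rows = MyParser.parse_string("name\tage\njohn\t27\n")
IO.inspect(rows)
# => [["john", "27"]]

# Parsing a stream of lines (a plain list stands in for File.stream!/1 here).
streamed = ["name\tage\n", "john\t27\n"] |> MyParser.parse_stream() |> Enum.to_list()
IO.inspect(streamed)

# Dumping rows back to iodata, then flattening to a binary for display.
dumped = [["name", "age"], ["john", "27"]] |> MyParser.dump_to_iodata() |> IO.iodata_to_binary()
IO.inspect(dumped)
# => "name\tage\njohn\t27\n"
```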
Initial results indicate that NimbleCSV is 10x faster than CSV when parsing from a file stream, and 10x faster than ExCSV when parsing a literal string (ExCSV doesn’t do streams, CSV doesn’t do raw strings).
Will post full results in a bit, along with CSV writing benchmarks.
I tried looking at what the performance difference would be if the separators weren't bound at compile time. The difference looks to be roughly 1.6x, which is quite surprising; I thought it would be much worse.
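To make the comparison concrete, here is a rough sketch of the two approaches. It is not NimbleCSV's code: `SeparatorSketch` and its functions are hypothetical, it only handles unquoted fields, and it assumes a single-byte separator.

```elixir
defmodule SeparatorSketch do
  # Hypothetical module for illustration only; unquoted fields, one-byte separator.

  # Bound at compile time: the separator becomes a literal inside the pattern.
  @separator ?,

  def split_static(line), do: static(line, "", [])

  defp static(<<@separator, rest::binary>>, field, acc), do: static(rest, "", [field | acc])
  defp static(<<char, rest::binary>>, field, acc), do: static(rest, <<field::binary, char>>, acc)
  defp static(<<>>, field, acc), do: Enum.reverse([field | acc])

  # Bound at run time: the separator is an argument checked in a guard.
  def split_dynamic(line, <<sep>>), do: dynamic(line, sep, "", [])

  defp dynamic(<<char, rest::binary>>, sep, field, acc) when char == sep,
    do: dynamic(rest, sep, "", [field | acc])

  defp dynamic(<<char, rest::binary>>, sep, field, acc),
    do: dynamic(rest, sep, <<field::binary, char>>, acc)

  defp dynamic(<<>>, _sep, field, acc), do: Enum.reverse([field | acc])
end

IO.inspect(SeparatorSketch.split_static("a,b,c"))
# => ["a", "b", "c"]
IO.inspect(SeparatorSketch.split_dynamic("a;b;c", ";"))
# => ["a", "b", "c"]
```

Wrapping each call in `:timer.tc/1` over a large input would give a rough comparison on your own machine; the 1.6x figure above comes from my measurements, not from this sketch.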
I am curious whether there are plans for a way to use it inside an existing module, turning that module into a parser. It would be useful to have helper functions and the parser all in one module, without needing to delegate functions.
Yeah, the BEAM VM has a lot of interesting things like that. I ended up making a math module about 6 years ago that got pretty fast after a lot of testing. A lot of weird things ended up being fast, and compile-time generation of structures was absolutely necessary for speed.
But that’s what the library does now anyway, it’s just hidden behind another macro. Using `use` would improve composability and you wouldn’t have to pass in the moduledocs as an option.
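As a sketch of what a `use`-based approach could look like: the `TinyCSV` module below is a hypothetical stand-in, not NimbleCSV's actual API, and it splits on a separator without handling quoting. The point is only that the parser functions and your own helpers end up in the same module, with an ordinary `@moduledoc`.

```elixir
defmodule TinyCSV do
  # Hypothetical stand-in for a parser library; NOT NimbleCSV's actual API.
  defmacro __using__(opts) do
    separator = Keyword.get(opts, :separator, ",")

    quote do
      # The separator is known at compile time and injected into the function body.
      def parse_line(line) do
        String.split(line, unquote(separator))
      end
    end
  end
end

defmodule People do
  @moduledoc "Parser and helpers live in the same module."
  use TinyCSV, separator: ";"

  # A helper defined right next to the injected parse_line/1, no delegation needed.
  def names(lines) do
    Enum.map(lines, fn line -> line |> parse_line() |> hd() end)
  end
end

IO.inspect(People.parse_line("john;27"))
# => ["john", "27"]
IO.inspect(People.names(["john;27", "mary;31"]))
# => ["john", "mary"]
```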