Efuse_filter, a Binary Fuse Filter NIF, They're faster and smaller than bloom, cuckoo, and xor filters

Hi all,

This is a zero dependency library that implements a NIF wrapper for a Binary Fuse Filter.

This library is available on GitHub and hex.pm.

Some may remember my previous NIF library, the exor_filter. This library is the successor. A Binary Fuse Filter is a probabilistic datastructure like bloom or xor filters, but they are smaller in size and faster to access.

Here is a small benchmark comparing access time against the xor_filter and a few bloom filters implemented in Erlang, Elixir and Rust. These results were measured using the exor_benchmark repo which has the results of the full benchmark.

Features:

  • Serialization and Deserialization to binary, so it can be sent over the wire.
  • A dirty NIF API for large inputs.
  • Incremental initialization, items can be added over time. Once initialized, no more items can be added, Binary Fuse Filters are immutable.
  • Custom hashing. By default this uses erlang:phash2/1. Passing pre-hashed data is supported.

Here is a small example of how to use it:

filter = :fuse8.new(["cat", "dog", "mouse"]),
:true = :fuse8.contain(filter, "cat"),
:false = :fuse8.contain(filter, "goose").

Thanks for reading!

3 Likes