May I present exor_filter! You can find the library on my GitHub. It is a NIF wrapper for the xor_filter. An xor filter is like a Bloom filter, but it is faster, has a smaller memory footprint, and has a lower false-positive rate!
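For anyone curious how an xor filter answers membership queries, here is a minimal Python sketch of the general technique. It is not the code inside exor_filter or the original repository: the names `build`, `contains`, and the underscore helpers are made up for illustration, and the BLAKE2b hashing and table sizing are assumptions chosen for simplicity rather than speed.

```python
import hashlib

MASK32 = 0xFFFFFFFF

def _hash(key: str, seed: int) -> int:
    # 128-bit salted hash: enough independent bits for 3 slots + 1 fingerprint.
    data = seed.to_bytes(8, "little") + key.encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=16).digest(), "little")

def _slots(h: int, block: int):
    # One slot in each third of the table, so the three slots never collide.
    return (
        (h & MASK32) % block,
        block + ((h >> 32) & MASK32) % block,
        2 * block + ((h >> 64) & MASK32) % block,
    )

def _fingerprint(h: int) -> int:
    # 8-bit fingerprint, hence "xor8".
    return (h >> 96) & 0xFF

def build(keys):
    keys = list(set(keys))                       # construction needs distinct keys
    block = (32 + int(1.23 * len(keys)) + 2) // 3
    size = 3 * block
    for seed in range(100):                      # retry with a new seed if peeling fails
        hashes = [_hash(k, seed) for k in keys]
        count = [0] * size                       # how many keys touch each slot
        xored = [0] * size                       # xor of the hashes touching each slot
        for h in hashes:
            for s in _slots(h, block):
                count[s] += 1
                xored[s] ^= h
        queue = [s for s in range(size) if count[s] == 1]
        stack = []                               # (hash, slot) in peeling order
        while queue:
            s = queue.pop()
            if count[s] != 1:
                continue
            h = xored[s]                         # the only key left in this slot
            stack.append((h, s))
            for t in _slots(h, block):
                count[t] -= 1
                xored[t] ^= h
                if count[t] == 1:
                    queue.append(t)
        if len(stack) == len(hashes):            # every key was peeled
            break
    else:
        raise RuntimeError("could not build filter")
    table = [0] * size
    for h, s in reversed(stack):                 # assign in reverse peeling order
        a, b, c = _slots(h, block)
        table[s] = _fingerprint(h) ^ table[a] ^ table[b] ^ table[c]
    return seed, block, table

def contains(flt, key: str) -> bool:
    seed, block, table = flt
    h = _hash(key, seed)
    a, b, c = _slots(h, block)
    return _fingerprint(h) == (table[a] ^ table[b] ^ table[c])

bad_words = ["darn", "d4rn", "heck", "h3ck"]
flt = build(bad_words)
print(all(contains(flt, w) for w in bad_words))  # True: no false negatives
```

A word that was never added is usually rejected, but with 8-bit fingerprints roughly 1 in 256 such lookups will come back as a false positive. Note that the whole key set has to be known when the filter is built, which is why initialization is the expensive step.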
There are plenty of use cases, but I was planning on using it as a bad-word filter for the commenting system on my personal blog. I think it will cover l33t speak better than a Bloom filter, because it is less memory constrained.
One issue that I believe exists is loading a large volume of values on initialization. If this is done other than on startup, it could delay the VM because it is not a dirty NIF. However, it wouldn't make sense to make it dirty, given the low access time of the filter. It's a bit of a catch-22. Using the xor8_buffered_initialize function could help, as the original repo says the buffered initialization is faster.
More information can be found in the linked original repository.