Elasticlunr is a small, full-text search library for use in the Elixir environment. It indexes JSON documents and provides a friendly search interface to retrieve documents.
The library is built for web applications that do not need the deployment complexity of popular search engines, and it takes advantage of the BEAM's capabilities.
Imagine how much is gained when the search functionality of your application resides in the same environment (the BEAM VM) as your business logic: searches resolve faster, and there are fewer services (Elasticsearch, Solr, and so on) to monitor.
Thanks, a very interesting project. Is it possible to specify different tokenizers (n-gram etc…) for different fields?
Or is that not intended, because then you should rather use Elasticsearch, Solr or PostgreSQL full text search?
Yes, you can specify different tokenizers for different fields and you can also specify a different tokenizer at query time.
Note that the project wasn’t built to compete with the likes of Elasticsearch or Solr but to provide alternatives to individuals or companies that do not have the capacity or resources to manage these search engines – they all have their learning curves.
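To make the per-field tokenizer idea from the reply above concrete, here is a minimal, language-agnostic sketch in Python. It is not Elasticlunr's API: `TinyIndex`, `word_tokenizer`, and `ngram_tokenizer` are hypothetical names invented for illustration. The sketch only shows the general technique of keeping one tokenizer per field at index time and optionally overriding it at query time.

```python
import re
from collections import defaultdict


def word_tokenizer(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def ngram_tokenizer(n):
    """Build a tokenizer that emits character n-grams of each word."""
    def tokenize(text):
        grams = []
        for word in word_tokenizer(text):
            if len(word) <= n:
                grams.append(word)  # short words are kept whole
            else:
                grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
        return grams
    return tokenize


class TinyIndex:
    """Hypothetical toy index: one inverted index and one tokenizer per field."""

    def __init__(self, field_tokenizers):
        self.field_tokenizers = field_tokenizers
        # field -> token -> set of document ids
        self.postings = {f: defaultdict(set) for f in field_tokenizers}

    def add(self, doc_id, doc):
        for field, tokenize in self.field_tokenizers.items():
            for token in tokenize(doc.get(field, "")):
                self.postings[field][token].add(doc_id)

    def search(self, field, query, tokenizer=None):
        """Return ids of documents whose `field` contains every query token.

        `tokenizer` overrides the field's index-time tokenizer for this query only.
        """
        tokenize = tokenizer or self.field_tokenizers[field]
        tokens = tokenize(query)
        if not tokens:
            return set()
        matches = [self.postings[field].get(t, set()) for t in tokens]
        return set.intersection(*matches)


# Trigrams on the title allow partial matches; the body uses whole words.
index = TinyIndex({"title": ngram_tokenizer(3), "body": word_tokenizer})
index.add(1, {"title": "Elasticlunr", "body": "full-text search for Elixir"})
index.add(2, {"title": "Elasticsearch", "body": "distributed search engine"})

partial = index.search("title", "lunr")                          # trigram match
exact = index.search("title", "lunr", tokenizer=word_tokenizer)  # query-time override
```

In this toy version, `partial` finds document 1 because the trigrams of "lunr" occur in "Elasticlunr", while `exact` is empty because the title field never indexed whole words. A query-time tokenizer is only useful when it produces tokens compatible with what was indexed.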
I just published an S3 storage provider for Elasticlunr. You can now store your indexes in an S3 bucket, in addition to the disk storage provider included in the base project.
The storage API is flexible, so writing to any storage provider (Google Cloud Storage, a database, and so on) shouldn’t be a problem. It’s just a matter of grabbing the right provider or implementing one yourself.
There is no benchmark at the moment because I wanted to have the base functionality in place before trying to optimize, and I would say that the project is at that stage now. So, in the coming weeks, I will be focusing on performance, and hopefully I will be able to share positive results with the community.
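As a rough illustration of what "implementing one yourself" could involve, the sketch below shows a pluggable storage contract in Python. It does not reproduce Elasticlunr's actual storage API: `StorageProvider`, `DiskProvider`, `MemoryProvider`, and `save_and_reload` are hypothetical names, and the index is assumed to be serializable as JSON.

```python
import json
import tempfile
from pathlib import Path
from typing import Protocol


class StorageProvider(Protocol):
    """Hypothetical contract: persist and load a serialized index by name."""

    def write(self, name: str, index: dict) -> None: ...

    def read(self, name: str) -> dict: ...


class DiskProvider:
    """Stores each index as a JSON file inside one directory."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def write(self, name, index):
        (self.directory / f"{name}.json").write_text(json.dumps(index))

    def read(self, name):
        return json.loads((self.directory / f"{name}.json").read_text())


class MemoryProvider:
    """Stand-in for a remote backend (for example an object store)."""

    def __init__(self):
        self.blobs = {}

    def write(self, name, index):
        self.blobs[name] = json.dumps(index)

    def read(self, name):
        return json.loads(self.blobs[name])


def save_and_reload(provider: StorageProvider, name, index):
    """Calling code depends only on the contract, not on a concrete backend."""
    provider.write(name, index)
    return provider.read(name)


toy_index = {"title": {"lunr": [1]}}
from_memory = save_and_reload(MemoryProvider(), "books", toy_index)
with tempfile.TemporaryDirectory() as tmp:
    from_disk = save_and_reload(DiskProvider(tmp), "books", toy_index)
```

Because `save_and_reload` only uses `write` and `read`, swapping the backend means passing a different provider object. That is the general shape a flexible storage layer tends to take.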
Also, I’m looking for volunteers to join the project.
I’m curious: is there a fast way to fetch all possible values for an indexed field? E.g., in the example from Livebook, can we fetch a list of all indexed authors? And how efficient is such an operation?
It’s been a while since the initial release of the project. A lot has changed, and improvements (storage and performance) have been made. If you’re looking to stay up to date, see the discussion below: