Elasticlunr - a small, full-text search library for use in the Elixir environment

heywhy · January 9, 2022, 6:21am

Elasticlunr is a small, full-text search library for use in the Elixir environment. It indexes JSON documents and provides a friendly search interface to retrieve documents.

The library is built for web applications that want to avoid the deployment complexity of popular search engines while taking advantage of the BEAM's capabilities.

Imagine how much is gained when the search functionality of your application resides in the same environment (the BEAM VM) as your business logic: searches resolve faster, and there are fewer services (Elasticsearch, Solr, and so on) to monitor.

Read more: Introduction to Elasticlunr
GitHub: https://github.com/heywhy/ex_elasticlunr

Werner · January 10, 2022, 9:33am

Thanks, a very interesting project. Is it possible to specify different tokenizers (n-gram etc…) for different fields?
Or is that not intended, because then you should rather use Elasticsearch, Solr or PostgreSQL full text search?

heywhy · January 10, 2022, 12:09pm

Yes, you can specify different tokenizers for different fields and you can also specify a different tokenizer at query time.
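
As a library-independent illustration of what choosing a different tokenizer per field means, here is a minimal sketch in plain Elixir. It is not Elasticlunr's API: the `NGramSketch` module, its functions, and the `tokenizers` map are hypothetical names used only to show an n-gram tokenizer on one field and a whitespace tokenizer on another.

```elixir
defmodule NGramSketch do
  @moduledoc "Hypothetical helper for illustration; not part of Elasticlunr."

  # Lowercase the text, split it into words, then emit every run of
  # `n` consecutive characters per word. Words shorter than `n`
  # produce no tokens.
  def ngrams(text, n) when is_binary(text) and is_integer(n) and n > 0 do
    text
    |> String.downcase()
    |> String.split(~r/\W+/u, trim: true)
    |> Enum.flat_map(&word_ngrams(&1, n))
  end

  # Plain whitespace tokenizer, for comparison.
  def words(text) when is_binary(text) do
    text
    |> String.downcase()
    |> String.split(~r/\s+/, trim: true)
  end

  defp word_ngrams(word, n) do
    word
    |> String.graphemes()
    |> Enum.chunk_every(n, 1, :discard)
    |> Enum.map(&Enum.join/1)
  end
end

# A per-field choice of tokenizer, expressed as a plain map
# (a hypothetical shape, not how Elasticlunr configures fields).
tokenizers = %{
  "title" => &NGramSketch.ngrams(&1, 3),
  "content" => &NGramSketch.words/1
}

doc = %{"title" => "Elixir", "content" => "Full-text search on the BEAM"}

# Run each field's value through the tokenizer chosen for that field.
tokens = Map.new(doc, fn {field, value} -> {field, tokenizers[field].(value)} end)

IO.inspect(tokens)
```

With trigrams on the title, a partial query such as "lix" can match "Elixir", while the content field is still matched on whole words.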

Note that the project wasn’t built to compete with the likes of Elasticsearch or Solr but to provide alternatives to individuals or companies that do not have the capacity or resources to manage these search engines – they all have their learning curves.

Werner · January 10, 2022, 1:40pm

Okay, that sounds good. It would be nice to see an example where you define a tokenizer and specify it for a field, thanks!

heywhy · January 10, 2022, 2:27pm

Sure. The docs will be improved over the coming days.

heywhy · January 11, 2022, 2:38pm

I just published an S3 storage provider for Elasticlunr. You can now store your indexes in an S3 bucket, in addition to the Disk storage provider included in the base project.

The storage API is flexible, so writing to any storage provider (Google Cloud Storage, DB, and so on) shouldn’t be a problem. It’s just a matter of grabbing the right provider or implementing one yourself.

victorbjorklund · January 11, 2022, 3:21pm

Cool. Any benchmarks so we get a feel for the performance?

heywhy · January 11, 2022, 5:45pm

There is no benchmark at the moment because I wanted to have the base functionality in place before trying to optimize, and I would say that the project is at that stage now. So, in the coming weeks, I will be focusing on performance, and hopefully I will be able to share positive results with the community.

Also, I’m looking for volunteers to join the project.

victorbjorklund · January 11, 2022, 5:47pm

I would love to help, but I’m new to Elixir and know nothing about search algos.

AndrewDryga · January 14, 2022, 2:31pm

I’m curious: is there a fast way to fetch all possible values for an indexed field? E.g., in the example from Livebook, can we fetch a list of all indexed authors? And how efficient is such an operation?

heywhy · January 15, 2022, 12:15am

Hello @AndrewDryga, you can use the snippet below:

field = Index.get_field(index, "author")

field.documents # this is a map of indexed documents with their values for that field (author).

It’s a fast operation.
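
To go from that map to a list of distinct authors, something like the following should work. This is a minimal sketch: the `documents` map is a hand-written stand-in for `field.documents`, assumed to map document ids to the field's value as described above, and the sample values are made up.

```elixir
# Stand-in for `field.documents` (assumed shape: document id => field value).
documents = %{
  1 => "alice",
  2 => "bob",
  3 => "alice"
}

# Collect the values, drop duplicates, and sort for stable output.
authors =
  documents
  |> Map.values()
  |> Enum.uniq()
  |> Enum.sort()

IO.inspect(authors)
```

In this sketch the deduplication step walks every entry of the map once, so its cost grows with the number of indexed documents.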

heywhy · March 1, 2022, 11:55pm

It’s been a while since the initial release of the project. A lot has changed, and improvements to storage and performance have been made. If you’re looking to stay up to date, see the discussion below: