I was recently working through a hobby learning project for retrieval-augmented generation, with some bells and whistles (Elixir is not my first language).
It’s probably reinventing the wheel, but hey, wheels can be interesting:
I’ve had an idea for a while now that you could get passable vector search with Nx using xor and popcnt on 1-bit quantized vectors. For those unaware, xor followed by popcnt computes Hamming distance, which on sign-binarized vectors is essentially a 1-bit approximation of cosine distance.
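To make the xor/popcnt-to-cosine connection concrete, here’s a quick sketch, in Python/NumPy rather than Nx purely because it’s easy to run; the vectors and the 0.3 noise factor are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256
a = rng.normal(size=d)
b = a + 0.3 * rng.normal(size=d)  # a second vector correlated with the first

# Sign-binarize (1 bit per component) and pack into bytes so that
# xor + popcnt can operate on whole machine words.
a_packed = np.packbits((a > 0).astype(np.uint8))
b_packed = np.packbits((b > 0).astype(np.uint8))

# popcount(xor) = Hamming distance between the two bit vectors.
# unpackbits + sum stands in for a hardware popcnt here.
hamming = int(np.unpackbits(np.bitwise_xor(a_packed, b_packed)).sum())

# The fraction of disagreeing bits roughly tracks angle / pi, so
# cos(pi * hamming / d) is a crude 1-bit estimate of cosine similarity.
approx_cos = float(np.cos(np.pi * hamming / d))
true_cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

For correlated vectors like these, the 1-bit estimate lands close to the true cosine; the Nx version would use the analogous tensor ops (bitwise xor plus a population count) on packed integer tensors.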
Instead of building an index, the idea is that you just brute-force search very quickly with native vectorized instructions, which I figure Nx should be able to do in batches without eating any significant Elixir speed penalty. Unlike vector indexes (e.g. HNSW), this approach has the advantage of being extremely simple, with no need for external dependencies (just Nx), which I find appealing.
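The brute-force pass over a packed corpus might look something like this sketch (again NumPy standing in for Nx; the corpus shapes and the `search` helper are my own invention):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 10_000, 256

# Float corpus, sign-binarized and bit-packed into an n x (d / 8) byte matrix.
corpus = rng.normal(size=(n, d))
packed = np.packbits((corpus > 0).astype(np.uint8), axis=1)

def search(query, k=10):
    # xor the packed query against every row at once, popcount each row,
    # and take the k smallest Hamming distances. No index, just one batch op.
    q = np.packbits((query > 0).astype(np.uint8))
    dists = np.unpackbits(np.bitwise_xor(packed, q), axis=1).sum(axis=1)
    top = np.argsort(dists)[:k]
    return top, dists[top]

idx, dists = search(corpus[42])
# Row 42 is its own nearest neighbor at Hamming distance 0.
```

The whole "index" is one packed tensor, which is what makes incremental updates trivial: appending a row is the entire maintenance story.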
I don’t know all that much about vector search and I haven’t kept up with the state of things for a few months (and I know things are moving fast). Do you have any thoughts on this?
I like the idea. I haven’t personally tried it out; I’ve tried some of the other configurable distance types via Redis vector search, but didn’t see much of a difference in top-k results for the data types I’ve been working with.
Were you thinking just a FLAT index, with xor/popcnt?
I see Redis has a “FLAT” index, and I think it’s what I’m talking about, but it’s hard to tell because their docs don’t explain what it does at all (unfortunate). What’s weird about that name is that a) the brute-force approach is, by definition, kinda not an index, and b) there is a type of vector index called IVFFLAT, but I guess this is just a case of unfortunate naming?
But yeah, my understanding is that up to a certain dataset size, xor/popcnt on 1-bit quantized vectors as a first pass is actually surprisingly performant. Then you can rerank the top candidates, either with full-precision cosine similarity or, apparently, with an LLM, which is supposed to work pretty well. The benefit of this approach is simplicity; HNSW and co are extremely complicated and expensive to update incrementally. Keeping everything in RAM helps, but then you have to keep everything in RAM.
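The two-stage shape of that (cheap Hamming shortlist over everything, then exact cosine only on the shortlist) could be sketched like so; the function name, the candidate count of 100, and the noisy-query setup are all made up:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 10_000, 256

corpus = rng.normal(size=(n, d))
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)  # unit norm for cosine
packed = np.packbits((corpus > 0).astype(np.uint8), axis=1)

def two_stage_search(query, n_candidates=100, k=10):
    # Stage 1: cheap 1-bit Hamming pass over the whole corpus.
    q_packed = np.packbits((query > 0).astype(np.uint8))
    hamming = np.unpackbits(np.bitwise_xor(packed, q_packed), axis=1).sum(axis=1)
    candidates = np.argsort(hamming)[:n_candidates]
    # Stage 2: exact cosine similarity, but only on the shortlist.
    q_unit = query / np.linalg.norm(query)
    sims = corpus[candidates] @ q_unit
    order = np.argsort(-sims)[:k]
    return candidates[order], sims[order]

query = corpus[7] + 0.02 * rng.normal(size=d)  # noisy copy of row 7
idx, sims = two_stage_search(query)
```

The rerank stage only touches `n_candidates` full-precision rows instead of all `n`, which is where the savings come from; an LLM reranker would slot into stage 2 the same way.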
Keep in mind my knowledge is limited and probably like 6mo out of date here so I have no idea what the SOTA is with on-disk vector indexes right now. I think there has been progress.
But as far as the “shortest path to a passable pure(ish)-Elixir vector search” goes, this seemed to be the way to go. Of course, I have yet to actually pursue this.