MosaicDB - a hobby project that uses Nx (distributed SQL vectordb rag)

Hah, I know you’re joking but architecturally Hobbes is FDB. The performance characteristics will be exactly the same. You could run it in-memory on a single node but then it would just be a slower :ets with concurrency control and I’m not sure that’s very useful on its own.

I respect that you actually tried with HNSW. I was very confused until I realized all of the graph DB companies are literally just holding the entire thing in RAM. That might scale to terabytes but you’re certainly not getting to petabytes, especially now that DRAM prices have randomly doubled lol. So it’s a no-go for web scale search (i.e. a literal web search engine) until they figure out how to get it onto disk.

As a digression, I have also been thinking about whether I can dodge vector search altogether. My interest was always in small-scale search for consumer apps (e.g. a note-taking app or a chat app, not web scale). I have been critical of AI stuff in some contexts, but for search this stuff is like actual magic. What used to take a research team and millions of dollars is now just straight up trivial, and that is 100% a good thing (and will probably end up slaying some monopolies).

But for search, I’ve been thinking: what if, instead of vector search, you just use a traditional keyword search engine (trivial to build), ask a stupid LLM to rewrite the query 20 times, run every rewrite through the keyword engine, and then rank the combined results?

I have no evidence but I am like 90% sure this would totally work and I know exactly how to build a keyword search engine (it’s literally spelled out in the record layer paper) so I find this approach even more enticing than the vector stuff.
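
To make the idea concrete, here is a rough sketch in Python (picked only for brevity, it has nothing to do with Hobbes or the record layer). Everything in it is an assumption for illustration: `rewrite_query` is a hypothetical stand-in for the LLM call and just uses a hard-coded synonym table, the index is a toy in-memory inverted index, and the final ranking uses reciprocal rank fusion, which is my choice for the "rank the results" step rather than anything I have tested at scale.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Toy inverted index: term -> set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def keyword_search(index, query):
    """Rank docs by how many distinct query terms they contain."""
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Ties are broken by doc id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def rewrite_query(query):
    """Hypothetical stand-in for the LLM: swap in hard-coded synonyms.

    A real version would prompt a model for N paraphrases instead.
    """
    synonyms = {"car": ["automobile", "vehicle"], "fix": ["repair"]}
    variants = [query]
    terms = tokenize(query)
    for word, alternatives in synonyms.items():
        if word in terms:
            for alt in alternatives:
                variants.append(
                    re.sub(rf"\b{word}\b", alt, query, flags=re.IGNORECASE)
                )
    return variants


def search(index, query, k=60):
    """Run every rewrite, then merge the rankings.

    Merging uses reciprocal rank fusion (an assumption, not part of
    the original idea): each doc gets 1 / (k + rank) per result list.
    """
    fused = defaultdict(float)
    for variant in rewrite_query(query):
        ranked = keyword_search(index, variant)
        for rank, doc_id in enumerate(ranked, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: (-fused[d], d))
```

With a document like "How to repair an automobile engine" in the index, a plain `keyword_search` for "fix car" misses it entirely, while `search` picks it up through the rewrites "fix automobile" and "repair car". That is the whole bet: the rewrites recover the recall that embeddings would otherwise give you.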

Vectors are still very interesting, though.
