Stop hitting /api/search.
Latency kills UI. If I want to search my local notes or chat history, I shouldn't need a server roundtrip.
Client-side vector DBs let you index thousands of documents in the user's RAM and search them semantically in under 5ms.
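At its core, "semantic search in RAM" is just nearest-neighbor lookup over embedding vectors. A minimal sketch (the tiny hand-made embeddings and `searchLocal` helper here are illustrative stand-ins; real embeddings come from a model):

```typescript
// Brute-force in-memory semantic search via cosine similarity.
type Doc = { id: string; text: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank all docs by similarity to the query vector, return the top k.
function searchLocal(docs: Doc[], query: number[], k = 3): Doc[] {
  return [...docs]
    .sort((x, y) => cosine(y.embedding, query) - cosine(x.embedding, query))
    .slice(0, k);
}

const docs: Doc[] = [
  { id: "1", text: "grocery list", embedding: [0.9, 0.1, 0.0] },
  { id: "2", text: "meeting notes", embedding: [0.1, 0.9, 0.2] },
  { id: "3", text: "standup summary", embedding: [0.2, 0.8, 0.3] },
];

console.log(searchLocal(docs, [0.1, 0.9, 0.25], 2).map((d) => d.id)); // ["2", "3"]
```

Even this naive linear scan stays in the low-millisecond range for thousands of vectors, because there is no network hop at all.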
02. Meet Orama
Orama (formerly Lyra) is a pure-JavaScript, in-memory search engine. It supports typo-tolerant fuzzy search, faceting, and vector embeddings out of the box.
03. Deep Dive: Memory Constraints & Quantization
Browser tabs crash if they use too much RAM (the per-tab budget is roughly 2-4GB, depending on browser and platform).
To store 100k vectors locally, use quantization: converting 32-bit float vectors (float32) to 8-bit integers (int8) cuts memory usage by 75% with barely any loss in search accuracy.
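The arithmetic: at 768 dimensions (a common embedding size), 100k float32 vectors take roughly 300MB; as int8 that drops to roughly 75MB. A minimal sketch of symmetric per-vector quantization (real systems often calibrate scales more carefully, e.g. per-dimension):

```typescript
// Quantize a float32 vector to int8 using a per-vector max-abs scale.
function quantize(v: Float32Array): { q: Int8Array; scale: number } {
  const maxAbs = Math.max(...Array.from(v, Math.abs)) || 1;
  const scale = maxAbs / 127; // map [-maxAbs, maxAbs] onto [-127, 127]
  const q = new Int8Array(v.length);
  for (let i = 0; i < v.length; i++) q[i] = Math.round(v[i] / scale);
  return { q, scale };
}

// Recover approximate floats for scoring (or score directly in int8).
function dequantize(q: Int8Array, scale: number): Float32Array {
  return Float32Array.from(q, (x) => x * scale);
}

const v = new Float32Array([0.12, -0.5, 0.33, 0.9]);
const { q, scale } = quantize(v);

// 4 bytes/dim drops to 1 byte/dim: the 75% reduction.
console.log(v.byteLength, q.byteLength); // 16 4
```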
04. The Senior Engineer's Take
Sync is Hard
The challenge isn't searching; it's syncing. If the user edits a note on their phone, how do you update the local index on their laptop?
Pattern: use CRDTs (Yjs/Automerge) for data propagation, and re-index the vector DB on change events.
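The general shape of that pattern, with a toy store standing in for the CRDT layer (the `NoteStore` and `embed` stub below are placeholders, not Yjs/Automerge APIs; in Yjs you would hook the equivalent of its observe callbacks):

```typescript
type Note = { id: string; text: string };
type Listener = (changed: Note[]) => void;

// Placeholder for a CRDT-backed document: emits change events on writes,
// whether the write came from this device or synced in from another one.
class NoteStore {
  private notes = new Map<string, Note>();
  private listeners: Listener[] = [];
  onChange(fn: Listener) { this.listeners.push(fn); }
  put(note: Note) {
    this.notes.set(note.id, note);
    this.listeners.forEach((fn) => fn([note])); // a CRDT would batch remote ops
  }
}

// Stub embedding; real code would call a local model.
const embed = (text: string): number[] => [text.length, 0, 0];

const index = new Map<string, number[]>();
const store = new NoteStore();

// Re-index incrementally on change events, instead of rebuilding everything:
store.onChange((changed) => {
  for (const note of changed) index.set(note.id, embed(note.text));
});

store.put({ id: "n1", text: "edited on phone" });
console.log(index.has("n1")); // true
```

The key design point is that the CRDT owns truth and conflict resolution; the vector index is a disposable derived view that any device can rebuild from the synced document.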
Zero-Latency UX
With local vector search, you can drop the debounce (the ~300ms wait for the user to stop typing) and search on every keystroke. It feels instant compared to server-side search.
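Concretely, the input handler just calls the local index synchronously, with no timer in the path. A sketch (the `titles` list and `localSearch` stand in for a real local index; the commented DOM wiring shows where it would attach):

```typescript
const titles = ["vector search", "vector quantization", "sync with CRDTs"];

// Stand-in for the real local index query.
const localSearch = (q: string): string[] =>
  q ? titles.filter((t) => t.includes(q.toLowerCase())) : [];

// In the browser this wires to the "input" event, e.g.:
//   inputEl.addEventListener("input", (e) => render(onKeystroke(e.target.value)));
function onKeystroke(query: string): string[] {
  return localSearch(query); // fires immediately; no setTimeout/debounce
}

console.log(onKeystroke("vector")); // ["vector search", "vector quantization"]
```

Compare the server-side version, where every keystroke would either spawn a network request or sit behind a debounce timer; locally, neither trade-off exists.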