Show HN: TurboQuant for vector search – 2-4 bit compression

https://news.ycombinator.com/rss Hits: 2
Summary

Rust implementation of TurboQuant for vector search, with Python bindings via PyO3. Compresses high-dimensional vectors to 2-4 bits per coordinate with near-optimal distortion. Data-oblivious (no training), zero indexing time. Unofficial implementation of TurboQuant (Google Research, ICLR 2026).

Python:

    from turbovec import TurboQuantIndex

    index = TurboQuantIndex(dim=1536, bit_width=4)
    index.add(vectors)
    index.add(more_vectors)
    scores, indices = index.search(query, k=10)
    index.write("my_index.tq")
    loaded = TurboQuantIndex.load("my_index.tq")

Rust:

    use turbovec::TurboQuantIndex;

    let mut index = TurboQuantIndex::new(1536, 4);
    index.add(&vectors);
    let results = index.search(&queries, 10);
    index.write("index.tv").unwrap();
    let loaded = TurboQuantIndex::load("index.tv").unwrap();

TurboQuant vs FAISS IndexPQFastScan on OpenAI DBpedia d=1536 (100K vectors, 1K queries, k=64). FAISS PQ configurations sized to match TurboQuant compression ratios. TurboQuant requires zero training. FAISS PQ needs a training step (4-10 seconds). TurboQuant index build is 3-4x faster.

ARM (Apple Silicon M3 Max)

    Config    TQ speed   FAISS speed  Ratio  TQ recall@1  FAISS recall@1
    2-bit MT  0.125ms/q  0.128ms/q    0.97x  0.870        0.882
    2-bit ST  1.272ms/q  1.247ms/q    1.02x  0.870        0.882
    4-bit MT  0.232ms/q  0.246ms/q    0.94x  0.955        0.930
    4-bit ST  2.474ms/q  2.485ms/q    1.00x  0.955        0.930

On ARM, TurboQuant matches or beats FAISS on speed while requiring no training step. At 4-bit, TurboQuant recall is higher than FAISS (0.955 vs 0.930).

x86 (Intel Sapphire Rapids, 4 vCPUs)

    Config    TQ speed   FAISS speed  Ratio  TQ recall@1  FAISS recall@1
    2-bit MT  0.733ms/q  0.590ms/q    1.24x  0.870        0.882
    2-bit ST  1.443ms/q  1.208ms/q    1.19x  0.870        0.882
    4-bit MT  1.391ms/q  1.181ms/q    1.18x  0.955        0.930
    4-bit ST  2.998ms/q  2.477ms/q    1.21x  0.955        0.930

On x86, TurboQuant is within 18-25% of FAISS on speed. At 4-bit, TurboQuant recall is higher than FAISS (0.955 vs 0.930).
The speed gap is primarily from TurboQuant's rotation step (~5% of total time) and differences in AVX2 code generation v...
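
To make "data-oblivious, 2-4 bits per coordinate" more concrete, here is a minimal pure-Python sketch of the general idea: apply a fixed random rotation, then quantize each coordinate on a grid that depends only on the dimension and bit width. It is not turbovec's code. The uniform grid, the clipping `limit` of 3/sqrt(dim), and all function names (`random_rotation`, `quantize`, `dequantize`, `packed_bytes`) are assumptions made for illustration, and a real implementation would likely use a different codebook and a much faster rotation.

```python
import math
import random


def random_rotation(dim, seed=0):
    """Build a random orthogonal matrix by Gram-Schmidt on Gaussian rows."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for r in rows:  # remove components along rows already accepted
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:  # skip the (unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows


def rotate(matrix, vec):
    """Matrix-vector product; preserves the vector's length."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def quantize(vec, bit_width, limit):
    """Map each coordinate to one of 2**bit_width uniform cells on [-limit, limit].

    The grid is fixed in advance (no training on the data); values outside
    the range are clipped to the first or last cell.
    """
    levels = 1 << bit_width
    step = 2.0 * limit / levels
    return [min(max(math.floor((x + limit) / step), 0), levels - 1) for x in vec]


def dequantize(codes, bit_width, limit):
    """Reconstruct each coordinate as the midpoint of its cell."""
    step = 2.0 * limit / (1 << bit_width)
    return [-limit + (c + 0.5) * step for c in codes]


def packed_bytes(dim, bit_width):
    """Bytes needed for the codes alone (ignores any per-vector metadata)."""
    return math.ceil(dim * bit_width / 8)


if __name__ == "__main__":
    dim = 64  # small on purpose: this Gram-Schmidt is O(dim^3) in pure Python
    rot = random_rotation(dim, seed=1)
    rng = random.Random(2)
    raw = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in raw))
    unit = [x / norm for x in raw]
    limit = 3.0 / math.sqrt(dim)  # assumed clipping range for unit vectors
    rotated = rotate(rot, unit)
    for bits in (2, 4):
        approx = dequantize(quantize(rotated, bits, limit), bits, limit)
        err = sum((a - b) ** 2 for a, b in zip(rotated, approx))
        print(bits, "bits: squared error", round(err, 4),
              "| code bytes at d=1536:", packed_bytes(1536, bits))
```

For scale: a d=1536 float32 vector takes 6144 bytes, so codes alone at 4 bits (768 bytes) and 2 bits (384 bytes) correspond to 8x and 16x compression. Because nothing in the grid is fitted to the dataset, new vectors can be added at any time without retraining, which is the property the summary contrasts with FAISS PQ's training step.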

First seen: 2026-04-03 17:14

Last seen: 2026-04-03 18:15