Show HN: Building a web search engine from scratch with 3B neural embeddings
TL;DR Highlight
One developer built a web search engine with 3 billion SBERT embeddings and 280 million indexed pages in 2 months — showing the real architecture and cost structure of a vector search-based system at scale.
Who Should Read
Backend/infrastructure engineers building vector search or embedding-based retrieval systems, or developers curious about the real-world architecture of large-scale crawling and indexing pipelines.
Core Mechanics
- Instead of keyword matching, uses SBERT (multi-qa-mpnet-base-dot-v1, 768-dim) embeddings for intent-based search: a vague query like 'why isn't CORS working' can surface the exact relevant answer (e.g., eventual-consistency caveats buried in the S3 docs) even with no keyword overlap.
- Generated 3 billion embeddings at 100K/sec using 250 Runpod RTX 4090 GPUs — total compute cost under $50K.
- Used Hetzner auction servers (42x cheaper than AWS for high-memory) and Runpod GPUs (4.3x cheaper) to dramatically cut costs vs. mainstream cloud.
- Built custom HNSW + RocksDB instead of managed vector DB services for cost control at scale.
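The scoring behind this kind of embedding search can be sketched in a few lines. Below is a minimal illustration of ranking by dot product, the similarity metric multi-qa-mpnet-base-dot-v1 is trained for; the document IDs and embedding values are toy stand-ins, not real model output:

```rust
// Rank documents by dot product of precomputed embeddings — the
// similarity metric multi-qa-mpnet-base-dot-v1 is trained for.
// IDs and values are toy stand-ins, not real model output.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn rank<'a>(query: &[f32], docs: &'a [(&'a str, Vec<f32>)]) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = docs
        .iter()
        .map(|(id, emb)| (*id, dot(query, emb)))
        .collect();
    // Highest similarity first (assumes no NaN scores).
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored
}

fn main() {
    let docs = vec![
        ("s3-consistency-doc", vec![0.9, 0.1, 0.0]),
        ("cors-guide", vec![0.2, 0.8, 0.1]),
    ];
    let query = vec![1.0, 0.0, 0.0]; // a toy 3-dim "query embedding"
    println!("best match: {}", rank(&query, &docs)[0].0);
}
```

In a real system the same ranking runs against an ANN index rather than a brute-force scan, but the scoring function is identical.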
Evidence
- Community reactions included praise ('this is a 10x engineer building Google in spare time') and even offers of seed investment, prompted by the sub-$50K cost figure.
- Limitations of pure vector search were noted: searching 'garbanzo bean stew' returned recipes for the wrong beans, highlighting the need for hybrid approaches that combine keyword and semantic search.
- Architecture choices (Hetzner over AWS, custom HNSW over managed vector DB) were validated as pragmatic cost optimizations.
How to Apply
- For embedding-based search systems: use Hetzner auction servers (42x cheaper for high-memory machines) and Runpod GPUs (4.3x cheaper) instead of AWS to cut infrastructure costs dramatically. Consider custom HNSW + RocksDB over a managed vector DB service like Turbopuffer when cost control at scale matters.
- To improve chunking quality in RAG pipelines: go beyond simple token-count splitting. Use semantic boundaries (paragraph breaks, section headers) and overlap chunks for better retrieval.
- Pure vector search has limitations — consider hybrid search combining BM25 keyword matching with semantic embedding similarity for production systems.
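One simple, common way to implement the hybrid combination is reciprocal rank fusion (RRF). The post only says 'hybrid', so RRF and the document names below are illustrative assumptions, not the author's method:

```rust
use std::collections::HashMap;

// Reciprocal rank fusion: merge a BM25 ranking and a vector-similarity
// ranking into one fused ranking. Each list contributes 1/(k + rank)
// per document, so agreement across rankings is rewarded and
// low-ranked hits are dampened. Document names are made up.
fn rrf(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (rank, doc) in ranking.iter().enumerate() {
            // rank is 0-based here, hence the +1 for a 1-based rank.
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let bm25 = vec!["chickpea-stew", "black-bean-chili", "hummus"];
    let vector = vec!["chickpea-stew", "lentil-soup", "black-bean-chili"];
    let fused = rrf(&[bm25, vector], 60.0); // k = 60 is the value from the RRF paper
    println!("top hit: {}", fused[0].0);
}
```

RRF needs only the two rank lists, no score normalization, which is why it is a popular first step before tuning a weighted score blend.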
Code Example
snippet
// RocksDB tuning for the vector index storage layer (Rust)
use rocksdb::Options;

let mut opt = Options::default();
opt.set_max_background_jobs(num_cpus::get() as i32 * 2);
opt.set_bytes_per_sync(1024 * 1024 * 4); // flush to disk every 4MB written
opt.set_enable_blob_files(true);
opt.set_min_blob_size(1024); // separate values over 1KB into BlobDB
// Block cache: 32GB, write buffer: 256MB
Terminology
SBERT: A model that converts entire sentences into single vectors (arrays of numbers). Comparing two sentence vectors gives you semantic similarity.
HNSW: Hierarchical Navigable Small World, an index structure for quickly finding the most similar vectors among billions. An approximate nearest-neighbor algorithm that trades slight accuracy for speed.
RocksDB: A high-performance key-value store from Meta, optimized for SSD storage. Used here as the underlying storage engine for the vector index.
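To make the HNSW entry concrete, here is a minimal sketch of the greedy graph search it is built on, reduced to a single layer. Real HNSW keeps a hierarchy of layers and a beam of candidates, so this is an illustration of the core idea, not the full algorithm:

```rust
// Greedy nearest-neighbor search on one graph layer: hop to whichever
// neighbor is closer to the query, stop at a local minimum. This is
// approximate — it can miss the true nearest neighbor — in exchange
// for touching only a tiny fraction of the vectors.
fn greedy_search(
    vectors: &[Vec<f32>],
    neighbors: &[Vec<usize>], // adjacency list: neighbors[i] = ids linked to i
    query: &[f32],
    entry: usize,
) -> usize {
    // Squared Euclidean distance to the query.
    let dist = |i: usize| -> f32 {
        vectors[i].iter().zip(query).map(|(a, b)| (a - b).powi(2)).sum()
    };
    let mut current = entry;
    let mut best = dist(current);
    loop {
        let step = neighbors[current]
            .iter()
            .map(|&n| (n, dist(n)))
            .filter(|&(_, d)| d < best)
            .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
        match step {
            Some((n, d)) => {
                current = n;
                best = d;
            }
            None => return current, // no neighbor is closer: local minimum
        }
    }
}

fn main() {
    // Four 1-dim vectors linked in a chain: 0 - 1 - 2 - 3.
    let vectors = vec![vec![0.0], vec![1.0], vec![2.0], vec![3.0]];
    let neighbors = vec![vec![1], vec![0, 2], vec![1, 3], vec![2]];
    let nearest = greedy_search(&vectors, &neighbors, &[2.9], 0);
    println!("nearest id: {}", nearest); // walks 0 -> 1 -> 2 -> 3
}
```

HNSW layers this search: upper, sparser layers find a good entry point fast, and the bottom layer refines it with a candidate beam instead of a single current node.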