Vector Databases
Storage systems specialized in indexing and searching high-dimensional vectors efficiently, enabling semantic search and RAG applications at scale.
What it is
A vector database is a storage system optimized for storing, indexing, and searching high-dimensional vectors — typically embeddings generated by AI models. Unlike traditional databases that search by exact match, vector databases find the vectors most similar to a query.
They're the fundamental infrastructure for semantic search and RAG at scale.
Why not use a traditional database?
Finding the nearest vector among millions with a linear scan means computing a distance to every stored vector, i.e. O(n) per query. Vector databases instead build Approximate Nearest Neighbor (ANN) indexes that answer queries in sublinear time (roughly O(log n) for graph-based indexes like HNSW), trading a small loss in recall for orders-of-magnitude speedups.
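To make the O(n) cost concrete, here is a minimal brute-force ("flat") search in NumPy. The data is random toy vectors, not real embeddings; note that the score computation touches every stored vector on every query.

```python
import numpy as np

# Brute-force ("flat") nearest-neighbor search: one similarity
# computation per stored vector, i.e. O(n) work per query.
def flat_search(vectors: np.ndarray, query: np.ndarray, k: int = 3) -> np.ndarray:
    # Cosine similarity = dot product of L2-normalized vectors
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = v @ q                     # n dot products: this is the O(n) cost
    return np.argsort(-scores)[:k]    # indices of the k most similar vectors

rng = np.random.default_rng(0)
db = rng.normal(size=(10_000, 64))    # 10k toy "embeddings"
ids = flat_search(db, db[42], k=3)
print(ids[0])  # → 42: the query vector is its own nearest neighbor
```

This is exactly the "Flat" strategy from the index list below: perfectly accurate, but every query scans the whole collection.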
Indexing algorithms
- HNSW (Hierarchical Navigable Small World): layered proximity graphs traversed greedily from coarse to fine. The most popular choice for its speed/recall balance
- IVF (Inverted File Index): partitions space into clusters and searches only the nearest ones
- PQ (Product Quantization): compresses vectors to reduce memory, useful for massive datasets
- Flat: exact search without index — precise but slow, only for small datasets
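The IVF idea from the list above can be sketched in a few lines of NumPy: partition the vectors into clusters with k-means, keep an inverted list per cluster, and at query time scan only the `nprobe` clusters whose centroids are closest to the query. This is a toy illustration, not a production index.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(5_000, 32)).astype(np.float32)

def nearest(points, centers):
    """Index of the nearest center for each point (squared L2)."""
    return np.argmin(((points[:, None] - centers[None]) ** 2).sum(-1), axis=1)

# --- Build: partition the space with a few k-means rounds ---
n_clusters = 16
centroids = data[rng.choice(len(data), n_clusters, replace=False)]
for _ in range(5):
    assign = nearest(data, centroids)
    for c in range(n_clusters):
        if (assign == c).any():
            centroids[c] = data[assign == c].mean(axis=0)
assign = nearest(data, centroids)                        # final assignment
lists = {c: np.where(assign == c)[0] for c in range(n_clusters)}

# --- Search: scan only the nprobe clusters closest to the query ---
def ivf_search(query, k=5, nprobe=2):
    order = np.argsort(((centroids - query) ** 2).sum(-1))
    cand = np.concatenate([lists[c] for c in order[:nprobe]])
    dists = ((data[cand] - query) ** 2).sum(-1)
    return cand[np.argsort(dists)[:k]]

print(ivf_search(data[7], k=1)[0])  # → 7: its own cluster is probed first
```

With `nprobe=2` of 16 clusters, each query scans roughly an eighth of the data; raising `nprobe` trades speed back for recall, which is the knob real IVF indexes expose.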
Vector database options
Dedicated databases
| Database | Characteristics |
|---|---|
| Pinecone | Serverless, managed, easy to start |
| Weaviate | Open-source, GraphQL, integrated ML modules |
| Qdrant | Open-source, Rust, advanced filters |
| Milvus | Open-source, scalable, CNCF project |
| Chroma | Open-source, embeddable, ideal for prototypes |
Extensions for existing databases
| Database | Extension |
|---|---|
| PostgreSQL | pgvector |
| Elasticsearch | `dense_vector` field type + kNN search |
| Redis | Redis Stack (RediSearch) |
| MongoDB | Atlas Vector Search |
Design considerations
- Dimensionality: more dimensions = more storage and compute. 1536 (OpenAI) vs 384 (MiniLM) makes a difference at scale
- Hybrid filtering: combining vector search with metadata filters (date, category, user)
- Updates: some indexes are expensive to update — consider batch strategies
- Consistency: many vector databases prioritize availability over strict consistency
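Two of the considerations above can be made concrete. On dimensionality: a float32 vector takes 4 bytes per dimension, so 384 dimensions is ~1.5 KB per vector while 1536 is ~6 KB, roughly 6 GB per million vectors before any index overhead. On hybrid filtering, the sketch below (toy data, pre-filtering strategy) applies the metadata filter first and ranks only the surviving vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
# 384-dim float32 vectors: 384 * 4 bytes ≈ 1.5 KB each
vectors = rng.normal(size=(1_000, 384)).astype(np.float32)
categories = rng.choice(["blog", "docs", "faq"], size=1_000)  # toy metadata

def hybrid_search(query, category, k=3):
    # Pre-filter on metadata, then rank survivors by cosine similarity
    idx = np.where(categories == category)[0]
    v = vectors[idx] / np.linalg.norm(vectors[idx], axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return idx[np.argsort(-(v @ q))[:k]]

hits = hybrid_search(vectors[0], categories[0], k=3)
print(categories[hits])  # every hit satisfies the metadata filter
```

Real vector databases also support post-filtering (filter after the ANN search) and filter-aware index traversal; which strategy wins depends on how selective the filter is.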
Typical usage pattern
1. Document → Chunking → Embedding model → Vector
2. Vector + metadata → Vector DB (insert)
3. Query → Embedding model → Vector
4. Vector → Vector DB (search) → Top-K results
5. Results → LLM context → Response
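The five steps above can be sketched end to end. The `embed` function here is a stand-in (a hashed bag-of-words, not a real embedding model) and the store is a plain in-memory list; a real deployment would call an embedding API and a vector database client instead.

```python
import hashlib
import numpy as np

DIM = 256

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: hashed bag-of-words (NOT a real model)."""
    v = np.zeros(DIM)
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        v[h % DIM] += 1.0
    return v / (np.linalg.norm(v) or 1.0)

# Steps 1-2: chunk documents, embed, insert vector + metadata
store = []  # list of (vector, metadata): a toy stand-in for a vector DB
docs = ["vector databases index embeddings",
        "postgres is a relational database",
        "hnsw builds a navigable graph"]
for i, chunk in enumerate(docs):
    store.append((embed(chunk), {"id": i, "text": chunk}))

# Steps 3-4: embed the query, retrieve top-k by cosine similarity
def search(query: str, k: int = 2):
    q = embed(query)
    scored = sorted(store, key=lambda item: -float(item[0] @ q))
    return [meta for _, meta in scored[:k]]

# Step 5: the retrieved chunks would go into the LLM's context
results = search("how do vector databases index embeddings")
print(results[0]["text"])
```

Swapping `embed` for a real model and `store` for a vector DB client turns this into the standard RAG retrieval loop.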
Why it matters
Vector databases are the infrastructure that makes semantic search and RAG possible at scale. Choosing between pgvector, Pinecone, Weaviate, or Qdrant depends on data volume, latency requirements, and whether you already have PostgreSQL in your stack.
References
- Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs — Malkov & Yashunin, 2016. The original HNSW paper.
- pgvector: Open-source vector similarity search for Postgres — PostgreSQL extension for vectors.
- Qdrant Documentation — Qdrant, 2024. High-performance vector database.