Storage systems specialized in indexing and searching high-dimensional vectors efficiently, enabling semantic search and RAG applications at scale.
A vector database is a storage system optimized for storing, indexing, and searching high-dimensional vectors — typically embeddings generated by AI models. Unlike traditional databases that search by exact match, vector databases find the vectors most similar to a query.
They're the fundamental infrastructure for semantic search and RAG at scale.
Finding the nearest vector among millions naively requires computing the distance to every single one: O(n) per query. Vector databases use approximate nearest neighbor (ANN) indexes that cut this to roughly O(log n), sacrificing perfect precision for practical speed.
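For intuition, exact search is just a linear scan over the whole corpus. A minimal sketch in Python, with random vectors standing in for real embeddings:

```python
import math
import random

def cosine_distance(a, b):
    """1 - cosine similarity; 0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

def exact_nearest(corpus, query, top_k=5):
    """Brute-force nearest neighbors: one distance per stored vector, O(n) per query."""
    dists = [(cosine_distance(v, query), i) for i, v in enumerate(corpus)]
    return [i for _, i in sorted(dists)[:top_k]]

random.seed(0)
corpus = [[random.gauss(0, 1) for _ in range(64)] for _ in range(10_000)]
query = [random.gauss(0, 1) for _ in range(64)]
print(exact_nearest(corpus, query))  # indices of the 5 closest vectors
```

This is the cost an ANN index avoids: instead of touching all n vectors, it walks a graph or tree structure that visits only a small candidate set.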
HNSW (Hierarchical Navigable Small World), the most widely used ANN index, has three key parameters that control the tradeoff between speed and recall:
| Parameter | Effect when increased | Typical value |
|---|---|---|
| `m` (connections per node) | Better recall, more memory | 16–64 |
| `ef_construction` (candidates during build) | Better index quality, slower construction | 100–200 |
| `ef_search` (candidates during search) | Better recall, slower search | 50–200 |
The practical rule: start with `m = 16`, `ef_construction = 128`, `ef_search = 64` and adjust by measuring recall@10 against exact search.
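Measuring recall@k is a simple set overlap between the exact results and the index's results. A minimal sketch, with hypothetical ID lists standing in for real query output:

```python
def recall_at_k(exact_ids, approx_ids, k=10):
    """Fraction of the true top-k neighbors that the approximate index returned."""
    return len(set(exact_ids[:k]) & set(approx_ids[:k])) / k

# Hypothetical results: ground truth from exact search vs. an ANN index's answer
exact  = [3, 17, 42, 8, 99, 5, 61, 23, 70, 11]
approx = [3, 17, 42, 8, 99, 5, 61, 23, 12, 54]  # missed two of the true top 10

print(recall_at_k(exact, approx))  # 0.8
```

If recall@10 stays below your target, raise `ef_search` first (a query-time knob); rebuilding with a higher `m` is the more expensive fix.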
Dedicated vector databases, compared:

| Database | Characteristics |
|---|---|
| Pinecone | Serverless, managed, easy to start |
| Weaviate | Open-source, GraphQL, integrated ML modules |
| Qdrant | Open-source, Rust, advanced filters |
| Milvus | Open-source, scalable, CNCF project |
| Chroma | Open-source, embeddable, ideal for prototypes |
Several general-purpose databases also support vector search through extensions:

| Database | Extension |
|---|---|
| PostgreSQL | pgvector |
| Elasticsearch | Dense vector field |
| Redis | Redis Stack (RediSearch) |
| MongoDB | Atlas Vector Search |
pgvector is the most pragmatic option when you already have PostgreSQL — it avoids adding another service to your infrastructure.
```sql
-- Enable extension and create table
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}',
    embedding vector(1536)
);

-- Create HNSW index for fast cosine-distance search
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);

-- Semantic search: top 5 most similar documents
SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 5;
```

Purely vector search fails when the user searches for an exact term (a product name, an error code). Hybrid search combines semantic similarity with full-text search:
```sql
-- Hybrid search: combine vector + full-text with Reciprocal Rank Fusion
WITH semantic AS (
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY embedding <=> $1::vector) AS rank_s
    FROM documents
    ORDER BY embedding <=> $1::vector
    LIMIT 20
),
keyword AS (
    SELECT id,
           ROW_NUMBER() OVER (
               ORDER BY ts_rank(to_tsvector(content), plainto_tsquery($2)) DESC
           ) AS rank_k
    FROM documents
    WHERE to_tsvector(content) @@ plainto_tsquery($2)
    ORDER BY rank_k
    LIMIT 20
)
SELECT COALESCE(s.id, k.id) AS id,
       COALESCE(1.0 / (60 + s.rank_s), 0) + COALESCE(1.0 / (60 + k.rank_k), 0) AS rrf_score
FROM semantic s
FULL OUTER JOIN keyword k ON s.id = k.id
ORDER BY rrf_score DESC
LIMIT 5;
```

Note the explicit `ORDER BY` before each `LIMIT 20`: without it, PostgreSQL may return an arbitrary 20 rows rather than the top-ranked ones. This pattern uses Reciprocal Rank Fusion (RRF) to combine the rankings from both searches. The constant 60 is the conventional smoothing value: it keeps any single top-ranked result from dominating the combined score.
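The fusion itself is easy to verify outside the database. A minimal RRF sketch in Python, with hypothetical rank lists in place of real query results:

```python
def rrf_fuse(rankings, k=60, top_n=5):
    """Reciprocal Rank Fusion: score(d) = sum over each ranking of 1 / (k + rank(d))."""
    scores = {}
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical results: semantic and keyword search rank documents differently
semantic_ids = ["a", "b", "c", "d"]
keyword_ids  = ["c", "a", "e"]

print(rrf_fuse([semantic_ids, keyword_ids]))  # → ['a', 'c', 'b', 'e', 'd']
```

Documents that appear in both lists ("a" and "c") accumulate two reciprocal-rank terms and rise to the top, which is exactly the behavior the SQL query produces.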
Vector databases are the infrastructure that makes semantic search and RAG possible at scale. For teams already using PostgreSQL, pgvector eliminates the need for an additional service and covers most cases up to millions of vectors. When volume exceeds that or complex filters with low latency are needed, dedicated databases like Qdrant or Pinecone justify the additional operational complexity.
Related concepts:

- **Embeddings** — dense vector representations that capture the semantic meaning of text, images, or other data in a numerical space where proximity reflects conceptual similarity.
- **Semantic search** — information retrieval technique that uses vector embeddings to find results by meaning, not just exact keyword matching.
- **RAG** — architectural pattern that combines information retrieval from external sources with LLM text generation, reducing hallucinations and keeping knowledge current without retraining the model.