Information retrieval technique that uses vector embeddings to find results by meaning, not just exact keyword matching.
Semantic search is an information retrieval technique that goes beyond literal word matching. Instead of checking whether a document contains the exact query terms, it converts both the query and documents into numerical vectors — called embeddings — and measures similarity between them in a high-dimensional vector space.
If a user searches for "how to deploy to the cloud," keyword search would only find documents containing those exact words. Semantic search would also surface documents about "AWS deployment," "serverless infrastructure," or "CI/CD with containers" — because it understands the meaning is similar.
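Similarity between embeddings is usually measured with cosine similarity, the cosine of the angle between two vectors. A minimal sketch in TypeScript — the three-dimensional "embeddings" here are toy values for illustration, not real model output:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). For real-valued vectors the
// result lies in [-1, 1]; closer to 1 means more similar direction.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (real models produce hundreds of dimensions).
const cloudDeploy = [0.9, 0.1, 0.2];
const awsDeploy = [0.8, 0.2, 0.3];
const cookingPasta = [0.1, 0.9, 0.1];

console.log(cosine(cloudDeploy, awsDeploy).toFixed(3));   // 0.983 — similar meaning
console.log(cosine(cloudDeploy, cookingPasta).toFixed(3)); // 0.237 — unrelated
```

The absolute numbers are meaningless on their own; what matters is the ordering, which is why results are ranked by score rather than filtered by an exact match.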
The process has three phases:

1. Embedding: a language model transforms text into fixed-dimension vectors. Each vector captures the semantic meaning of the text in a space where similar texts end up close together. Common models for this include sentence-transformer models such as all-MiniLM-L6-v2, the model this site's first implementation used.
2. Indexing: document embeddings are stored in a structure that enables efficient similarity search. Options range from a simple in-memory array to specialized vector databases.
3. Query: the user's query is converted into an embedding using the same model, and cosine similarity is calculated against every indexed document. Results are ranked by similarity score.
query → embedding → cosine similarity → ranked results
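That flow fits in a few lines of TypeScript. The sketch below assumes the query embedding comes from the same model that produced the document embeddings, and that the index is a simple in-memory array:

```typescript
interface IndexedDoc {
  slug: string;
  embedding: number[]; // pre-computed with the same model used for queries
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Linear scan: score every document against the query, sort, take the top K.
function semanticSearch(queryEmbedding: number[], docs: IndexedDoc[], topK = 5) {
  return docs
    .map((doc) => ({ slug: doc.slug, score: cosine(queryEmbedding, doc.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

A linear scan costs O(n) per query, which is fine for dozens or hundreds of documents; specialized vector databases exist precisely to avoid it at larger scales.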
The first semantic search implementation on this site used Transformers.js with the Xenova/all-MiniLM-L6-v2 model running directly in the browser via WebAssembly. Embeddings were pre-computed at build time and passed as props to the search page.
Why it was removed: four issues made it unviable in production. Among them, the tensor output API was inconsistent: tolist(), .data, and output[0] all behaved differently depending on context, causing silent failures. The decision and technical details are documented in issue #9 of the repository.
The current search is keyword-based: instant matching against titles, summaries, and tags. Zero dependencies, works synchronously, no loading state needed.
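A sketch of what that keyword matching can look like — the Entry shape and field names here are illustrative, not the site's actual types:

```typescript
interface Entry {
  title: string;
  summary: string;
  tags: string[];
}

// Case-insensitive substring match over titles, summaries, and tags.
// Synchronous and dependency-free, so it can run on every keystroke.
function keywordSearch(query: string, entries: Entry[]): Entry[] {
  const q = query.trim().toLowerCase();
  if (!q) return [];
  return entries.filter(
    (e) =>
      e.title.toLowerCase().includes(q) ||
      e.summary.toLowerCase().includes(q) ||
      e.tags.some((t) => t.toLowerCase().includes(q))
  );
}

const entries: Entry[] = [
  { title: "Semantic search", summary: "Search by meaning", tags: ["ai", "search"] },
  { title: "Cooking pasta", summary: "Boil water", tags: ["food"] },
];
console.log(keywordSearch("search", entries).map((e) => e.title)); // matches the first entry only
```

The tradeoff is exactly the one described earlier: this finds "search" but not "retrieval", unless the author tagged for it.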
Pre-computed embeddings are still generated at build time (scripts/generate-embeddings.ts) and stored in public/embeddings.json — ready for future use.
To implement semantic search robustly, there are several options depending on scale:
Run the embedding model in a Node.js API route (not in the browser). The model loads once at server startup and processes queries in milliseconds. Viable for sites with moderate traffic.
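The performance of such a route hinges on initializing the model once per process, not once per request. A minimal sketch of that caching pattern, with loadModel as a hypothetical stand-in for the real initialization (e.g. creating a Transformers.js feature-extraction pipeline):

```typescript
type Embedder = (text: string) => Promise<number[]>;

// Stand-in for the expensive model load; loadCount just makes the caching visible.
let loadCount = 0;
async function loadModel(): Promise<Embedder> {
  loadCount++;
  // A real implementation would load the embedding model here.
  // This stub maps the first 4 characters to char codes so the sketch runs.
  return async (text) => Array.from(text.slice(0, 4), (c) => c.charCodeAt(0));
}

// Module-level cache: the first request triggers the load; later requests
// reuse the same pending or resolved promise.
let embedderPromise: Promise<Embedder> | null = null;
function getEmbedder(): Promise<Embedder> {
  embedderPromise ??= loadModel();
  return embedderPromise;
}

// Hypothetical API route handler body.
async function handleSearch(query: string): Promise<number[]> {
  const embed = await getEmbedder();
  return embed(query);
}
```

Caching the promise rather than the resolved model also prevents two concurrent first requests from loading the model twice.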
For large collections, a specialized vector database handles efficient indexing and approximate nearest-neighbor search at scale.
For static sites with a few dozen documents, pre-computing embeddings at build time and doing cosine search on the client is viable — as long as the model isn't downloaded in the browser. Pre-computed embeddings weigh kilobytes, not megabytes.
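The size claim is easy to check. all-MiniLM-L6-v2 produces 384-dimensional vectors; assuming 32-bit floats and an illustrative 50 documents:

```typescript
const dims = 384;        // all-MiniLM-L6-v2 output dimension
const bytesPerFloat = 4; // Float32
const docs = 50;         // "a few dozen" documents (assumed)

const totalBytes = dims * bytesPerFloat * docs;
console.log(`${(totalBytes / 1024).toFixed(0)} KB`); // 75 KB
```

JSON stores numbers as decimal text, so a real embeddings.json is a few times larger than this, but still kilobytes — versus tens of megabytes for the model itself.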
Combining keyword search and semantic search typically yields better results than either alone: the usual pattern is to run both and merge the two ranked lists.
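One common way to merge the two ranked lists is reciprocal rank fusion (RRF), which needs only each document's position in each list, not comparable scores. A sketch, with illustrative document slugs:

```typescript
// Reciprocal rank fusion: score(d) = Σ 1 / (k + rank_i(d)).
// k = 60 is the conventional default; it damps the influence of top ranks.
function rrf(rankings: string[][], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// A document ranked by both keyword and semantic search rises to the top.
const keywordResults = ["deploy-aws", "ci-cd", "docker"];
const semanticResults = ["serverless", "deploy-aws", "kubernetes"];
console.log(rrf([keywordResults, semanticResults])[0].id); // "deploy-aws"
```

Because RRF only uses ranks, it sidesteps the problem that keyword scores and cosine similarities live on incomparable scales.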
Semantic search adds real value when users can't be expected to know the exact terminology, or when the same concept appears under many different names.
For small collections with predictable vocabulary, keyword search is simpler, faster, and effective enough.
Keyword search fails when the user doesn't know the exact terminology. Semantic search closes that gap by understanding the intent behind the query, making it essential for knowledge bases, technical documentation, and any system where the discovery experience matters.