Proposed standard for publishing a Markdown file at a website's root that enables language models to efficiently understand and use the site's content at inference time.
llms.txt is a standard proposed by Jeremy Howard (fast.ai) in September 2024 for placing a Markdown file at a website's /llms.txt path. Its purpose is to offer language models a concise, structured, and readable version of the site's most important content — without the noise of HTML, navigation, ads, or JavaScript.
It's analogous to robots.txt and sitemap.xml, but with a different goal:
- robots.txt tells crawlers what access is acceptable
- sitemap.xml lists all indexable pages for search engines
- llms.txt provides a curated summary and links to detailed content for language models

Language models face a fundamental limitation when interacting with websites: context windows are too small to process an entire site, and converting complex HTML to plain text is imprecise and noisy.
llms.txt solves this by providing:

- a concise, curated summary of the site's most important content
- direct links to clean Markdown versions of the detailed pages
- a predictable structure that is cheap to parse and fits within a context window
The primary use case is during inference, when a user asks a language model for information. For example, a coding assistant asked about a library can fetch the library site's /llms.txt, read the curated index, and follow only the links it needs instead of scraping HTML pages.
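As a rough illustration, here is a minimal sketch in TypeScript of how an agent-side tool might retrieve the file at inference time. The function name and error handling are illustrative assumptions; the spec only defines the /llms.txt path.

```typescript
// Minimal sketch: fetch a site's llms.txt before answering questions about it.
async function fetchLlmsTxt(origin: string): Promise<string | null> {
  const url = new URL("/llms.txt", origin).toString();
  const res = await fetch(url);
  if (!res.ok) return null; // Site does not publish the file.
  return res.text(); // Markdown, ready to drop into the model's context.
}

// Usage (illustrative):
// const context = await fetchLlmsTxt("https://example.com");
```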
The file follows a specific Markdown structure:
```markdown
# Project name

> Brief description with key information

Additional details about the project.

## Section

- [Link title](https://url): Optional notes about the file

## Optional

- [Link title](https://url): Secondary content that can be skipped
```

The "Optional" section has special meaning: links there can be skipped if a shorter context is needed.
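Because the layout is this regular, it can be consumed with a few regular expressions. Here is a minimal parsing sketch in TypeScript; the type names and regexes are my own illustration, assuming the canonical layout shown above, not a reference implementation.

```typescript
interface LlmsLink { title: string; url: string; notes?: string }
interface LlmsDoc { title: string; summary?: string; sections: Record<string, LlmsLink[]> }

// Parse the canonical llms.txt layout: H1 title, optional blockquote summary,
// then H2 sections containing "- [title](url): notes" link lists.
function parseLlmsTxt(text: string): LlmsDoc {
  const doc: LlmsDoc = { title: "", sections: {} };
  let current = "";
  for (const line of text.split("\n")) {
    const h1 = line.match(/^# (.+)/);
    if (h1) { doc.title = h1[1].trim(); continue; }
    const quote = line.match(/^> (.+)/);
    if (quote && !doc.summary) { doc.summary = quote[1].trim(); continue; }
    const h2 = line.match(/^## (.+)/);
    if (h2) { current = h2[1].trim(); doc.sections[current] ??= []; continue; }
    const link = line.match(/^- \[(.+?)\]\((\S+?)\)(?::\s*(.*))?$/);
    if (link && current) {
      doc.sections[current].push({ title: link[1], url: link[2], notes: link[3] });
    }
  }
  return doc;
}
```

A consumer trimming for context can drop the parsed "Optional" section first, which is exactly the semantics the spec assigns to it.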
Many sites publish expanded variants:
- /llms.txt — the base file with summary and links
- /llms-full.txt — expanded version with the full content of each link embedded

This site publishes two files auto-generated by the knowledge pipeline:

- /llms.txt — index with title, type, and English summary for each knowledge node
- /llms-full.txt — full content of each article in plain format

Both are regenerated with every pnpm generate run and served as static files from public/.
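As a sketch only: the real pipeline's data model is not shown here, so the node shape, site URL handling, and output paths below are assumptions. A generation step of this kind might look like:

```typescript
import { writeFileSync } from "node:fs";

// Hypothetical shape of a knowledge node; the actual pipeline's model may differ.
interface KnowledgeNode { title: string; type: string; slug: string; summaryEn: string; body: string }

// Emit both variants into public/ so they are served as static files.
function generateLlmsFiles(site: string, nodes: KnowledgeNode[]): void {
  const index = nodes
    .map((n) => `- [${n.title}](${site}/${n.slug}) (${n.type}): ${n.summaryEn}`)
    .join("\n");
  writeFileSync("public/llms.txt", `# Knowledge base\n\n## Nodes\n\n${index}\n`);

  const full = nodes.map((n) => `# ${n.title}\n\n${n.body}`).join("\n\n---\n\n");
  writeFileSync("public/llms-full.txt", full + "\n");
}
```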
| Standard | Audience | Purpose |
|---|---|---|
| robots.txt | Crawlers | Access control |
| sitemap.xml | Search engines | Page index |
| llms.txt | Language models | Curated site summary |
| MCP | AI agents | Tools and context protocol |
llms.txt and MCP are complementary: llms.txt provides static readable content, while MCP enables dynamic interactions with tools and services.
Since its proposal in 2024, llms.txt has been adopted by technical documentation projects, e-commerce sites, educational institutions, and personal websites. The specification is deliberately simple — a Markdown file with minimal conventions — making it easy to adopt without specialized tooling.