Concepts

llms.txt

Proposed standard for publishing a Markdown file at a website's root that enables language models to efficiently understand and use the site's content at inference time.

growing · #llms-txt #ai #web-standards #seo #agents #markdown #inference

What it is

llms.txt is a standard proposed by Jeremy Howard (fast.ai) in September 2024 for placing a Markdown file at a website's /llms.txt path. Its purpose is to offer language models a concise, structured, and readable version of the site's most important content — without the noise of HTML, navigation, ads, or JavaScript.

It's analogous to robots.txt and sitemap.xml, but with a different goal:

  • robots.txt tells crawlers what access is acceptable
  • sitemap.xml lists all indexable pages for search engines
  • llms.txt provides a curated summary and links to detailed content for language models

Why it matters

Language models face a fundamental limitation when interacting with websites: context windows are too small to process an entire site, and converting complex HTML to plain text is imprecise and noisy.

llms.txt solves this by providing:

  1. Immediate context — a site summary that fits in a context window
  2. Structured navigation — links to detailed Markdown files organized by section
  3. Curated information — only relevant content, no duplication or noise
  4. Human- and machine-readable format — Markdown is easy for people to read and well understood by current LLMs

How it's used

At inference time

The primary use case is during inference — when a user asks a language model for information. For example:

  • A developer includes a library's documentation in their AI-assisted IDE
  • A chatbot with search capability queries a site to answer questions
  • An AI agent needs to understand a service's structure to interact with it
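All three scenarios reduce to the same step: resolve the site's /llms.txt URL, fetch it, and prepend the Markdown to the model's context. A minimal sketch, assuming a Node 18+ runtime with a global fetch; the helper names `llmsTxtUrl` and `loadSiteContext` are illustrative, not part of the proposal:

```typescript
// Resolve the conventional /llms.txt location for a site origin.
// The proposal places the file at the site root.
function llmsTxtUrl(origin: string): string {
  return new URL("/llms.txt", origin).toString();
}

// Fetch the file if the site publishes it; the returned Markdown
// can be prepended to a prompt as site context.
async function loadSiteContext(origin: string): Promise<string | null> {
  const res = await fetch(llmsTxtUrl(origin));
  if (!res.ok) return null; // Site does not publish llms.txt.
  return res.text();
}
```

A graceful null on a 404 matters here: llms.txt is opt-in, so an agent should treat its absence as normal and fall back to whatever HTML processing it would otherwise do.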

File format

The file follows a specific Markdown structure:

# Project name
 
> Brief description with key information
 
Additional details about the project.
 
## Section
 
- [Link title](https://url): Optional notes about the file
 
## Optional
 
- [Link title](https://url): Secondary content that can be skipped

The "Optional" section has special meaning: links there can be skipped if a shorter context is needed.
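Because the structure above is so regular, a consumer can parse it line by line: an H1 title, a blockquote summary, H2 sections, and hyphenated link entries. A sketch of such a parser, assuming exactly the structure shown; the type names (`LlmsLink`, `LlmsSection`) are illustrative:

```typescript
interface LlmsLink { title: string; url: string; notes?: string }
interface LlmsSection { name: string; optional: boolean; links: LlmsLink[] }
interface LlmsDoc { title: string; summary: string; sections: LlmsSection[] }

// Parse an llms.txt document into its title, summary, and linked sections.
// The "Optional" section is flagged so callers can skip it under tight context.
function parseLlmsTxt(text: string): LlmsDoc {
  const doc: LlmsDoc = { title: "", summary: "", sections: [] };
  for (const line of text.split("\n")) {
    if (line.startsWith("# ")) {
      doc.title = line.slice(2).trim();
    } else if (line.startsWith("> ")) {
      doc.summary = line.slice(2).trim();
    } else if (line.startsWith("## ")) {
      const name = line.slice(3).trim();
      doc.sections.push({ name, optional: name === "Optional", links: [] });
    } else {
      // "- [Title](url): optional notes"
      const m = line.match(/^-\s*\[(.+?)\]\((\S+?)\)(?::\s*(.*))?$/);
      if (m && doc.sections.length > 0) {
        const link: LlmsLink = { title: m[1], url: m[2] };
        if (m[3]) link.notes = m[3];
        doc.sections[doc.sections.length - 1].links.push(link);
      }
    }
  }
  return doc;
}
```

With the parsed structure in hand, trimming for a small context window is just filtering out sections where `optional` is true.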

Common variants

Many sites publish expanded variants:

  • /llms.txt — the base file with summary and links
  • /llms-full.txt — expanded version with the full content of each link embedded

Implementation on this site

This site publishes two files auto-generated by the knowledge pipeline:

  • /llms.txt — index with title, type, and English summary for each knowledge node
  • /llms-full.txt — full content of each article in plain format

Both are regenerated with every pnpm generate run and served as static files from public/.
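A generation step like this one can be a simple render over the node list. The following is a hypothetical sketch, not the site's actual pipeline code: the node fields (`title`, `type`, `summary`, `slug`), the section name, and the example.com base URL are all assumptions:

```typescript
interface KnowledgeNode { title: string; type: string; summary: string; slug: string }

// Render an llms.txt index from knowledge nodes: one H1 header with a
// blockquote tagline, then one link entry per node.
function renderLlmsTxt(siteName: string, tagline: string, nodes: KnowledgeNode[]): string {
  const header = `# ${siteName}\n\n> ${tagline}\n`;
  const entries = nodes
    .map(n => `- [${n.title}](https://example.com/${n.slug}): (${n.type}) ${n.summary}`)
    .join("\n");
  return `${header}\n## Knowledge\n\n${entries}\n`;
}
```

The output is deterministic for a given node list, so writing it to public/ on every generate run keeps the static file in sync with the content with no extra state.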

Relationship with other standards

  Standard      Audience          Purpose
  robots.txt    Crawlers          Access control
  sitemap.xml   Search engines    Page index
  llms.txt      Language models   Curated site summary
  MCP           AI agents         Tools and context protocol

llms.txt and MCP are complementary: llms.txt provides static readable content, while MCP enables dynamic interactions with tools and services.

Adoption

Since its proposal in 2024, llms.txt has been adopted by technical documentation projects, e-commerce sites, educational institutions, and personal websites. The specification is deliberately simple — a Markdown file with minimal conventions — making it easy to adopt without specialized tooling.
