GeoStack

llms.txt

What is llms.txt?

llms.txt is a proposed web standard, introduced by Jeremy Howard of Answer.AI in September 2024, that provides a standardized way for websites to offer LLM-friendly content. The core idea: websites place a /llms.txt markdown file at their root, containing a curated overview of the site's content — brief background information, guidance for AI interpretation, and links to detailed markdown versions of key pages.

The standard addresses a fundamental problem: LLMs have limited context windows and struggle to parse complex HTML with navigation, ads, and JavaScript. llms.txt gives them a clean, structured entry point to understand a website's content.

The Problem llms.txt Solves

  • Context window limitations: LLMs cannot ingest entire websites — llms.txt provides a curated summary
  • HTML complexity: Converting HTML with navigation, ads, and scripts to LLM-friendly text is error-prone
  • Content discovery: LLMs don't know which pages are most important — llms.txt guides them
  • Site understanding: A brief project description helps LLMs contextualize the content they consume
  • No existing standard: robots.txt controls crawling; sitemap.xml lists pages; neither is designed for LLM inference-time use

llms.txt Format Specification

A file following the spec contains these sections (in order):

  1. H1 title: Name of the project or site (required)
  2. Blockquote: Short summary with the key information needed to understand the rest of the file (optional)
  3. Additional markdown: More detailed information about the project (optional, any markdown except headings)
  4. H2 file lists: Zero or more sections, each an H2 heading followed by a list of links of the form [name](url), optionally followed by a colon and a description of the linked content
  5. Optional section: An H2 titled "Optional" whose URLs can be skipped when a shorter context is needed

Basic example:

# Your Site Name

> A brief description of what this site is about and key information for understanding the content.

Additional details about the site, its purpose, and how to interpret its content.

## Docs

- [Getting Started Guide](https://yoursite.com/docs/start.html.md): Introduction and setup instructions
- [API Reference](https://yoursite.com/docs/api.html.md): Complete API documentation
- [FAQ](https://yoursite.com/faq.html.md): Frequently asked questions

## Blog

- [Key Article 1](https://yoursite.com/blog/article1.html.md): Description of article
- [Key Article 2](https://yoursite.com/blog/article2.html.md): Description of article

## Optional

- [Archive](https://yoursite.com/archive.html.md): Older content that may not be essential
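
The structure above is regular enough to read with a few lines of code. The following is a minimal sketch in Python, not the reference implementation and not how llms_txt2ctx works internally; the names LlmsTxt, parse_llms_txt, and urls_for_context are hypothetical, and the parser assumes each file-list entry sits on a single line.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](url)" with an optional ": notes" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<notes>.*))?$"
)


@dataclass
class LlmsTxt:
    """Hypothetical container for the parts of an llms.txt file."""
    title: str = ""
    summary: str = ""
    details: list[str] = field(default_factory=list)
    sections: dict[str, list[dict]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Split llms.txt text into title, summary, details, and H2 file lists."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif current is None and line.startswith(">"):
            # Blockquote lines before the first H2 form the summary.
            doc.summary = (doc.summary + " " + line[1:].strip()).strip()
        elif current is None:
            doc.details.append(line)
        else:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append({
                    "title": match["title"],
                    "url": match["url"],
                    "notes": match["notes"] or "",
                })
    return doc


def urls_for_context(doc: LlmsTxt, include_optional: bool = False) -> list[str]:
    """List linked URLs in file order, skipping "Optional" unless requested."""
    return [
        link["url"]
        for name, links in doc.sections.items()
        if include_optional or name != "Optional"
        for link in links
    ]
```

Calling urls_for_context(parse_llms_txt(text)) on the example above would return the Docs and Blog links and leave out the Archive link; passing include_optional=True would add it back when the context budget allows.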

.md Page Extensions

The standard also proposes that pages with information useful for LLMs provide a clean markdown version at the same URL with .md appended:

  • https://yoursite.com/page → https://yoursite.com/page.md
  • https://yoursite.com/page.html → https://yoursite.com/page.html.md
  • https://yoursite.com/dir/ → https://yoursite.com/dir/index.html.md
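
The mapping from a page URL to its markdown counterpart can be sketched as a small Python function. The name markdown_url is hypothetical, and two behaviors are assumptions rather than part of the proposal as described here: a bare domain is treated like a directory URL, and any query string or fragment is carried over unchanged.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the markdown URL proposed for a given page URL."""
    parts = urlsplit(page_url)
    path = parts.path or "/"  # assumption: a bare domain behaves like "/"
    if path.endswith("/"):
        # Directory-style URLs have no file name, so use index.html.md.
        path += "index.html.md"
    else:
        # Everything else simply gets ".md" appended.
        path += ".md"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )
```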

llms.txt vs robots.txt vs sitemap.xml

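Standard    | Purpose                           | Consumer           | Format
------------|-----------------------------------|--------------------|-----------------
robots.txt  | Control crawling access           | Search engine bots | Plain text rules
sitemap.xml | List all indexable URLs           | Search engine bots | XML
llms.txt    | Curated content for LLM inference | LLMs and AI agents | Markdown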

Key differences: sitemap.xml lists everything (too large for LLM context), robots.txt controls access (not content), and llms.txt provides a curated entry point with context.

Why llms.txt Matters for GEO

Implementing llms.txt directly supports generative engine optimization (GEO) goals:

  • Improved citation accuracy: LLMs that are given a clear picture of your content structure are better placed to cite it accurately
  • Content prioritization: You control which content AI engines see as most important
  • Reduced hallucinations: Curated markdown leaves less room for AI misinterpretation than raw HTML
  • Brand context: The site summary helps AI engines understand what your brand does
  • Content freshness: Link to your most current content to signal relevance

Implementation Tools

  • llms_txt2ctx CLI: Python tool for parsing llms.txt and generating LLM context files
  • VitePress plugin: Auto-generates llms.txt for VitePress documentation sites
  • Docusaurus plugin: Auto-generates llms.txt for Docusaurus documentation sites
  • Drupal Recipe: Full llms.txt support for Drupal 10.3+ sites
  • llms-txt-php library: Programmatic llms.txt creation and parsing in PHP
  • VS Code PagePilot: Extension that loads llms.txt context into VS Code Chat
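
For sites not covered by the tools above, an llms.txt file can also be assembled by hand or with a short script. The following is a minimal sketch in Python under stated assumptions: render_llms_txt is a hypothetical helper, not part of any tool listed here, and it expects sections as a mapping from H2 name to (title, url, notes) tuples.

```python
def render_llms_txt(title, summary, sections, details=""):
    """Render llms.txt text: H1, blockquote, optional details, H2 file lists.

    sections maps an H2 name to a list of (link_title, url, notes) tuples;
    notes may be an empty string when no description is needed.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    if details:
        lines += [details, ""]
    for name, links in sections.items():
        lines += [f"## {name}", ""]
        for link_title, url, notes in links:
            entry = f"- [{link_title}]({url})"
            lines.append(f"{entry}: {notes}" if notes else entry)
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Your Site Name",
        "A brief description of what this site is about.",
        {
            "Docs": [
                ("FAQ", "https://yoursite.com/faq.html.md",
                 "Frequently asked questions"),
            ],
            "Optional": [
                ("Archive", "https://yoursite.com/archive.html.md",
                 "Older content that may not be essential"),
            ],
        },
    ))
```

Because Python dictionaries preserve insertion order, placing the "Optional" key last keeps that section at the end of the file, matching the section order in the format specification.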

Directories of existing llms.txt files: llmstxt.site and directory.llmstxt.cloud

Last updated: June 25, 2026