llms.txt is a proposed open standard for helping large language models (LLMs) better understand, navigate, and cite content from websites.
Placed at the root of a website at /llms.txt, the file acts as a curated, AI-friendly guide to a site’s most important content—written in plain Markdown so both machines and humans can read it easily.
Think of it as the modern evolution of robots.txt, but instead of telling crawlers what to avoid, it tells AI systems where to find your best content.
Historical Origins
The llms.txt standard was first proposed on September 3, 2024, by Jeremy Howard, an Australian technologist, co-founder of fast.ai, and founder of Answer.AI.
Howard originally published the proposal as a blog post at answer.ai, framing it not as a finalized standard but as a starting point for community discussion and experimentation.
His motivation was practical: as LLMs increasingly rely on website content to assist users in real time—during inference, not just during training—the process of assembling relevant context from a site was ambiguous and inefficient.
Do you crawl the entire sitemap? Include external links? Include source code?
Howard’s answer was simple: let the site author decide, and give them a standardized file format to communicate that decision.
The Problem llms.txt Solves
Large language models face a fundamental technical constraint: context windows are too small to handle most websites in their entirety. Converting complex HTML pages filled with navigation menus, advertisements, JavaScript, and boilerplate into clean, usable text is both difficult and imprecise.
The result is that AI tools often pull from whatever they can parse fastest—which may include outdated pages, duplicate content, or low-signal sources rather than your most authoritative, carefully crafted content.
Without a guiding file, LLMs are essentially navigating your website blindfolded, guessing at what matters most.
llms.txt solves this by giving site owners a standardized way to say:
“Here are the pages that matter. Here’s what my site is about. Here’s how to understand it.”
Modern Definition
At its core, llms.txt is a plain text file written in Markdown, placed at the root directory of a website (yourdomain.com/llms.txt).
It provides:
- A concise description of the website or project
- Curated links to the most important and LLM-readable pages
- Optional contextual notes explaining what each linked resource covers
- Guidance on how to interpret the site’s content and structure
The file is specifically designed for inference-time use—meaning it helps LLMs give better answers to users right now, rather than being primarily aimed at training future models.
How llms.txt Differs from robots.txt and sitemap.xml
Many people initially compare llms.txt to existing web standards, but the differences are significant.
| File | Purpose | Format | Audience | Content |
|---|---|---|---|---|
| robots.txt | Control crawler access | Custom syntax | Search engine bots | Allow/disallow rules |
| sitemap.xml | List all URLs for indexing | XML | Search engine bots | Full URL list with metadata |
| llms.txt | Curate key content for LLMs | Markdown | LLMs and AI agents | Structured links with descriptions |
Unlike robots.txt, llms.txt contains no blocking or disallow directives—it is purely a positive, affirmative guide to your best content.
Unlike sitemap.xml, which lists every indexable page, llms.txt is a curated subset of the most important content, designed to fit within an LLM’s context window.
The File Format Explained
The llms.txt specification uses Markdown because it is the format most widely and easily understood by language models, while also remaining human-readable and parseable by standard programming tools.
Required and Optional Sections
A valid llms.txt file follows a specific structure in this order:
- An H1 heading — The name of the project or site (only required element)
- A blockquote — A short summary containing key information necessary for understanding the file
- Body sections — Paragraphs or lists with more detailed background information
- H2-delimited file lists — Sections containing curated URLs with optional descriptions
- An “Optional” section — Links that can be skipped if a shorter context is needed
Example llms.txt File
Here is a simplified example based on the official specification:
# My Company Name
> We build project management tools for remote teams.
Our platform integrates with Slack, Notion, and GitHub.
## Docs
- [Getting Started](https://example.com/docs/start.md): Setup guide for new users
- [API Reference](https://example.com/docs/api.md): Full API documentation
## About
- [Company Overview](https://example.com/about.md): Mission and team
## Optional
- [Case Studies](https://example.com/case-studies.md): Customer success stories
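Because the format is just structured Markdown, it is straightforward to parse programmatically. The sketch below (an illustration only, not the official llms_txt2ctx implementation) extracts the H1 title, blockquote summary, and H2 link sections from an llms.txt string like the example above:

```python
import re

def parse_llms_txt(text):
    """Parse an llms.txt document into its title, summary, and link sections."""
    title_m = re.search(r"^# (.+)$", text, re.MULTILINE)
    summary_m = re.search(r"^> (.+)$", text, re.MULTILINE)
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            # Each H2 heading opens a new file-list section.
            current = line[3:].strip()
            sections[current] = []
        elif current and line.startswith("- ["):
            # Markdown link items: "- [title](url): optional description"
            m = re.match(r"- \[(.+?)\]\((.+?)\)(?::\s*(.*))?$", line)
            if m:
                sections[current].append(
                    {"title": m.group(1), "url": m.group(2), "desc": m.group(3) or ""}
                )
    return {
        "title": title_m.group(1) if title_m else None,
        "summary": summary_m.group(1) if summary_m else None,
        "sections": sections,
    }

example = """\
# My Company Name
> We build project management tools for remote teams.

## Docs
- [Getting Started](https://example.com/docs/start.md): Setup guide for new users

## Optional
- [Case Studies](https://example.com/case-studies.md): Customer success stories
"""

parsed = parse_llms_txt(example)
```

An LLM tool could use the parsed result to fetch only the "Docs" links when context is tight, dropping the "Optional" section entirely.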
The .md Companion Convention
The llms.txt proposal also includes a companion convention: making clean Markdown versions of web pages available at the same URL as the original page, but with .md appended.
For example, yoursite.com/blog/post would have a corresponding yoursite.com/blog/post.md that strips away HTML, navigation, and other non-content elements to deliver clean, LLM-digestible text.
llms-full.txt
A common variation is llms-full.txt, which expands the llms.txt index into a single large file containing the complete flattened text of the entire website rather than just links and descriptions.
This approach trades compactness for completeness, and some sites use both files simultaneously—the standard llms.txt for quick context assembly and the full version for deep site analysis.
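One way to picture the relationship between the two files: llms-full.txt is roughly what you get by fetching every page linked from llms.txt and concatenating the results under their section headings. A sketch, with the fetch function injected so any HTTP client (or a test stub) can be plugged in:

```python
def build_llms_full(sections, fetch):
    """Flatten an llms.txt index into a single llms-full.txt string.

    `sections` maps section names to lists of (title, url) pairs; `fetch` is
    any callable returning page text for a URL. A sketch, not official tooling.
    """
    parts = []
    for name, links in sections.items():
        parts.append(f"## {name}")
        for title, url in links:
            parts.append(f"### {title}")
            parts.append(fetch(url).strip())
    return "\n\n".join(parts) + "\n"

# Demo with a stand-in fetcher; in practice fetch would wrap an HTTP client.
full_text = build_llms_full(
    {"Docs": [("Getting Started", "https://example.com/docs/start.md")]},
    fetch=lambda url: f"(contents of {url})",
)
```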
Relationship to Generative Engine Optimization (GEO)
As AI-powered search engines like Perplexity, ChatGPT Search, and Google’s AI Overviews become primary discovery channels, a new discipline called Generative Engine Optimization (GEO) has emerged alongside traditional SEO.
While SEO focuses on ranking in search results, GEO focuses on being the source that AI systems cite when generating answers. llms.txt directly supports GEO strategy by:
- Clarifying canonical sources: Specifying which URLs represent your definitive content, reducing the risk of AI citing outdated or duplicate pages
- Prioritizing high-quality content: Ensuring thought leadership, case studies, and cornerstone content are visible to AI systems
- Protecting brand narrative: Directing LLMs to preferred content to reduce inaccurate or generic AI-generated descriptions of your business
- Supporting AI search indexes: Helping your content surface in Perplexity, ChatGPT, and other AI-driven search layers
Who Is Using llms.txt?
Adoption has grown rapidly since the proposal’s release in September 2024, though it remains concentrated in the developer and technology community.
Companies like Perplexity, Anthropic, and various developer documentation platforms have created llms.txt files for their own documentation and internal use.
As of June 2025, a scan of the top 1,000 most visited global websites showed approximately 0.3% adoption (3 out of 1,000 sites), suggesting the standard is still in early-adopter territory among mainstream web properties.
However, adoption among developer tools, SaaS documentation, and AI-adjacent companies is significantly higher.
Notable early adopters include:
- Anthropic: Uses llms.txt in internal documentation for agent-building
- FastHTML and nbdev projects: All fast.ai and Answer.AI software projects using nbdev automatically generate .md versions of all pages
- Perplexity: Has developed llms.txt files for its own documentation
- Mintlify, GitBook, and other documentation platforms: Have built native llms.txt generation into their platforms
Current Debate and Honest Limitations
It is important to approach llms.txt with measured expectations.
The standard remains a proposal, not an adopted protocol, and there are legitimate questions about its real-world impact.
Key Criticisms
LLMs may not actually read it: Critics point out that there is limited verified evidence that major AI systems—including ChatGPT, Claude, and Gemini—actively read and prioritize llms.txt files during inference. Google’s John Mueller has stated he doesn’t know of any search systems that use the file.
No enforcement mechanism: Like robots.txt, llms.txt can be obeyed or ignored by any AI agent—there is no technical enforcement. Its effectiveness depends entirely on voluntary adoption by LLM providers.
Not a ranking signal: llms.txt does not guarantee that a brand will appear in AI-generated answers. AI search does not operate based on a single file.
The illusory truth effect: The standard has spread rapidly through SEO and marketing communities, and some argue that this repetition has built belief in its effectiveness faster than actual evidence of that effectiveness has accumulated.
The Measured Case for Adopting It Anyway
Despite these limitations, there are practical reasons to implement llms.txt:
- Google crawls it: Google has been observed crawling llms.txt files weekly, and in December 2025 stated that adding one would not harm a website.
- Legitimate AI crawlers check for it: Server logs show GPTBot, ClaudeBot, and PerplexityBot do access the file.
- Future-proofing: As AI search evolves, having a well-structured llms.txt positions you ahead of adoption curves.
- Content audit value: The process of creating llms.txt forces a useful exercise in identifying your most important content.
- Low implementation cost: Creating a basic llms.txt file takes minutes and costs nothing.
How to Create an llms.txt File
Step 1: Audit Your Most Important Content
Identify the pages that best represent your brand, products, services, and expertise. Focus on quality over quantity—the goal is a curated guide, not an exhaustive sitemap.
Step 2: Write Clear, Neutral Descriptions
LLMs perform best with content that clearly defines terms, avoids emotional language, and does not rely on context-free marketing claims. Instead of “an innovative, groundbreaking platform,” use “an analytics platform for monitoring and analyzing user behavior.”
Step 3: Structure the File in Markdown
Create your H1 title, add a blockquote summary, and build your H2 sections with linked file lists. Keep descriptions concise and informative.
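Once Steps 1 and 2 have produced a curated structure, Step 3 can be automated. A minimal sketch that renders a title, summary, and sections mapping into the Markdown layout described above (the names and URLs are placeholders, not real endpoints):

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt document: H1 title, blockquote summary, H2 link lists.

    `sections` maps H2 names to lists of (link_title, url, description) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for name, links in sections.items():
        lines.append(f"## {name}")
        for link_title, url, desc in links:
            lines.append(f"- [{link_title}]({url}): {desc}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)

doc = render_llms_txt(
    "My Company Name",
    "We build project management tools for remote teams.",
    {"Docs": [("Getting Started", "https://example.com/docs/start.md",
               "Setup guide for new users")]},
)
```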
Step 4: Place It at Your Domain Root
Host the completed file at yourdomain.com/llms.txt where AI crawlers expect to find it.
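After deploying, it is worth confirming the file is actually reachable at the root and structurally plausible. A quick sketch using only the standard library; the structural check is deliberately cheap, since the only element the spec requires is the H1:

```python
import urllib.request

def fetch_llms_txt(domain, timeout=10):
    """Fetch /llms.txt from a domain root; returns the text, or None on failure."""
    url = f"https://{domain}/llms.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except OSError:  # covers HTTPError, URLError, timeouts
        return None

def looks_valid(text):
    """Cheap structural check: the first non-blank line should be an H1."""
    lines = [ln for ln in (text or "").splitlines() if ln.strip()]
    return bool(lines) and lines[0].startswith("# ")
```

For example, `looks_valid(fetch_llms_txt("yourdomain.com"))` should return True for a correctly deployed file.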
Step 5: Optionally Create .md Page Versions
For maximum AI readability, create Markdown versions of your key pages by appending .md to their URLs.
Step 6: Test and Monitor
Use a tool like llms_txt2ctx to expand your file into a full LLM context file and test whether AI systems can accurately answer questions about your content. Monitor server logs for bot access from GPTBot, ClaudeBot, and PerplexityBot.
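Monitoring can be as simple as scanning access logs for those crawler names. The sketch below assumes the common Apache/nginx "combined" log format and uses plain substring matching, which is enough for a first look; the sample traffic is hypothetical:

```python
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

def count_ai_bot_hits(log_lines, path="/llms.txt"):
    """Count requests for `path` by each AI crawler across access-log lines."""
    counts = {bot: 0 for bot in AI_BOTS}
    for line in log_lines:
        if path not in line:
            continue  # only count hits on the llms.txt file itself
        for bot in AI_BOTS:
            if bot in line:
                counts[bot] += 1
    return counts

# Two sample lines in combined log format (hypothetical traffic).
sample_lines = [
    '1.2.3.4 - - [10/Jun/2025:12:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '5.6.7.8 - - [10/Jun/2025:12:01:00 +0000] "GET /about HTTP/1.1" 200 1024 '
    '"-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
]
hits = count_ai_bot_hits(sample_lines)
```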
Tools and Plugins for llms.txt
A growing ecosystem of tools supports llms.txt creation and management:
- llms_txt2ctx: Official CLI and Python module for parsing llms.txt files and generating expanded LLM context
- vitepress-plugin-llms: VitePress plugin that auto-generates LLM-friendly documentation
- docusaurus-plugin-llms: Docusaurus plugin for LLM-friendly docs following the llms.txt standard
- llms-txt-php: A PHP library for reading and writing llms.txt files
- Drupal LLM Support: A Drupal Recipe providing full llms.txt support for Drupal 10.3+ sites
- GitBook: Native llms.txt generation built into the documentation platform
- VS Code PagePilot Extension: Automatically loads external context from llms.txt files for enhanced responses
Use Cases by Industry
Software and Developer Documentation
The most natural fit for llms.txt—developers often use AI assistants while coding and need accurate, up-to-date library documentation. llms.txt ensures AI coding tools reference current API documentation rather than outdated information.
E-commerce
llms.txt can outline product categories, return policies, shipping information, and FAQ content—ensuring AI assistants give accurate answers when customers ask questions about a store.
Professional Services and B2B
Agencies, consultancies, and SaaS companies can use llms.txt to ensure AI systems accurately describe their services, expertise, and differentiators.
Publishing and Media
Content publishers can curate their most authoritative editorial content, helping AI systems cite original reporting rather than aggregated or republished versions.
Personal and Portfolio Sites
Individuals can use llms.txt to help AI systems accurately answer questions about their background, work, and expertise.
Future Outlook
The llms.txt standard sits at the intersection of several major trends reshaping the web: the rise of AI-powered search, the shift from click-based to answer-based information retrieval, and the growing importance of structured content for machine comprehension.
As AI search adoption accelerates—with OpenAI reporting roughly 700 million weekly active users and Google’s Gemini reaching 400 million monthly active users—the strategic importance of being accurately represented in AI-generated answers will only increase. Whether llms.txt becomes the definitive standard for AI content curation or is superseded by a more formalized protocol, the underlying principle—giving site owners a voice in how AI understands their content—is likely here to stay.
llms.txt represents a practical, low-cost step that website owners can take today to improve how AI systems understand and represent their content. While it is not a magic bullet for AI search visibility and its adoption by major LLMs remains inconsistent, the combination of minimal implementation cost, growing crawler interest, and the strategic importance of AI content optimization makes it a worthwhile addition to any modern web content strategy.
For web developers, marketers, and content strategists navigating the shift from traditional SEO to Generative Engine Optimization, llms.txt is less about immediate, measurable impact and more about positioning—ensuring your most important content is clearly organized, accurately described, and ready for the AI-driven web that is already here.