Fundamentals Jun 7, 2026 5 min read

What is llms.txt? The new AI-search standard, explained.

A plain-English guide to llms.txt — what it is, what's inside it, where it goes, and whether AI models actually read it. Plus the cleanest reason to ship one anyway.

llms.txt is a proposed standard for telling large language models (ChatGPT, Claude, Gemini, Perplexity) what your site is about and which pages are worth reading first. You publish a single markdown file at the root of your domain. The model finds it. The model uses it. In theory.

In practice, almost nobody has proven it works. Google says it’s unnecessary. Most AI platforms have not confirmed they fetch it. The skeptics call it a solution in search of a problem.

So why is every serious AI-search guide telling you to ship one? Because the file is cheap, the upside is uncapped, and the smart marketers I trust are quietly publishing theirs anyway. Here’s the full picture.

What is llms.txt?

The llms.txt standard was proposed by Jeremy Howard (co-founder of Answer.AI and fast.ai) in September 2024. The pitch is straightforward: large language models have small, expensive context windows. When a model wants to answer a question about your brand, it can’t crawl your whole sitemap. It needs a curated index — a one-page briefing — of the content that matters.

That briefing is llms.txt. A plain markdown file at yoursite.com/llms.txt. Conceptually:

  • robots.txt tells crawlers what they may access.
  • sitemap.xml tells crawlers what exists.
  • llms.txt tells AI models what’s worth reading.

It’s not a permission system. It’s a recommendation system. You curate; the model decides.

There’s also a fuller companion file, llms-full.txt, which inlines the content of the linked pages into a single document so a model fetching one URL gets the whole context. Anthropic publishes one at docs.anthropic.com/llms-full.txt, a useful reference for what a serious implementation looks like.

What’s inside the file

The format is opinionated but minimal:

# BrandAxis

> BrandAxis tracks how brands appear in AI-generated answers across ChatGPT,
> Google AI Overviews, Perplexity, Claude, Gemini, and Grok.

## Docs
- [Quick start](https://brandaxis.ai/docs/quick-start/): Set up your first workspace.
- [Glossary](https://brandaxis.ai/docs/glossary/): Core GEO terms.

## Pricing
- [Plans](https://brandaxis.ai/pricing/): Tiered plans from Free to Enterprise.

## Policies
- [Privacy](https://brandaxis.ai/privacy/): How we handle your data.

The structure:

  • H1: your site or organisation name. This is the only required element.
  • Blockquote: a short description of what you do, written for a model that has never heard of you.
  • H2 sections: logical groupings such as Docs, Pricing, Policies, About, API.
  • Bullets: each is a markdown link, optionally followed by a colon and a short description of what’s there.

Keep descriptions tight. Models use them to decide which links to follow.
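To make the structure concrete, here is a minimal sketch of a parser for the layout described above. The names parse_llms_txt and LlmsTxt are hypothetical, invented for this illustration; they are not part of the proposal or of any library, and the sketch only handles the H1, blockquote, H2, and link-bullet elements.

```python
import re
from dataclasses import dataclass, field

# Hypothetical helper: matches "- [Title](url): optional note".
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # Section heading -> list of (link title, url, note) tuples.
    sections: dict[str, list[tuple[str, str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1, blockquote, H2 sections, and link bullets."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    summary_lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first H2 form the summary.
            summary_lines.append(line.lstrip(">").strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    doc.summary = " ".join(summary_lines)
    return doc
```

Running it over your own file is a quick way to spot a bullet that is missing its link or a description that never made it in.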

Where it goes — and who’s actually reading it

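Place the file at the root of the host it describes: https://yoursite.com/llms.txt, the same convention as robots.txt. A docs subdomain counts as a separate host and needs its own file, which is why Anthropic’s llms-full.txt lives on docs.anthropic.com. The proposal also permits a subpath such as /docs/llms.txt, but the root is the safe default.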
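As a rough sketch of the root-path convention, the helper below derives the expected llms.txt address from any page URL on a host and checks whether something is served there. Both function names are hypothetical, and is_published makes a live HTTP request, so treat it as a starting point rather than a finished checker.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen


def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt URL for the host of page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def is_published(page_url: str, timeout: float = 5.0) -> bool:
    """Hypothetical check: True if the host answers 200 for /llms.txt."""
    request = Request(llms_txt_url(page_url), method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors (404 and friends), DNS failures, and timeouts.
        return False
```

For example, llms_txt_url("https://yoursite.com/docs/pricing?ref=nav") resolves to https://yoursite.com/llms.txt. Some servers reject HEAD requests, so a real audit may need to fall back to GET.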

Who’s shipping one today? A growing list of AI-adjacent companies — Anthropic, Cloudflare, FastHTML, Tinybird, plus every documentation site hosted on Mintlify (which auto-generates the file). The spec is gathering momentum among technical brands.

Who’s confirmed they read it? None of the major LLM providers, publicly. OpenAI hasn’t said. Anthropic hasn’t said. Google has explicitly said the opposite — that the file is unnecessary for inclusion in their generative AI search results. Which brings us to the awkward question.

Does it actually work?

Honest answer: nobody has proven it does.

The strongest evidence we have is server logs. Olivya Pastis at Seer Interactive ran log audits and found that AI crawlers weren’t fetching her clients’ llms.txt files — backing up Google’s stated position. Ahrefs went further, calling the standard “a solution in search of a problem” and drawing the obvious historical parallel: meta keywords, a metadata standard the industry adopted, nobody used, and which quietly died.
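You can run the same kind of log audit on your own site. The sketch below is not Seer Interactive’s method; it is a minimal illustration that assumes access logs in the common "combined" format and a hand-picked list of user-agent substrings (GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, PerplexityBot). Check each vendor’s crawler documentation before relying on that list.

```python
import re
from collections import Counter

# Assumption: substrings that identify AI crawlers in the User-Agent header.
AI_CRAWLER_TOKENS = (
    "GPTBot",
    "OAI-SearchBot",
    "ChatGPT-User",
    "ClaudeBot",
    "PerplexityBot",
)

# Assumption: combined log format, e.g.
# 203.0.113.7 - - [date] "GET /path HTTP/1.1" 200 512 "referer" "user agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_llms_txt_fetches(lines):
    """Count requests for /llms.txt per AI crawler token."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        path = match["path"].split("?", 1)[0]
        if path != "/llms.txt":
            continue
        agent = match["agent"].lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in agent:
                hits[token] += 1
                break
    return hits
```

Feed it an open log file, for instance count_llms_txt_fetches(open("access.log")), where access.log is a placeholder path. An empty result over a few weeks of traffic is the same negative finding the audits above reported.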

So that’s the skeptical case. No public study shows lift. No major platform confirms uptake. The smart bet is that the file does nothing.

Except for one inconvenient detail: llms.txt is referenced inside Google’s own Lighthouse audit documentation. The same company telling you the file is unnecessary is shipping reference docs that point developers toward it. That’s two messages from the same source — and a clue that the position may not stay where it is.

So should you ship one? Pascal’s Wager.

This is where the file gets interesting strategically. Pascal’s seventeenth-century argument for belief in God reduces to a payoff matrix: the cost of believing is finite, the upside if you’re right is infinite, so the expected value points one way for any probability above zero.

llms.txt has the same shape:

  • Skip it, does nothing → no harm done.
  • Skip it, ends up mattering → competitors who shipped one eat your visibility on the day the switch flips.
  • Ship it, does nothing → you lost ten minutes.
  • Ship it, ends up mattering → free win. You were early.

The cost is bounded: one markdown file, a quick refresh when you ship a meaningful new page, and no surface area to attack. The upside, however unlikely, is uncapped. Pascal would ship the file.
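The four outcomes above can be turned into a small expected-value calculation. This is a sketch with made-up units: cost, win, and loss are hypothetical numbers you would have to estimate yourself, and the function names are invented for the illustration.

```python
def expected_values(p_matters: float, cost: float, win: float, loss: float):
    """Return (EV of shipping, EV of skipping) for the four-outcome matrix.

    p_matters: probability that llms.txt ends up mattering (0..1)
    cost:      what shipping the file costs you, paid either way
    win:       gain if you shipped and it matters
    loss:      visibility lost if you skipped and it matters
    """
    ev_ship = p_matters * win - cost
    ev_skip = -p_matters * loss
    return ev_ship, ev_skip


def break_even_probability(cost: float, win: float, loss: float) -> float:
    """Probability above which shipping beats skipping."""
    return cost / (win + loss)


# Hypothetical numbers: ten minutes of effort against a large swing.
ship, skip = expected_values(p_matters=0.05, cost=10, win=500, loss=500)
threshold = break_even_probability(cost=10, win=500, loss=500)
```

With these invented figures, shipping wins once the chance of the file mattering passes roughly 1%. The larger the potential swing relative to the cost, the lower that threshold falls, which is the wager in numeric form.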

What to put in your llms.txt

Treat it like a one-page briefing for a model that has never heard of you. Be specific, be canonical, skip the marketing copy.

What belongs in:

  • A one-line description, written for a model. Name your category and your unit of value. (“BrandAxis tracks how brands appear in AI-generated answers” beats “the future of brand intelligence”.)
  • Your top 5–10 canonical pages. Pricing, comparison pages, integration docs, your “what is X” explainers, your /about, your changelog. The pages you’d send a journalist on deadline.
  • Policies and trust signals. Privacy, terms, security posture — the pages a careful model would want to ground a recommendation in.
  • Documentation entry points. Quick start, glossary, API overview. The high-value paths into your knowledge base.

What to leave off: stale blog archives, gated PDFs, legacy redirects, landing pages with tracking parameters, anything you’d be embarrassed to see a model cite verbatim.
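One of those exclusions is easy to automate. The sketch below drops candidate URLs that carry tracking parameters; the parameter list (utm_ prefixes, gclid, fbclid) is an assumption covering common cases, and both function names are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumption: common tracking parameters; extend for your own stack.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"gclid", "fbclid"}


def has_tracking_params(url: str) -> bool:
    """True if the URL's query string contains a known tracking key."""
    query = urlsplit(url).query
    for key, _value in parse_qsl(query, keep_blank_values=True):
        lowered = key.lower()
        if lowered.startswith(TRACKING_PREFIXES) or lowered in TRACKING_KEYS:
            return True
    return False


def keep_for_llms_txt(urls):
    """Return only the URLs that are free of tracking parameters."""
    return [url for url in urls if not has_tracking_params(url)]
```

The other exclusions (stale archives, gated PDFs, legacy redirects) need editorial judgment or a crawl, so they are left out of the sketch.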

Refresh the file when you ship a meaningful new page. That’s it.
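If you would rather regenerate the file than hand-edit it, a small renderer keeps the refresh to one command. This is a minimal sketch: render_llms_txt is a hypothetical helper, and the input shape (a dict of section names to link tuples) is an assumption made for the example.

```python
def render_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Render an llms.txt document.

    sections maps an H2 heading to a list of (name, url, note) tuples;
    an empty note produces a bare link bullet.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for name, url, note in links:
            bullet = f"- [{name}]({url})"
            if note:
                bullet += f": {note}"
            lines.append(bullet)
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"
```

Keep the section data in version control next to your site config, and write the output to llms.txt at the web root as part of your deploy.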

Where llms.txt fits in your GEO priority stack

It doesn’t sit near the top. The work that actually moves AI visibility — that I’ve written about in the complete GEO playbook — is clean entity hygiene, third-party citations on the sources models retrieve from, and content that’s structured so models can quote it cleanly. If you don’t know what GEO is yet, start there.

But once those foundations are in place? Ship the file. It costs nothing, and it might one day matter.


Want to know whether the work you’re doing is actually moving the needle in AI answers? Try BrandAxis free in early access — we track how your brand shows up across ChatGPT, Google AI Overviews, Perplexity, Claude, Gemini, and Grok, every day.
