Discovery · 5 min read

llms.txt: What Ecommerce Should Know

In September 2024, Answer.AI's Jeremy Howard proposed llms.txt — a standard for structuring web content so AI systems can read it more effectively. The technical case is interesting. Whether it matters for ecommerce is a more honest question.

Topics

structured-data ai-search strategy uk-retail

Sarah Chen

Senior Editor

14 October 2024

There is a particular kind of technology proposal that arrives with a tidy analogy, a working demo, and a genuine problem to solve, and then sits in the "promising but unproven" category for a year or two before you find out which way it went. llms.txt is one of those proposals.

In September 2024, Jeremy Howard of Answer.AI proposed a standard for how websites should make their content useful to large language models. The analogy is explicit: a plain-text file at the root of your domain, telling AI systems what your site is about and where to find the important content. Robots.txt, but for AI.

The technical case for it is sound. Whether it will matter in practice is more complicated.

What the proposal is

The problem llms.txt addresses is real. Modern websites are not built for machine consumption. They are built for humans, with navigation bars, JavaScript-rendered content, cookie consent overlays, and all the other furniture of commercial web design. AI systems that crawl these pages to gather training data or answer real-time queries are wading through a great deal of noise to find the signal.

llms.txt proposes to put that signal in one place: clean markdown, structured links to key content, a clear hierarchy of what matters. An AI system that supports the standard can request /llms.txt, get a structured digest of your site, and use that for context instead of parsing rendered HTML.

The format is straightforward: an H1 with the site name, a blockquote description, then H2 sections that link to key pages, each link with a short description. There is also an llms-full.txt variant that includes full content rather than links. The specification names ecommerce sites explicitly as a use case, covering product information, policies, FAQs, and brand context: everything an AI assistant would need to accurately answer questions about your products.
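To make that layout concrete, here is a minimal Python sketch that renders a file in the shape described above. The shop name, URLs, and the helper names (Link, Section, render_llms_txt) are hypothetical, invented for illustration, and the output should be checked against the specification at llmstxt.org before anything is published.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    description: str


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site_name: str, summary: str, sections: list[Section]) -> str:
    """Render an H1 site name, a blockquote summary, then H2 sections of links."""
    lines = [f"# {site_name}", "", f"> {summary}"]
    for section in sections:
        lines += ["", f"## {section.heading}", ""]
        for link in section.links:
            # One markdown list item per key page, with a short description.
            lines.append(f"- [{link.title}]({link.url}): {link.description}")
    return "\n".join(lines) + "\n"


# Hypothetical shop: every name and URL below is made up for illustration.
example = render_llms_txt(
    "Example Outdoor Co",
    "UK retailer of hiking and camping equipment.",
    [
        Section("Products", [
            Link("Tents", "https://shop.example/tents.md",
                 "Range overview with specifications"),
        ]),
        Section("Policies", [
            Link("Returns", "https://shop.example/returns.md",
                 "Returns window and process"),
        ]),
    ],
)
print(example)
```

Keeping the content in structured objects rather than hand-editing the file means the same data could be regenerated whenever the catalogue or policies change.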

It is worth being clear about what this is, structurally. llms.txt is not a W3C standard or a formally ratified protocol. It is a single-party convention proposed by one person at one organisation. That does not disqualify it, but it sets realistic expectations about how adoption will play out. Robots.txt started the same way, as an informal convention in 1994, and took 28 years to become RFC 9309. The informal start is not fatal. It is simply how these things begin.

The adoption picture

Early adopters are developer tooling companies and documentation platforms: Anthropic (Claude's API documentation), Cloudflare, Stripe, Vercel. The pattern is telling. The companies that moved fastest are those whose users rely heavily on AI coding assistants, where there is an immediate concrete benefit to giving the assistant clean structured context. A developer asking an AI to help integrate Stripe payments benefits directly from Stripe's llms.txt being well-structured.

Ecommerce has not been an early mover. That is not surprising. The ecommerce use case is different: less about helping developers use your API, more about helping AI assistants answer shopper questions accurately. That use case is real, but it sits further from the incentive that drove early technical adoption.

The honest position as of late 2024 is this: the files are being written, but no major AI platform has publicly confirmed that it reads them. This is the classic standards problem. A convention only has value when both sides implement it, and so far only the publishing side visibly has.

Whether that changes depends on decisions at those platforms, and no such decision has been announced yet.

Should ecommerce bother?

Probably yes, but with calibrated expectations rather than urgency.

The case for implementing it is not that it will demonstrably improve your AI visibility right now. There is no confirmed evidence for that claim in late 2024. The case is more conditional.

Implementation is low-cost. A well-structured llms.txt is a few hours of development work, or less if you are already on a platform that generates one. The cost of being wrong about its future importance is minimal.

If major AI platforms do start systematically reading these files (and there is a reasonable argument they will, because clean structured context serves their interests), having a good file already in place means you capture that benefit without a reactive scramble.

The process itself is useful. Building a good llms.txt forces you to think clearly about what content on your site an AI system should prioritise: product information quality, policy documentation, FAQ coverage. These are GEO-relevant concerns regardless of whether the llms.txt mechanism ever achieves wide uptake.
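Whether a file is well-structured can be checked mechanically. The Python sketch below is one possible lint pass, not an official validator: the function name check_llms_txt and its rules are this sketch's own reading of the layout described earlier in the article, and it is deliberately strict in flagging links that lack a description.

```python
import re

# A markdown list item of the form "- [title](url): description".
LINK_PATTERN = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.+))?$"
)


def check_llms_txt(text: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    non_empty = [line.rstrip() for line in text.splitlines() if line.strip()]

    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first non-empty line should be an H1 with the site name")
    if not any(line.startswith("> ") for line in non_empty):
        problems.append("no blockquote summary found")
    if not any(line.startswith("## ") for line in non_empty):
        problems.append("no H2 sections found")

    for line in non_empty:
        if line.startswith("- "):
            match = LINK_PATTERN.match(line)
            if match is None:
                problems.append(f"list item is not a markdown link: {line!r}")
            elif not match.group("desc"):
                problems.append(f"link has no description: {match.group('title')}")
    return problems


# Hypothetical inputs, made up for illustration.
good = (
    "# Shop\n\n> Summary.\n\n## Products\n\n"
    "- [Tents](https://shop.example/tents.md): Range overview\n"
)
bad = "## Products\n\n- [Tents](https://shop.example/tents.md)\n"

print(check_llms_txt(good))
print(check_llms_txt(bad))
```

A check like this could run in a deployment pipeline, so that the file does not silently drift out of shape as pages are added or retired.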

The case against urgency: there are higher-certainty things you could be doing instead. Improving your Product schema markup. Improving machine-readable content quality on product pages. Building Q&A content. Schema.org has been in documented production use at search engines since 2011, when Google, Microsoft, and Yahoo launched it jointly, with Yandex joining later that year. That multi-stakeholder origin helps explain why it has deeper structural integration in AI systems today. llms.txt is about one month old. That asymmetry matters for how you prioritise.

llms.txt is the kind of proposal that either becomes infrastructure or becomes a footnote. The low cost of implementation means the right call for most ecommerce businesses is probably to do it, do it well, and then watch what happens with the same engaged scepticism that served us well through the robots.txt era.

If you want to know how that story played out, the answer is in the follow-up piece a year on.

Data sources and further reading

Jeremy Howard's original llms.txt proposal, Answer.AI, September 2024
Official llms.txt specification, llmstxt.org
