What is llms.txt and should you worry about it?


Developers and marketers are adding llms.txt files to their sites to help large language models (LLMs) “understand” their content.

But what exactly is llms.txt, who uses it and – more importantly – should you care?

llms.txt is a proposed standard for helping LLMs access and interpret structured content from websites. You can read the full proposal at llmstxt.org.

In short, it’s a text file designed to tell LLMs where to find the good stuff: API documentation, return policies, product taxonomies, and other context-rich resources. The goal is to remove ambiguity by giving language models a curated map of high-value content, so they don’t have to guess what matters.

Screenshot from the proposed standard at https://llmstxt.org/.

In theory, it sounds like a good idea. We already use files like robots.txt and sitemap.xml to help search engines understand what’s on a site and where to look. Why not apply the same logic to LLMs?

But here’s the important part: no LLM provider currently supports llms.txt. Not OpenAI. Not Anthropic. Not Google.

As I said in the introduction, llms.txt is a proposed standard. I could propose a standard too (let’s call it send-me-robot-overlords.txt), but unless the major LLM providers agree to use it, it’s pretty much irrelevant.

This is where we are with llms.txt: it is a speculative idea without official adoption.

Don’t sleep on robots.txt

llms.txt may not affect your online visibility, but robots.txt definitely does.

You can use Ahrefs’ Site Audit to monitor hundreds of common technical SEO issues, including robots.txt problems that can seriously hurt your visibility (or even keep your site from being crawled at all).

Here’s what an llms.txt file looks like in practice. This is a screenshot of Anthropic’s real llms.txt file:

At its core, llms.txt is a Markdown document (a type of specially formatted text file). It uses H2 headers to organize links to key resources. Here’s an example structure you can use:

# llms.txt
## Docs
- /api.md
A summary of API methods, authentication, rate limits, and example requests.
- /quickstart.md
A setup guide to assist developers start using the platform quickly.
## Policies
- /terms.md
Legal terms outlining service usage.
- /returns.md
Information about return eligibility and processing.
## Products
- /catalog.md
A structured index of product categories, SKUs, and metadata.
- /sizing-guide.md
A reference guide for product sizing across categories.

You can create your own llms.txt in minutes (a quick sanity-check sketch follows these steps):

  1. Start with a basic Markdown file.
  2. Use H2s to group resources by type.
  3. Link to clean, structured Markdown files.
  4. Update it regularly.
  5. Host it at the root of your domain: https://yourdomain.com/llms.txt
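Once the file is live, it’s worth checking that it’s reachable and structured the way you intended. Here’s a minimal sketch in Python; the URL is a placeholder, and the parsing simply mirrors the H2-plus-links convention from the example above:

import re
import urllib.request

# Placeholder; swap in your own domain.
URL = "https://yourdomain.com/llms.txt"

def check_llms_txt(url: str) -> None:
    # Fetch the file; urlopen raises on 4xx/5xx responses.
    with urllib.request.urlopen(url, timeout=10) as resp:
        text = resp.read().decode("utf-8")

    # H2 section names ("## Docs", "## Policies", ...).
    sections = re.findall(r"^## (.+)$", text, flags=re.MULTILINE)
    # Linked resources (lines like "- /api.md").
    links = re.findall(r"^- (\S+)$", text, flags=re.MULTILINE)

    print(f"{len(sections)} sections: {', '.join(sections)}")
    print(f"{len(links)} linked resources: {', '.join(links)}")

if __name__ == "__main__":
    check_llms_txt(URL)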

You can create it yourself or use a free llms.txt generator (like this one) to do it for you.

I’ve read about some developers also experimenting with LLM-specific metadata in their llms.txt files, such as token budgets or preferred file formats (though there’s no evidence that crawlers or LLM models respect it).
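For illustration only, such experimental hints might look something like the snippet below. The field names here are hypothetical, and, as noted, no known crawler reads them:

## Docs
- /api.md
<!-- hypothetical, non-standard hints; no known crawler reads these -->
<!-- token-budget: 4000 -->
<!-- preferred-format: markdown -->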

You can see a list of companies using llms.txt at directory.llmstxt.cloud, a community-maintained directory of public llms.txt files.

Here are some examples:

  • Mintlify: A documentation platform for developers.
  • Tinybird: A real-time data platform.
  • Cloudflare: Lists performance and security docs.
  • Anthropic: Publishes a full map of its API documentation.

But what about the big players?

So far, no LLM provider has formally adopted llms.txt as part of its crawler protocol (see the robots.txt sketch after this list):

  • OpenAI (GPTBot): Honors robots.txt, but doesn’t officially use llms.txt.
  • Anthropic (Claude): Publishes its own llms.txt, but doesn’t state that its crawler uses the standard.
  • Google (Gemini/Bard): Uses robots.txt (via user-agent: Google-Extended) to manage AI crawling behavior, with no mention of llms.txt support.
  • Meta (Llama): No public crawler documentation and no guidance on using llms.txt.
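Since robots.txt is the mechanism these vendors actually document, it remains the practical way to manage AI crawler access today. A minimal sketch using the user-agent tokens the vendors publish (check each vendor’s docs for the current list):

# Block OpenAI's crawler site-wide
User-agent: GPTBot
Disallow: /

# Opt out of Google's AI training while staying in Search
User-agent: Google-Extended
Disallow: /

# Explicitly allow Anthropic's crawler everywhere
User-agent: ClaudeBot
Allow: /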

This highlights the important issue: creating an llms.txt file is not the same as getting crawlers to enforce it. Right now, most LLM providers treat llms.txt as an interesting idea, not something they’ve agreed to prioritize and follow.

Is llms.txt really useful?

In my opinion, no. Not yet, anyway.

There’s no evidence that llms.txt improves AI search visibility, drives traffic, or increases model accuracy. And no provider has committed to parsing it.

But it’s also very easy to set up. If you already have structured content, such as product pages or developer docs, compiling an llms.txt file is trivial. It’s just a Markdown file hosted on your site. There may be no observable benefit, but there’s also no risk. And if LLMs eventually do adopt it as a standard, early adopters may have a small advantage.

I think llms.txt is gaining traction because we all want to influence LLM visibility, but we lack the tools to do it. So we latch onto ideas that feel like control.

But in my personal opinion, llms.txt is a solution in search of a problem. Search engines already crawl and understand your content using existing standards like robots.txt and sitemap.xml. LLMs use much of the same infrastructure.

As Google’s John Mueller put it in a recent Reddit post:

AFAIK none of the AI services have said they’re using llms.txt (and you can tell when you look at your server logs that they don’t even check for it). To me, it’s comparable to the keywords meta tag: this is what a site owner claims their site is about … (Is the site really like that? Well, you can check it. At that point, why not just check the site directly?)

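You can run the same server-log check Mueller describes on your own site. Here’s a minimal sketch in Python, assuming a combined-format access log at a hypothetical path:

from collections import Counter

# Hypothetical path; point this at your real access log.
LOG_PATH = "/var/log/nginx/access.log"

# Tally which user agents have actually requested /llms.txt.
agents = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "llms.txt" in line:
            # In combined log format, the user agent is the last quoted field.
            agents[line.rsplit('"', 2)[-2]] += 1

if agents:
    for agent, hits in agents.most_common():
        print(f"{hits:>5}  {agent}")
else:
    print("No requests for llms.txt found.")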

Disagree with me, or want to share a counterexample? Message me on LinkedIn or X.
