Frequently Asked Questions
Everything you need to know about AI-readiness, llms.txt, and making your website work with AI agents.
AI-Readiness Basics
AI-readiness measures how well your website's content can be understood, extracted, and used by AI agents like ChatGPT, Claude, and Perplexity. As AI-powered tools become a major source of web traffic, sites that are AI-ready get cited more accurately, appear more often in AI-generated responses, and consume fewer tokens to process.
Unlike web browsers that render HTML visually, AI agents need to extract text content from your pages. They prefer clean, well-structured content over complex HTML with heavy styling. A well-structured page converted to Markdown uses 70-80% fewer tokens than raw HTML, making it cheaper and more efficient for AI providers.
The major AI crawlers include GPTBot (OpenAI/ChatGPT), ClaudeBot (Anthropic/Claude), PerplexityBot (Perplexity), Bytespider (ByteDance), and CCBot (Common Crawl), plus Google-Extended, the robots.txt token that controls whether Google may use your content for Gemini. New AI agents appear regularly as the ecosystem grows.
llms.txt
llms.txt is an emerging standard (defined at llmstxt.org) that helps AI agents understand your website's structure. Similar to how robots.txt guides search engine crawlers, llms.txt provides a Markdown-formatted overview of your site with links to key pages, making it easy for AI agents to navigate your content.
llms.txt is a concise index with a description and links to your site's main pages. llms-full.txt is an extended version that includes the actual content of those pages inline, giving AI agents everything in a single file without needing to follow links. Use llms.txt as a minimum, and llms-full.txt for comprehensive coverage.
Create a text file at your domain root (e.g., example.com/llms.txt) following the llmstxt.org spec. Start with a # heading (your site name), add a blockquote description, then list links organized in sections like ## Documentation and ## Main. AgentReady can generate a recommended llms.txt based on your page analysis.
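For example, a minimal llms.txt for a hypothetical site might look like this (the site name, URLs, and descriptions are placeholders):

```
# Example Site
> A short description of what the site offers and who it is for.

## Main
- [Home](https://example.com/): product overview
- [Pricing](https://example.com/pricing): plans and limits

## Documentation
- [Getting started](https://example.com/docs/getting-started): setup guide
- [API reference](https://example.com/docs/api): endpoints and parameters
```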
Markdown for AI
Markdown is the preferred format for AI agents because it preserves content structure (headings, lists, links, emphasis) while eliminating visual markup noise (CSS, JavaScript, layout divs). A Markdown version of your content uses significantly fewer tokens, making it faster and cheaper for AI systems to process.
Content negotiation allows your server to serve different formats of the same page based on the client's Accept header. When an AI agent sends Accept: text/markdown, your server can respond with a Markdown version instead of HTML. This is the most efficient way to serve AI-friendly content without creating separate URLs.
There are two main approaches: (1) Add server logic to detect Accept: text/markdown headers and return Markdown content; (2) Create .md files alongside your pages (e.g., /about.md for /about) and link to them from your llms.txt. AgentReady uses both approaches for its own pages.
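As a sketch of approach (1), here is a minimal Node.js/TypeScript server that inspects the Accept header and serves a pre-rendered .md file when the client asks for text/markdown. The pages/ directory and file layout are assumptions for illustration, not part of any standard:

```ts
import { createServer } from "node:http";
import { readFile } from "node:fs/promises";
import { join } from "node:path";

// Assumption: pre-rendered pages live side by side, e.g. pages/about.html and pages/about.md
const PAGES_DIR = "./pages";

const server = createServer(async (req, res) => {
  // Strip the query string and map "/" to "index" (sanitize paths properly in production).
  const path = (req.url ?? "/").split("?")[0] ?? "/";
  const name = path === "/" ? "index" : path.replace(/^\/+|\/+$/g, "");

  // Content negotiation: serve Markdown when the Accept header asks for it.
  const accept = req.headers.accept ?? "";
  const wantsMarkdown = accept.includes("text/markdown");
  const ext = wantsMarkdown ? "md" : "html";

  try {
    const body = await readFile(join(PAGES_DIR, `${name}.${ext}`), "utf8");
    res.writeHead(200, {
      "Content-Type": wantsMarkdown
        ? "text/markdown; charset=utf-8"
        : "text/html; charset=utf-8",
      // The response varies by Accept header, so caches must key on it.
      "Vary": "Accept",
    });
    res.end(body);
  } catch {
    res.writeHead(404, { "Content-Type": "text/plain" });
    res.end("Not found");
  }
});

server.listen(3000);
```

Sending Vary: Accept matters here: it tells caches and CDNs that the same URL can return different formats depending on the request, so an HTML response isn't cached and replayed to a client asking for Markdown.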
Structured Data & JSON-LD
JSON-LD (JavaScript Object Notation for Linked Data) is a way to embed structured data in your pages using Schema.org vocabulary. AI agents use this data to extract factual, machine-readable information like product details, article metadata, organization info, and more — without needing to parse your HTML.
Use the most specific type that matches your content: Article or BlogPosting for articles, Product for product pages, Organization for company pages, FAQPage for FAQ pages, LocalBusiness for local businesses, and WebApplication for web tools. Always include name, description, and relevant properties for your chosen type.
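For instance, Article markup embedded as JSON-LD might look like the following; all values are placeholders:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How to Make Your Site AI-Ready",
  "description": "A practical guide to llms.txt, Markdown negotiation, and structured data.",
  "author": { "@type": "Organization", "name": "Example Co" },
  "datePublished": "2025-01-15"
}
</script>
```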
Open Graph tags (og:title, og:description, og:image) provide standardized metadata that both social platforms and AI agents use to understand your page's title, description, and primary image. They're easy to implement and serve as a reliable fallback when other structured data is missing.
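A typical set of Open Graph tags in the page head looks like this (values are placeholders):

```html
<meta property="og:title" content="How to Make Your Site AI-Ready" />
<meta property="og:description" content="A practical guide to llms.txt, Markdown negotiation, and structured data." />
<meta property="og:image" content="https://example.com/images/og-cover.png" />
```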
robots.txt & AI Bots
robots.txt controls which bots can access your site and which pages they can crawl. AI crawlers like GPTBot and ClaudeBot respect robots.txt directives. If your robots.txt blocks these bots, they won't be able to index your content, which means your site won't appear in AI-generated responses.
To maximize visibility in AI-generated responses, allow at least: GPTBot (OpenAI), ClaudeBot and Claude-Web (Anthropic), PerplexityBot (Perplexity), and Google-Extended (Google Gemini). You can add specific Allow rules for these user agents while maintaining your existing rules for other bots.
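A sketch of robots.txt rules that explicitly allow these crawlers, kept alongside whatever rules you already have for other bots:

```
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
```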
Content-Signal is a newer directive, added to your robots.txt file, that tells AI agents how they may use your content. For example, Content-Signal: ai-train=yes, search=yes, ai-input=yes signals that your content may be used for AI training, search indexing, and as input for AI-generated responses. This emerging standard gives publishers explicit control over AI usage.
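Assuming the directive is placed inside a robots.txt group, a minimal sketch could look like this:

```
User-agent: *
Content-Signal: ai-train=yes, search=yes, ai-input=yes
Allow: /
```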
AgentReady Scoring
AgentReady fetches your page, extracts the content, and runs 21 individual checks across 5 weighted dimensions. Each check scores 0-100, and the dimensions are combined into an overall score from 0 to 100. You get a letter grade (A-F), detailed breakdown, and prioritized recommendations to improve your score.
The 5 dimensions are:
- Semantic HTML (20%): proper use of article, main, headings, and other semantic elements
- Content Efficiency (25%): token reduction ratio and content-to-noise ratio
- AI Discoverability (25%): llms.txt, robots.txt, sitemap, and Markdown negotiation
- Structured Data (15%): Schema.org, Open Graph, and meta tags
- Accessibility (15%): content available without JavaScript, page size, and content position
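The exact aggregation isn't documented here, but assuming a straightforward weighted average of the five dimension scores (each 0-100), the overall score could be computed like this sketch; the names and function are illustrative, not AgentReady's actual code:

```ts
type Dimension =
  | "semanticHtml"
  | "contentEfficiency"
  | "aiDiscoverability"
  | "structuredData"
  | "accessibility";

// Weights mirror the percentages listed above.
const WEIGHTS: Record<Dimension, number> = {
  semanticHtml: 0.20,
  contentEfficiency: 0.25,
  aiDiscoverability: 0.25,
  structuredData: 0.15,
  accessibility: 0.15,
};

// Each dimension score is on a 0-100 scale; the weighted result is also 0-100.
function overallScore(scores: Record<Dimension, number>): number {
  return (Object.keys(WEIGHTS) as Dimension[]).reduce(
    (sum, dim) => sum + scores[dim] * WEIGHTS[dim],
    0,
  );
}

// Example: 80*0.20 + 70*0.25 + 90*0.25 + 40*0.15 + 75*0.15 = 73.25
console.log(overallScore({
  semanticHtml: 80,
  contentEfficiency: 70,
  aiDiscoverability: 90,
  structuredData: 40,
  accessibility: 75,
}));
```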
Yes! Single-page analysis is completely free with no signup required. You get the full score, recommendations, Markdown conversion, and llms.txt preview. We're currently in beta with a limit of 5 analyses per hour. Full domain crawl and monitoring features are coming soon.
Useful Resources
- llmstxt.org — llms.txt specification
- schema.org — Schema.org vocabulary
- w3.org/TR/json-ld11 — W3C JSON-LD specification
- ogp.me — Open Graph Protocol
- robotstxt.org — robots.txt standard
- commonmark.org — CommonMark Markdown specification
- RFC 7231 — HTTP Content Negotiation