The AI Visibility Checklist: 53 Data-Backed Checks with Fortune 500 Benchmarks

Published 2026-04-20 · PROGEOLAB Research

The PROGEOLAB 53-Point AI Visibility Checklist is an audit-ready list of checks covering every known signal that determines whether AI answer engines can access, understand, and cite your website's content. Each item has a Fortune 500 benchmark attached: the check passes when you match or exceed the adoption rate of the top-25 AI-ready companies.

Use it as a pre-launch audit, a quarterly review, or a gap analysis against the Fortune 500 leaders. The checklist is grouped into six dimensions. Higher-impact items are listed first within each dimension.

Dimension 1 — Access (9 checks)

Homepage returns HTTP 200 to Chrome UA — 70.4% of Fortune 500 pass
Homepage returns HTTP 200 to ChatGPT-User UA — 60% pass
Homepage returns HTTP 200 to PerplexityBot UA
Homepage returns HTTP 200 to ClaudeBot UA
Homepage returns HTTP 200 to Googlebot UA (reverse-DNS verifiable)
No Layer-2 datacenter IP blocking for known AI crawler IPs — 95.2% pass (24 fail)
No Layer-3 TLS fingerprinting that rejects non-browser clients — 97% pass (15 fail)
Server does not return challenge pages (JavaScript-only responses) to AI UAs
No geographic blocking preventing AI crawlers based in US/EU datacenters from reaching the site
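
The per-UA status checks above can be approximated with a short script. This is a minimal sketch, not the PROGEOLAB test harness: the User-Agent values are simplified tokens chosen for illustration (real crawlers send longer strings, which should be copied from each vendor's documentation), and `fetch_status` and `access_report` are hypothetical helper names.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# Simplified User-Agent tokens: assumptions for illustration only.
# Replace them with the full strings published by each crawler's operator.
USER_AGENTS: Dict[str, str] = {
    "Chrome": "Mozilla/5.0 Chrome/120.0",
    "ChatGPT-User": "ChatGPT-User",
    "PerplexityBot": "PerplexityBot",
    "ClaudeBot": "ClaudeBot",
    "Googlebot": "Googlebot",
}


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for a GET sent with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 403, 429, 503 and similar arrive as exceptions
    except urllib.error.URLError:
        return 0  # connection refused, DNS failure, TLS rejection, timeout


def access_report(
    url: str, fetch: Callable[[str, str], int] = fetch_status
) -> Dict[str, bool]:
    """Map each UA name to True when the homepage answers with HTTP 200."""
    return {name: fetch(url, ua) == 200 for name, ua in USER_AGENTS.items()}
```

Calling `access_report("https://www.example.com/")` returns one pass/fail flag per UA. A spoofed User-Agent only exercises UA-based rules: it does not reproduce the crawler's source IPs or TLS fingerprint, and a 200 response can still be a challenge page, so the remaining Access checks need separate verification.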

Dimension 2 — Bot Policy (9 checks)

robots.txt exists and is reachable at /robots.txt
robots.txt names at least 5 AI crawlers explicitly — 7.5% pass
robots.txt distinguishes training crawlers from retrieval crawlers — 0% pass (first mover available)
robots.txt includes GPTBot directive (Allow or Disallow)
robots.txt includes ChatGPT-User directive
robots.txt includes ClaudeBot directive
robots.txt includes Google-Extended directive
robots.txt declares a sitemap URL
robots.txt policy matches WAF behavior (no declaration-enforcement gap)
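
Several of the Bot Policy checks reduce to reading the `User-agent` and `Sitemap` lines of robots.txt. The sketch below assumes the file body has already been downloaded as text. The crawler list is taken from this checklist (PerplexityBot is borrowed from Dimension 1), and `named_agents` and `bot_policy_checks` are hypothetical names. It only detects whether a crawler is named, not whether an Allow or Disallow rule follows, and it cannot see WAF behavior.

```python
from typing import Dict, Set

# Crawler tokens mentioned in this checklist; extend the tuple as needed.
AI_CRAWLERS = (
    "GPTBot",
    "ChatGPT-User",
    "ClaudeBot",
    "Google-Extended",
    "PerplexityBot",
)


def _strip_comment(line: str) -> str:
    """Drop a trailing '# comment' and surrounding whitespace."""
    return line.split("#", 1)[0].strip()


def named_agents(robots_txt: str) -> Set[str]:
    """Collect every User-agent value in the file, lowercased."""
    agents = set()
    for raw in robots_txt.splitlines():
        field, sep, value = _strip_comment(raw).partition(":")
        if sep and field.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def bot_policy_checks(robots_txt: str) -> Dict[str, bool]:
    """Evaluate the robots.txt checks that depend only on the file body."""
    agents = named_agents(robots_txt)
    checks = {f"names {bot}": bot.lower() in agents for bot in AI_CRAWLERS}
    named_count = sum(checks.values())
    checks["names at least 5 AI crawlers"] = named_count >= 5
    checks["declares sitemap"] = any(
        _strip_comment(raw).lower().startswith("sitemap:")
        for raw in robots_txt.splitlines()
    )
    return checks
```

Because the tuple holds exactly five tokens, the "at least 5" check passes here only when all five are named. Add further crawler tokens to the tuple if your policy covers more.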

Dimension 3 — Standards (10 checks)

llms.txt exists at domain root and is reachable — 2.8% pass (body-validated)
llms.txt contains at least 20 URLs organized in sections
llms.txt Content-Type is text/plain or text/markdown (not HTML)
llms.txt is reachable to ChatGPT-User specifically (not just Chrome)
security.txt follows RFC 9116 format — 15% pass
security.txt includes Contact and Expires fields
security.txt is OpenPGP-signed (optional but signals maturity)
sitemap.xml reachable and current
ads.txt present if running programmatic advertising
No soft-404 pages at AI-standard paths (ai.txt, agents.json, mcp.json)
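
The llms.txt checks can be run against a downloaded response body and its Content-Type header. The following is a rough sketch under two assumptions: that the file follows the markdown layout of the llms.txt proposal (`## ` section headings and `[title](url)` links), and that an `<html` tag in the body indicates a soft-404. The function name `llms_txt_checks` is hypothetical.

```python
import re
from typing import Dict

# Matches markdown links such as [Docs](https://example.com/docs).
LINK_PATTERN = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def llms_txt_checks(body: str, content_type: str) -> Dict[str, bool]:
    """Evaluate the body-level llms.txt checks from Dimension 3."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    urls = LINK_PATTERN.findall(body)
    sections = [line for line in body.splitlines() if line.startswith("## ")]
    return {
        "content type is text/plain or text/markdown": media_type
        in {"text/plain", "text/markdown"},
        "body is not HTML": "<html" not in body.lower(),
        "at least 20 URLs": len(urls) >= 20,
        "organized in sections": len(sections) >= 1,
    }
```

To cover the ChatGPT-User reachability check, fetch the file once per User-Agent and run the same function on each response.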

Dimension 4 — Structured Data (9 checks)

JSON-LD present on homepage — 24.4% pass
JSON-LD uses Corporation type (or more specific) not generic Organization
JSON-LD includes legalName, url, logo fields
JSON-LD includes numberOfEmployees — 8.7% pass
JSON-LD includes foundingDate
JSON-LD includes sameAs array
sameAs includes Wikidata QID URL — 0.6% pass (Apple, Comcast, Repsol)
sameAs includes Wikipedia entity URL
sameAs includes verified social media profiles (LinkedIn, X)
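
For illustration, here is what a JSON-LD object that satisfies most of the Structured Data checks might look like, together with a small validator. Everything in `EXAMPLE` is placeholder data for a fictional company (including the Wikidata QID), and the validator assumes a single top-level object rather than an `@graph` or an array. `structured_data_checks` is a hypothetical name, and the check for verified social profiles is left out because verification cannot be inferred from a URL.

```python
import json
from typing import Dict

REQUIRED_FIELDS = (
    "legalName",
    "url",
    "logo",
    "numberOfEmployees",
    "foundingDate",
    "sameAs",
)

# Placeholder values for a fictional company; none of this is real data.
EXAMPLE = {
    "@context": "https://schema.org",
    "@type": "Corporation",
    "legalName": "Example Corporation",
    "url": "https://www.example.com/",
    "logo": "https://www.example.com/logo.png",
    "numberOfEmployees": {"@type": "QuantitativeValue", "value": 1000},
    "foundingDate": "1990-01-01",
    "sameAs": [
        "https://www.wikidata.org/wiki/Q0",  # placeholder QID
        "https://en.wikipedia.org/wiki/Example",
        "https://www.linkedin.com/company/example",
    ],
}


def structured_data_checks(json_ld: str) -> Dict[str, bool]:
    """Check one JSON-LD object against the Dimension 4 field checks."""
    data = json.loads(json_ld)
    same_as = data.get("sameAs", [])
    if isinstance(same_as, str):  # schema.org allows a single string
        same_as = [same_as]
    checks = {f"has {field}": field in data for field in REQUIRED_FIELDS}
    checks["type is not generic Organization"] = data.get("@type") not in (
        None,
        "Organization",
    )
    checks["sameAs has Wikidata"] = any(
        "wikidata.org/wiki/Q" in url for url in same_as
    )
    checks["sameAs has Wikipedia"] = any(
        "wikipedia.org/wiki/" in url for url in same_as
    )
    return checks
```

Passing `json.dumps(EXAMPLE)` to the function yields all checks true. On a live page, the input would be the contents of the `<script type="application/ld+json">` element.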

Dimension 5 — Content (8 checks)

Homepage title tag follows Brand + Descriptor format — brand-only titles fail the check
Homepage meta description present and 120-160 chars — 148 Fortune 500 missing meta descriptions
Homepage has semantic H1 with company name or value proposition
Homepage text-to-code ratio exceeds 20% — Fortune 500 average 34%
Content is server-rendered, not JS-only — 9 Fortune 500 homepages fail
Pricing, availability, or specs are in HTML (not just images or PDFs)
Product documentation uses H2/H3 hierarchy extractable by AI
FAQ pages use FAQPage schema (if FAQs exist)
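
Three of the Content checks can be estimated from raw homepage HTML using only the standard library. This sketch makes an assumption the checklist does not spell out: text-to-code ratio is computed as visible text characters (outside `script` and `style`) divided by total HTML characters. `content_checks` is a hypothetical name.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class _PageScanner(HTMLParser):
    """Collect the meta description, H1 count, and visible text length."""

    def __init__(self) -> None:
        super().__init__()
        self.description: Optional[str] = None
        self.h1_count = 0
        self.text_chars = 0
        self._skip_depth = 0  # greater than 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def content_checks(html: str) -> Dict[str, bool]:
    """Evaluate meta description length, H1 presence, and text ratio."""
    scanner = _PageScanner()
    scanner.feed(html)
    ratio = scanner.text_chars / len(html) if html else 0.0
    desc = scanner.description
    return {
        "meta description is 120-160 chars": desc is not None
        and 120 <= len(desc) <= 160,
        "has an H1": scanner.h1_count >= 1,
        "text-to-code ratio exceeds 20%": ratio > 0.20,
    }
```

Running this on HTML fetched without executing JavaScript also gives a rough signal for the server-rendering check: a page whose content is injected client-side will typically show no H1 and a very low ratio.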

Dimension 6 — Technical (8 checks)

HTTPS-only, with no mixed content (HTTP resources loaded on HTTPS pages)
Canonical URLs consistent (no http/https or www/non-www split)
hreflang tags for international sites
Server response time under 2s from common datacenter locations
No JavaScript-only rendering for critical content paths
Open Graph meta tags present (og:title, og:description, og:image)
Twitter/X Card meta tags present
AI crawler verification test runs quarterly (curl with all 4 crawler UAs from a non-corporate IP)

Scoring

Each check is binary (pass or fail). For the unweighted score, sum the passes, divide by 53, and multiply by 100. You can also weight the dimensions: in our internal scoring, Access counts 3× as much as Technical.

0-10 — Level 0, Invisible. Most Fortune 500 sit here accidentally.
11-25 — Level 1, Present but unoptimized. Chrome works, nothing else.
26-40 — Level 2, Partial. Some JSON-LD, some robots.txt, robots.txt-WAF contradictions likely.
41-60 — Level 3, Optimized. The 5-hour transformation lands here.
61-80 — Level 4, AI-Ready. Top-25 Fortune 500 tier.
81-100 — Level 5, AI-First. No Fortune 500 currently reaches this.
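
The scoring arithmetic and the level bands can be sketched as follows. Two assumptions are made here: weights are applied per dimension and default to 1.0 (the article only states the 3× ratio between Access and Technical), and each band's upper bound is inclusive, so a fractional score such as 10.5 falls into Level 1. The names `score` and `level` are hypothetical.

```python
from typing import Dict, Optional

# Number of checks in each dimension, as listed in the checklist.
DIMENSION_SIZES = {
    "Access": 9,
    "Bot Policy": 9,
    "Standards": 10,
    "Structured Data": 9,
    "Content": 8,
    "Technical": 8,
}

# (inclusive upper bound, level name), in ascending order.
LEVELS = [
    (10, "Level 0, Invisible"),
    (25, "Level 1, Present but unoptimized"),
    (40, "Level 2, Partial"),
    (60, "Level 3, Optimized"),
    (80, "Level 4, AI-Ready"),
    (100, "Level 5, AI-First"),
]


def score(
    passes: Dict[str, int], weights: Optional[Dict[str, float]] = None
) -> float:
    """Return a 0-100 score from per-dimension pass counts."""
    weights = weights or {}
    earned = sum(
        passes.get(dim, 0) * weights.get(dim, 1.0) for dim in DIMENSION_SIZES
    )
    possible = sum(
        size * weights.get(dim, 1.0) for dim, size in DIMENSION_SIZES.items()
    )
    return 100.0 * earned / possible


def level(score_value: float) -> str:
    """Map a 0-100 score to its audit level."""
    for upper, name in LEVELS:
        if score_value <= upper:
            return name
    raise ValueError("score must be between 0 and 100")
```

As a worked example, a site passing 9 Access, 3 Bot Policy, 2 Standards, 3 Structured Data, 6 Content, and 4 Technical checks has 27 passes, so the unweighted score is 27 / 53 × 100 ≈ 50.9 (Level 3). With Access weighted 3× and every other dimension at 1×, the same site scores 45 / 71 × 100 ≈ 63.4 (Level 4).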

Key takeaways

6 dimensions, 53 checks: Access (9) · Bot Policy (9) · Standards (10) · Structured Data (9) · Content (8) · Technical (8).
Every item benchmarked: pass/fail criteria are anchored to Fortune 500 data; "pass" means you match or exceed the Nvidia, Dell, Volkswagen tier.
Scoring produces a 0-100 score: 53 items weighted by importance. Most Fortune 500 companies score 15-25; the top 25 average 65+.
Audit ladder: Level 0 (0-10) → Level 1 (11-25) → Level 2 (26-40) → Level 3 (41-60) → Level 4 (61-80) → Level 5 (81-100).
5-hour transformation moves L0 to L3: the 10 highest-weighted items cover 60% of the score. The checklist sequences the work from most to least impactful.