Tech Giants Blocking AI: The Industry That Built It, Blocks It Most
Published 2026-04-20 · PROGEOLAB Research
The tech-giant AI-blocking irony is the pattern in which the companies that build AI infrastructure or sell software products are the most aggressive at blocking AI crawlers. Across the Fortune 500, software is the sector with the highest AI-blocking rate — 50% in our (small) sample — while the companies that supply AI training compute (Nvidia, Dell) are among the most AI-visible. The industry that built it, blocks it most.
Across the 15 Fortune 500 technology companies sampled in our audit, the distribution is stark; nine representatives:
| Company | Chrome endpoints | ChatGPT endpoints | robots.txt AI policy | Score |
|---|---|---|---|---|
| Dell | 58 | 59 | Allow (1 bot) | 10 |
| Apple | 14 | 14 | — | 8 |
| HP | 8 | 7 | Allow (10 bots) | 8 |
| SAP | 2 | 2 | Allow (13 bots) | 7 |
| Meta | 2 | 14 | Allow (8 bots) | 5 |
| Salesforce | 13 | 0 | None | 2 |
| IBM | 12 | 0 | None | 2 |
| Amazon | — | — | Block (16 bots) | 1 |
| Oracle | 13 | 0 | None | 0 |
Amazon: the most thorough block
Amazon's robots.txt is 5,888 bytes with 48 User-agent sections. Sixteen of those sections name AI-specific crawlers, every one with Disallow: /. No other Fortune 500 company approaches this thoroughness. Amazon also has no JSON-LD, no llms.txt, no sameAs links — the AI-visibility score is 1, and the single point reflects that amazon.com remains accessible to Chrome.
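A count like this can be reproduced mechanically. A minimal sketch, assuming an illustrative bot list and an illustrative robots.txt fragment (not Amazon's actual file — a real audit would fetch the live file from the domain):

```python
# Count robots.txt User-agent sections that name AI crawlers
# and fully disallow them (Disallow: /). The AI_BOTS list and
# the sample file below are illustrative assumptions.
AI_BOTS = {"gptbot", "chatgpt-user", "claudebot", "perplexitybot",
           "google-extended", "ccbot", "bytespider"}

def ai_block_sections(robots_txt: str) -> list[str]:
    """Return the AI user-agents that are blocked with Disallow: /."""
    blocked, current = [], None
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            current = value.lower()
        elif field == "disallow" and value == "/" and current in AI_BOTS:
            blocked.append(current)
    return blocked

sample = """
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Googlebot
Disallow: /private
"""
print(ai_block_sections(sample))  # ['gptbot', 'claudebot']
```

Note that a partial block (Googlebot's Disallow: /private above) is deliberately not counted; Amazon's AI sections are distinctive precisely because all sixteen disallow the entire site.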
The strategic logic is internally consistent: Amazon's competitive advantage is the transaction, not the citation. When AI extracts and redistributes pricing, reviews, and availability, consumers may get answers without visiting Amazon. Every AI-mediated comparison that doesn't end with a click to amazon.com is a potential lost transaction.
Salesforce: the 205-link content directory nobody can read
Salesforce maintains one of the largest llms.txt files in the Fortune 500 — 206 lines, 205 curated links organized across product, developer, and enterprise documentation. The file was body-validated in our llms.txt adoption audit. Yet Salesforce blocks ChatGPT-User entirely (0 of 13 endpoints return content). The content directory exists for an audience that cannot read it.
This is the sharpest example of the Content-Access Contradiction documented in the pillar guide: marketing invested in AI-specific content; security blocked AI access; no cross-team review caught the conflict.
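The contradiction is detectable automatically once an audit has both signals: what the published policy allows and what the server actually serves. A minimal sketch with made-up audit results (the policy and observation dicts are illustrative, not Salesforce's data):

```python
# Flag the Content-Access Contradiction: bots that the published
# robots.txt policy invites in but that the WAF turns away at the
# HTTP layer. Both input dicts are illustrative assumptions.
def contradictions(robots_allows: dict[str, bool],
                   http_ok: dict[str, bool]) -> list[str]:
    """Bots allowed on paper but blocked in practice."""
    return [bot for bot, allowed in robots_allows.items()
            if allowed and not http_ok.get(bot, False)]

policy = {"GPTBot": True, "ChatGPT-User": True, "CCBot": False}
observed = {"GPTBot": True, "ChatGPT-User": False, "CCBot": False}
print(contradictions(policy, observed))  # ['ChatGPT-User']
```

A cross-team review amounts to running exactly this comparison before shipping either the robots.txt or the WAF rule.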
Meta: allow the crawlers, break the server
Meta's robots.txt explicitly allows 8 AI crawlers (GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, meta-externalagent, FacebookBot). But measured access is lopsided: Chrome reaches 2 of 64 endpoints; ChatGPT-User reaches 14. Meta's Cloudflare deployment appears to treat ChatGPT-User more permissively than generic datacenter Chrome traffic — possibly because Meta explicitly whitelisted OpenAI IP ranges without doing the same for other datacenter Chromium clients. Unintentional, but favourable to AI.
Dell, HP, SAP: tech's AI-native minority
Dell's 10/12 score comes from a 131-link llms.txt, JSON-LD, explicit Bytespider allow in robots.txt, and 59-of-64 ChatGPT accessibility. HP explicitly allows 10 AI bots and provides regional sitemaps for AI systems. SAP names 13 AI bots in robots.txt and runs security.txt. None of the three markets itself as AI-first; they simply haven't configured WAF rules that conflict with their content policy.
The companies that most loudly sell AI-powered products — Salesforce Einstein, Oracle Cloud AI, IBM Watson — are the ones whose web properties are most hostile to AI retrieval. The companies selling metal, chips, and servers are the ones whose web properties are most open. The pattern doesn't prove a thesis about AI product messaging; it does show that AI-visibility outcomes are uncorrelated with AI-brand marketing.