The Entity Disambiguation Gap: Only 3 of 500 Fortune 500 Companies Link to Wikidata
Published 2026-04-20 · PROGEOLAB Research
The Entity Disambiguation Gap is the distance between the number of Fortune 500 companies that publish JSON-LD structured data on their homepage (122) and the number that use the one property that lets AI unambiguously identify them: sameAs to Wikidata. That number is three: Apple (Q312), Comcast (Q1113804), and Repsol (Q174747).
Wikidata is the knowledge base AI answer engines use for canonical entity resolution. When a model sees the string "Apple" on a page, Wikidata helps the model decide whether that's the technology company (Q312), the fruit (Q89), or Apple Corps, the record label (Q217780). A JSON-LD sameAs link pointing to the right Q-number tells the AI which "Apple" this page represents. Without it, the resolution is probabilistic, which usually works for famous brands and often doesn't for everyone else.
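To make the mechanism concrete, here is a minimal Python sketch of how a crawler might pull Wikidata Q-numbers out of a page's JSON-LD. It is an illustration, not the method used in this audit: the function name wikidata_ids, the URL pattern, and the sample markup are all assumptions, and the sketch only inspects top-level nodes (or nodes under @graph).

```python
import json
import re

# Assumed pattern: accepts both /entity/Q312 and /wiki/Q312 URL shapes.
WIKIDATA_URL = re.compile(
    r"^https?://(?:www\.)?wikidata\.org/(?:entity|wiki)/(Q[1-9]\d*)/?$"
)

def wikidata_ids(jsonld_text: str) -> list[str]:
    """Return Wikidata Q-numbers found in the sameAs values of a JSON-LD document."""
    data = json.loads(jsonld_text)
    if isinstance(data, dict):
        nodes = data.get("@graph", [data])
    elif isinstance(data, list):
        nodes = data
    else:
        return []
    found = []
    for node in nodes:
        if not isinstance(node, dict):
            continue
        same_as = node.get("sameAs", [])
        if isinstance(same_as, str):  # sameAs may be a single string or a list
            same_as = [same_as]
        for url in same_as:
            match = WIKIDATA_URL.match(url) if isinstance(url, str) else None
            if match:
                found.append(match.group(1))
    return found

# Hypothetical markup for illustration; not copied from any company's site.
example = """{
  "@context": "https://schema.org",
  "@type": "Corporation",
  "name": "Apple",
  "sameAs": ["https://www.wikidata.org/entity/Q312"]
}"""
print(wikidata_ids(example))
```

A page whose sameAs lists only social profiles returns an empty list, which is the case that leaves resolution to probability.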
The funnel
- 500 Fortune 500 companies probed
- 388 responded with parseable HTML (the rest unreachable or non-HTML)
- 122 include JSON-LD on the homepage (24.4% of the 500): the entry point to structured data
- Of those 122: 78 use the generic Organization type, 15 use Corporation, 9 use Retailer, 4 use BankOrCreditUnion, and 16 use other types
- 55 include sameAs (11% of the 500), linking to social profiles, Wikipedia, or other entity IDs
- 3 link to Wikidata (0.6% of the 500): Apple, Comcast, Repsol
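The funnel arithmetic can be reproduced with a short Python sketch. It assumes each stage is a subset of the stage before it, which is how the list reads; the names FUNNEL and funnel_rows are illustrative. Each row reports the share of all 500 companies and the share of the previous stage that survived.

```python
# Counts taken from the funnel list above.
FUNNEL = [
    ("probed", 500),
    ("parseable HTML", 388),
    ("JSON-LD on homepage", 122),
    ("sameAs present", 55),
    ("sameAs to Wikidata", 3),
]

def funnel_rows(steps):
    """Return (label, count, % of first stage, % of previous stage) per stage."""
    total = steps[0][1]
    previous = total
    rows = []
    for label, count in steps:
        pct_of_total = round(100 * count / total, 1)
        pct_of_previous = round(100 * count / previous, 1)
        rows.append((label, count, pct_of_total, pct_of_previous))
        previous = count
    return rows

for label, count, of_total, of_prev in funnel_rows(FUNNEL):
    print(f"{label:22} {count:4d}  {of_total:5.1f}% of 500  {of_prev:5.1f}% of previous stage")
```

Read stage by stage, the steepest drop is the last one: of the 55 companies that already publish sameAs, only 3 point it at Wikidata.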
The Repsol surprise
Seeing Apple and Comcast on the sameAs-to-Wikidata list makes sense: both are large, technology-oriented public companies with mature SEO programs. Repsol is the outlier. A Spanish oil and gas company with no particular reputation for digital infrastructure excellence published JSON-LD with "sameAs": ["https://www.wikidata.org/entity/Q174747"] sometime in 2025. The addition does not appear in any public Repsol communication; we found it during our April 2026 audit.
Repsol's AI Readiness Score is 7 — the highest of any energy company in the Fortune 500 and higher than most US banks. The Wikidata sameAs is one of the signals driving that score. The second is a 44-link llms.txt. An energy company that nobody writes AI-visibility articles about has invested more in AI disambiguation than IBM, Oracle, or Salesforce.
Why schema type matters
The dominance of generic Organization (78 of 122 = 64%) is another disambiguation failure. Organization is appropriate for non-profits, trade associations, and small non-commercial entities, not for the world's largest public corporations. Using Corporation (which adds the tickerSymbol property on top of everything Organization already offers, such as numberOfEmployees) or an industry-specific type (BankOrCreditUnion, Retailer, AutomotiveBusiness) signals more precisely what the entity is.
Only 15 of the 122 companies with homepage JSON-LD use Corporation. Of the other 107, 78 use generic Organization, which is too broad to help AI entity resolution, 13 use an industry-specific type (Retailer or BankOrCreditUnion), and 16 use other types.
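As a rough illustration of what moving off the generic type involves, the Python sketch below retypes an Organization node as Corporation and adds a ticker. The helper name to_corporation, the company "Example Corp", and the ticker are hypothetical. The "exchange, space, instrument" ticker format follows our reading of the schema.org tickerSymbol description and should be checked against the current definition.

```python
import json

def to_corporation(node: dict, ticker: str) -> dict:
    """Return a copy of a generic Organization node retyped as Corporation."""
    if node.get("@type") != "Organization":
        raise ValueError("expected a node typed as generic Organization")
    upgraded = dict(node)               # shallow copy; the input is left untouched
    upgraded["@type"] = "Corporation"   # Corporation is a subtype of Organization
    upgraded["tickerSymbol"] = ticker   # property defined on Corporation
    return upgraded

# Hypothetical company and ticker, for illustration only.
generic = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Corp",
    "url": "https://www.example.com/",
}
print(json.dumps(to_corporation(generic, "XNAS EXMP"), indent=2))
```

Because Corporation inherits from Organization, every property already in the node stays valid after the retype.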
The 10-minute fix
Adding sameAs to Wikidata is a 10-minute implementation: look up your Q-number on wikidata.org, add one line to your existing JSON-LD, and redeploy. The implementation guide walks through the exact sequence. For most Fortune 500 companies this is a single ticket in a CMS or a one-file PR in a Next.js / Gatsby repo. The competitive window is fully open: 99.4% of the Fortune 500 have not done it.
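The "add one line" step can be sketched in Python as follows. This is one possible approach under stated assumptions, not the implementation guide's code: the helper name add_wikidata_same_as is hypothetical, and the existing profile URL is a placeholder. The sketch handles the three shapes sameAs can already have (absent, a single string, or a list) and avoids adding a duplicate.

```python
import json

def add_wikidata_same_as(node: dict, qid: str) -> dict:
    """Return a copy of a JSON-LD node whose sameAs includes the Wikidata entity URL."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata Q-number: {qid!r}")
    url = f"https://www.wikidata.org/entity/{qid}"
    existing = node.get("sameAs", [])
    if isinstance(existing, str):       # normalize a single string to a list
        existing = [existing]
    updated = dict(node)
    updated["sameAs"] = list(existing) if url in existing else [*existing, url]
    return updated

# Placeholder markup; the Q-number is Repsol's, as cited in this article.
current = {
    "@context": "https://schema.org",
    "@type": "Corporation",
    "name": "Repsol",
    "sameAs": "https://example.com/profile",
}
print(json.dumps(add_wikidata_same_as(current, "Q174747"), indent=2))
```

Running the helper twice produces the same result, so it is safe to apply in a build step that runs on every deploy.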