The Entity Disambiguation Gap: Only 3 of the Fortune 500 Link to Wikidata

Published 2026-04-20 · PROGEOLAB Research

The Entity Disambiguation Gap is the distance between the number of Fortune 500 companies that publish JSON-LD structured data (122) and the number that use the one property that lets AI systems identify them unambiguously: a sameAs link to Wikidata. That number is three: Apple (Q312), Comcast (Q1113804), and Repsol (Q174747).

Wikidata is the knowledge base AI answer engines use for canonical entity resolution. When a model sees the string "Apple" on a page, Wikidata helps it decide whether that means the technology company (Q312), the fruit (Q89), or Apple Corps, the record label (Q217780). A JSON-LD sameAs link pointing to the right Q-number tells the AI which "Apple" the page represents. Without it, resolution is probabilistic, which usually works for famous brands and often fails for everyone else.
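To make the mechanism concrete, here is a minimal sketch of how a consumer of the page might read the Q-number out of a sameAs value. The function name `wikidata_ids`, the `KNOWN_ENTITIES` lookup table, and the sample `page` object are illustrative assumptions, not part of any answer engine's actual pipeline; the three Q-numbers are the ones cited above.

```python
import re

# Labels for the three Q-numbers cited in this article. A real resolver
# would query Wikidata instead of using a hard-coded table (assumption).
KNOWN_ENTITIES = {
    "Q312": "Apple (technology company)",
    "Q89": "apple (fruit)",
    "Q217780": "Apple Corps",
}

# Matches Wikidata entity URLs and captures the Q-number.
WIKIDATA_URL = re.compile(r"^https?://www\.wikidata\.org/(?:entity|wiki)/(Q[1-9]\d*)$")


def wikidata_ids(jsonld: dict) -> list[str]:
    """Return every Wikidata Q-number found in a JSON-LD object's sameAs."""
    same_as = jsonld.get("sameAs", [])
    if isinstance(same_as, str):  # sameAs may be a single URL or a list
        same_as = [same_as]
    ids = []
    for url in same_as:
        match = WIKIDATA_URL.match(url) if isinstance(url, str) else None
        if match:
            ids.append(match.group(1))
    return ids


# Hypothetical homepage markup, already parsed into a dict.
page = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Apple",
    "sameAs": ["https://www.wikidata.org/entity/Q312"],
}

resolved = [KNOWN_ENTITIES.get(qid, "unknown entity") for qid in wikidata_ids(page)]
print(resolved)
```

With the sameAs link present, the string "Apple" maps to exactly one entity; with it removed, `wikidata_ids` returns an empty list and the consumer is back to guessing from context.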

Entity disambiguation funnel: 500 companies → 122 with JSON-LD → 55 with sameAs → 3 with Wikidata
Figure 1 · The entity disambiguation funnel across the Fortune 500. Source: PROGEOLAB, April 2026.

The funnel

  • 500 Fortune 500 companies probed
  • 388 responded with parseable HTML (the rest unreachable or non-HTML)
  • 122 include JSON-LD on the homepage (24.4%) — the entry point to structured data
  • 78 use generic Organization type, 15 use Corporation, 9 use Retailer, 4 use BankOrCreditUnion, 16 use other types
  • 55 include sameAs (11%) — linking to social profiles, Wikipedia, or other entity IDs
  • 3 link to Wikidata (0.6%) — Apple, Comcast, Repsol
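The funnel stages above can be approximated in code. The following is a rough sketch of one way to classify a homepage into a funnel stage using only the Python standard library; it is not the audit tooling used for this study, and the names `JsonLdExtractor`, `funnel_stage`, and the stage labels are assumptions made for illustration.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlparse


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # unparseable blocks are ignored in this sketch


def iter_nodes(value):
    """Yield every dict nested anywhere inside a parsed JSON value."""
    if isinstance(value, dict):
        yield value
        for child in value.values():
            yield from iter_nodes(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_nodes(child)


def funnel_stage(html: str) -> str:
    """Return 'no-json-ld', 'json-ld', 'sameAs', or 'wikidata' for one page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    if not parser.blocks:
        return "no-json-ld"
    links = []
    for node in iter_nodes(parser.blocks):
        same_as = node.get("sameAs")
        if isinstance(same_as, str):
            links.append(same_as)
        elif isinstance(same_as, list):
            links.extend(item for item in same_as if isinstance(item, str))
    if not links:
        return "json-ld"
    hosts = {urlparse(link).hostname for link in links}
    if hosts & {"www.wikidata.org", "wikidata.org"}:
        return "wikidata"
    return "sameAs"


sample = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Organization", '
    '"name": "Repsol", "sameAs": ["https://www.wikidata.org/entity/Q174747"]}'
    "</script></head></html>"
)
print(funnel_stage(sample))
```

Fetching the HTML is left out on purpose: how unreachable or non-HTML responses are counted is a methodology decision this sketch does not try to reproduce.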

The Repsol surprise

Finding Apple and Comcast on the sameAs-to-Wikidata list makes sense: both are large, technology-oriented public companies with mature SEO programs. Repsol is the outlier. A Spanish oil and gas company with no particular reputation for digital infrastructure excellence, it published JSON-LD with "sameAs": ["https://www.wikidata.org/entity/Q174747"] sometime in 2025. The addition does not appear in any public Repsol communications; we discovered it in our April 2026 audit.

Repsol's AI Readiness Score is 7, the highest of any energy company in the Fortune 500 and higher than most US banks. The Wikidata sameAs is one of the signals driving that score. Another is a 44-link llms.txt. An energy company that nobody writes AI-visibility articles about has invested more in AI disambiguation than IBM, Oracle, or Salesforce.

Why schema type matters

JSON-LD type distribution across 122 Fortune 500 companies
Figure 2 · JSON-LD type distribution. Generic Organization dominates; richer types are rare. Source: PROGEOLAB, April 2026.

The dominance of generic Organization (78 of 122, or 64%) is another disambiguation failure. Organization is appropriate for non-profits, trade associations, and small non-commercial entities, not for the world's largest public corporations. Using Corporation (which adds tickerSymbol on top of the properties it inherits from Organization, such as numberOfEmployees) or an industry-specific type (BankOrCreditUnion, Retailer, AutomotiveBusiness) signals more precisely what the entity is.

Only 15 of the 122 Fortune 500 companies with JSON-LD use Corporation. Of the other 107, the 78 on generic Organization picked a type too broad to help AI entity resolution.
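As an illustration of the type change, the sketch below swaps a generic Organization block for Corporation and attaches a ticker. The helper `with_specific_type`, the company name, the URL, and the ticker value are all hypothetical placeholders; the "exchange, space, instrument" shape of the ticker follows my reading of the schema.org tickerSymbol description and should be checked against it before use.

```python
import copy
import json


def with_specific_type(jsonld: dict, schema_type: str, **extra_properties) -> dict:
    """Return a copy of a JSON-LD object with a more specific @type.

    Extra keyword arguments are added as properties, e.g. tickerSymbol,
    which is defined on Corporation rather than on Organization.
    """
    updated = copy.deepcopy(jsonld)  # leave the caller's object untouched
    updated["@type"] = schema_type
    updated.update(extra_properties)
    return updated


# Hypothetical starting point: the generic markup most audited sites use.
generic = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Corp",
    "url": "https://www.example.com",
}

# Placeholder ticker, not a real listing.
specific = with_specific_type(generic, "Corporation", tickerSymbol="XNYS EXMPL")
print(json.dumps(specific, indent=2))
```

Everything already declared on the Organization block carries over unchanged, because Corporation is a subtype of Organization.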

The 10-minute fix

Adding sameAs to Wikidata is a 10-minute implementation: look up your Q-number on wikidata.org, add one line to your existing JSON-LD, and redeploy. The implementation guide walks through the exact sequence. For most Fortune 500 companies this is a single ticket in a CMS or a one-file PR in a Next.js / Gatsby repo. The competitive window is fully open: 99.4% of the Fortune 500 have not done it.
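The "add one line" step might look something like the sketch below, which appends a Wikidata entity URL to whatever sameAs values a JSON-LD object already has. The function name `add_wikidata_same_as` and the sample `existing` object are assumptions for illustration; the URL form matches the one found in Repsol's markup.

```python
import re

QID_PATTERN = re.compile(r"Q[1-9]\d*")


def add_wikidata_same_as(jsonld: dict, qid: str) -> dict:
    """Return a copy of jsonld whose sameAs includes the Wikidata entity URL.

    Existing sameAs values (a single string or a list) are preserved, and
    the URL is not added twice if it is already present.
    """
    if not QID_PATTERN.fullmatch(qid):
        raise ValueError(f"not a Wikidata Q-number: {qid!r}")
    url = f"https://www.wikidata.org/entity/{qid}"
    existing_links = jsonld.get("sameAs", [])
    if isinstance(existing_links, str):
        existing_links = [existing_links]
    links = list(existing_links)
    if url not in links:
        links.append(url)
    return {**jsonld, "sameAs": links}


# Hypothetical existing markup that links only to a social profile.
existing = {
    "@context": "https://schema.org",
    "@type": "Corporation",
    "name": "Repsol",
    "sameAs": ["https://www.linkedin.com/company/repsol"],
}

updated = add_wikidata_same_as(existing, "Q174747")
print(updated["sameAs"])
```

Validating the Q-number before building the URL guards against pasting a label or a full URL into the field by mistake, which is the most likely way a change this small goes wrong.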