About

About This Project

Close, But How Close? is a research project by Sam Donahue, a SPAR Fellow at the Kairos Foundation. It systematically quantifies the gap between the benchmark scores that AI labs self-report and the scores measured independently by third-party evaluators, with a focus on comparing US and Chinese frontier AI capabilities.

What SPAR Is

SPAR (Supervised Program for Alignment Research) is a research fellowship program that supports rigorous empirical work on AI safety and governance. This project addresses a core challenge in AI policy: the lack of reliable, cross-validated data on comparative AI capabilities.

Key Findings

  • The US-China frontier AI capability gap ranges from 1.4% to 6.8%, depending on methodology (as of February 2026)
  • Self-reported benchmark scores exceed independently measured scores by 5.0 percentage points on average
  • This bias does not differ significantly between US and Chinese labs (Mann-Whitney U test, p = 0.13)
  • The gap has narrowed substantially since 2023, but recent trend direction is uncertain
  • Measurement is harder than commonly acknowledged: only 15 of 400+ benchmarks have sufficient cross-validated data
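
To make the bias and significance figures above more concrete, here is a minimal sketch of how such numbers could be computed. It is not the project's actual analysis code: the function names (`mean_bias`, `average_ranks`, `mann_whitney_u`) and all sample values are hypothetical, and the test uses the normal approximation with a tie correction, which is only rough for samples this small.

```python
import math
from collections import Counter
from statistics import mean


def mean_bias(pairs):
    """Mean of (self_reported - independent) over score pairs, in percentage points."""
    return mean(self_reported - independent for self_reported, independent in pairs)


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U, p), where U is the smaller of the two U statistics.
    Applies a tie correction but no continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    tie_term = sum(t**3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    sigma = math.sqrt(max(0.0, variance))
    if sigma == 0:
        return u, 1.0  # every value is identical, so there is no evidence of a difference
    z = (u - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


if __name__ == "__main__":
    # Hypothetical (self_reported, independent) score pairs, NOT project data.
    us_pairs = [(88.0, 84.0), (75.5, 69.0), (91.2, 88.0), (64.1, 57.0)]
    cn_pairs = [(86.5, 81.0), (72.8, 70.0), (90.0, 84.0), (60.4, 56.0)]

    us_bias = [s - i for s, i in us_pairs]
    cn_bias = [s - i for s, i in cn_pairs]

    print("US mean bias (pp):", round(mean_bias(us_pairs), 2))
    print("CN mean bias (pp):", round(mean_bias(cn_pairs), 2))
    u, p = mann_whitney_u(us_bias, cn_bias)
    print("Mann-Whitney U:", u, "p:", round(p, 3))
```

A large p-value here would mean the test finds no significant difference in bias between the two groups, which is the sense in which the finding above should be read. For real analyses, a statistics library with an exact small-sample method, such as SciPy's `scipy.stats.mannwhitneyu`, is likely the better choice.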

Citation

If you use this data or findings in your work, please cite:

@techreport{donahue2026close,
  title={Close, But How Close? Quantifying the US-China AI
         Capabilities Gap Across Multiple Evaluation Sources},
  author={Donahue, Sam},
  year={2026},
  institution={Kairos Foundation / SPAR}
}

Data & Code

All data powering this website is derived from three public sources: Artificial Analysis, Epoch AI, and LLM Stats. The matching pipeline and analysis code are available on GitHub.

Contact

For questions, feedback, or collaboration, reach out to Sam Donahue via the Kairos Foundation.