AIME 2024
AIME-2024 · Mathematical reasoning
AIME 2024 is a mathematical reasoning benchmark, published in 2024, consisting of the 30 problems from the 2024 American Invitational Mathematics Examination, a high-school mathematics competition. Contamination risk: low.
What this benchmark measures
The benchmark covers the 30 problems from the 2024 American Invitational Mathematics Examination, a high-school mathematics competition.
The problems were released after most current models' training cutoffs. Top reasoning models score 75-90%; non-reasoning models score 10-30%.
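To make the percentages concrete, here is a minimal sketch of how an accuracy score over the 30 problems could be computed. It assumes exact-match grading of integer answers (AIME answers are integers from 0 to 999); the function name `aime_accuracy` and the sample data are hypothetical, and the page does not state how any vendor actually grades its runs.

```python
from typing import Sequence


def aime_accuracy(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Return the percentage of problems whose predicted answer matches exactly."""
    if len(predicted) != len(reference):
        raise ValueError("need exactly one prediction per problem")
    if not reference:
        raise ValueError("reference answers must not be empty")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Hypothetical answer key and model output, for illustration only (not real AIME data).
reference = list(range(30))
predicted = list(range(27)) + [999, 999, 999]  # 27 right, 3 wrong

score = aime_accuracy(predicted, reference)
print(f"{score:.1f}%")  # 90.0%
```

With 30 problems, each problem is worth about 3.3 percentage points in a single run, so the 75-90% range corresponds to roughly 23 to 27 correct answers.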
Claimed scores
| Model | Score | Claim type | Reported | Citation |
|---|---|---|---|---|
| gpt-5 | 94.6% accuracy | vendor card | 2025-08-07 | OpenAI release |
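When reading a claimed score, it can help to check how many problems it implies. The sketch below is an illustration under the assumption that the score is a percentage over 30 problems; the helper names `implied_correct` and `is_single_run_score` and the rounding tolerance are hypothetical. A value that does not match a whole number of problems suggests averaging over several runs or a different evaluation setup, which the table does not specify.

```python
def implied_correct(score_percent: float, num_problems: int = 30) -> float:
    """Convert a percentage score into the implied number of correct answers."""
    return score_percent / 100.0 * num_problems


def is_single_run_score(score_percent: float, num_problems: int = 30,
                        tol: float = 0.05) -> bool:
    """Check whether the score equals k / num_problems for some whole k.

    tol is the allowed rounding error in percentage points (an assumption
    matching scores reported to one decimal place).
    """
    k = round(implied_correct(score_percent, num_problems))
    return abs(100.0 * k / num_problems - score_percent) <= tol


print(round(implied_correct(94.6), 2))  # 28.38
print(is_single_run_score(94.6))        # False
print(is_single_run_score(93.3))        # True (28 of 30)
```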
Interpretation guidance
Contamination risk: low
Benchmark items are unlikely to appear in training corpora, so scores are credible reflections of underlying capability.
How to cite this benchmark
Use the primary methodology source for academic citations; reference the Policy Window article for the cross-model leaderboard.
- Primary methodology: https://www.maa.org/math-competitions/american-invitational-mathematics-examination-aime
- Wiki article: https://policywindow.org/wiki/aime-2024
Related benchmarks (mathematical reasoning)
- MATH (Hendrycks) · 2021 · medium contamination
- FrontierMath · 2024 · low contamination