AIME 2024

AIME-2024 · Mathematical reasoning

Live · 2024

AIME 2024 is a mathematical reasoning benchmark, published in 2024, that consists of the 30 problems from the 2024 American Invitational Mathematics Examination, a high-school mathematics competition. Contamination risk: low.

What this benchmark measures

The benchmark consists of the 30 problems from the 2024 American Invitational Mathematics Examination (AIME), a high-school mathematics competition.

The problems were released after most current models' training cutoffs. Top reasoning models score 75-90%; non-reasoning models score 10-30%.
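Because the benchmark has only 30 problems, each problem is worth about 3.3 percentage points, so the score bands above correspond to small whole-number counts of solved problems. The following is a minimal sketch of that arithmetic. It assumes a single run scored as correct or incorrect per problem, which this article does not specify; the function names `accuracy_percent` and `correct_range` are hypothetical and chosen only for illustration.

```python
NUM_PROBLEMS = 30  # stated in the article: AIME 2024 has 30 problems


def accuracy_percent(num_correct: int, num_problems: int = NUM_PROBLEMS) -> float:
    """Return single-run accuracy as a percentage (assumed scoring rule)."""
    if not 0 <= num_correct <= num_problems:
        raise ValueError("num_correct must lie between 0 and num_problems")
    return 100 * num_correct / num_problems


def correct_range(low_pct: int, high_pct: int,
                  num_problems: int = NUM_PROBLEMS) -> tuple[int, int]:
    """Return the (min, max) whole-problem counts whose accuracy lies in [low_pct, high_pct]."""
    min_correct = -(-low_pct * num_problems // 100)  # ceiling division
    max_correct = high_pct * num_problems // 100     # floor division
    return min_correct, max_correct


if __name__ == "__main__":
    print(correct_range(75, 90))  # band quoted for top reasoning models
    print(correct_range(10, 30))  # band quoted for non-reasoning models
    print(accuracy_percent(1))    # weight of a single problem, in percent
```

Under these assumptions, the 75-90% band corresponds to 23 to 27 problems solved and the 10-30% band to 3 to 9, so a difference of a few percentage points between two models can amount to a single problem.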

Claimed scores

Model | Score | Claim type | Reported | Citation
gpt-5 | 94.6% accuracy | vendor card | 2025-08-07 | OpenAI release

Interpretation guidance

Contamination risk: low

Benchmark items are unlikely to appear in training corpora, so scores are credible reflections of underlying capability.

How to cite this benchmark

Use the primary methodology source for academic citations; reference the Policy Window article for the cross-model leaderboard.

References

  1. AIME 2024 methodology
  2. gpt-5: 94.6% accuracy (OpenAI release, 2025-08-07)

Generated from the Policy Window catalog. Each claim cites the originating primary source.