FrontierMath
FRONTIER-MATH · math benchmark · 2024
Source: https://policywindow.org/wiki/frontiermath
Generated 2026-05-30T22:07:29 UTC
Summary
Hundreds of original math problems, curated by research mathematicians, that require deep reasoning. The problems are held out and used for evaluation only.
At a glance
- Score range: 0–100% accuracy
- Contamination risk: low
- Methodology URL: https://epochai.org/frontiermath
- Saturation status: active
Details
An evaluation from Epoch AI. At launch, top reasoning models scored 2-5%; OpenAI reported 25% for o3-preview under a custom harness.
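To make the percentages above concrete, here is a minimal sketch of how an accuracy score on the 0–100% scale could be computed. It assumes each problem has a single reference answer and that a response counts only if it matches exactly. The function name `accuracy_percent` and the sample answers are hypothetical and are not taken from the Epoch AI harness.

```python
def accuracy_percent(predicted, reference):
    """Return the share of exactly matching answers as a percentage (0-100)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one problem is required")
    # Assumption: an answer is correct only if it equals the reference exactly.
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Made-up answers for four made-up problems (not real benchmark items).
reference = [42, 7, 1024, 3]
predicted = [42, 8, 1000, 5]
print(accuracy_percent(predicted, reference))  # 25.0
```

Under this scoring, a reported 25% means roughly one problem in four was answered correctly. Scores obtained under different harnesses are not necessarily comparable.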
How to cite this article
APA
Policy Window. (2024). FrontierMath [Wiki article — Benchmark]. https://policywindow.org/wiki/frontiermath
Chicago
Policy Window. 2024. "FrontierMath." Wiki article (Benchmark). https://policywindow.org/wiki/frontiermath.
Harvard
Policy Window (2024) 'FrontierMath', Wiki article — Benchmark, available at: https://policywindow.org/wiki/frontiermath.
OSCOLA
Policy Window, 'FrontierMath' (Wiki article — Benchmark, 2024) <https://policywindow.org/wiki/frontiermath> accessed [date].
BibTeX
@misc{policywindow-frontiermath,
title = {FrontierMath},
author = {Policy Window},
year = {2024},
howpublished = {FRONTIER-MATH (2024)},
url = {https://policywindow.org/wiki/frontiermath},
note = {Primary source: https://epochai.org/frontiermath}
}