FrontierMath

FRONTIER-MATH · math benchmark · 2024

Source: https://policywindow.org/wiki/frontiermath

Generated 2026-05-30T22:07:29 UTC

Summary

Hundreds of original mathematics problems, written and vetted by research mathematicians, that require deep multi-step reasoning. The problem set is used for held-out evaluation only.

At a glance

Score range: 0–100% accuracy
Contamination risk: low
Methodology URL: https://epochai.org/frontiermath
Saturation status: active

Details

An evaluation developed by Epoch AI. At launch, the top reasoning models solved under 2% of problems; OpenAI later reported 25% for o3-preview under a custom evaluation harness.

How to cite this article

APA

Policy Window. (2024). FrontierMath [Wiki article — Benchmark]. https://policywindow.org/wiki/frontiermath

Chicago

Policy Window. 2024. "FrontierMath." Wiki article (Benchmark). https://policywindow.org/wiki/frontiermath.

Harvard

Policy Window (2024) 'FrontierMath', Wiki article — Benchmark, available at: https://policywindow.org/wiki/frontiermath.

OSCOLA

Policy Window, 'FrontierMath' (Wiki article — Benchmark, 2024) <https://policywindow.org/wiki/frontiermath> accessed [date].

BibTeX

@misc{policywindow-frontiermath,
  title  = {FrontierMath},
  author = {Policy Window},
  year   = {2024},
  howpublished = {FRONTIER-MATH (2024)},
  url    = {https://policywindow.org/wiki/frontiermath},
  note   = {Primary source: https://epochai.org/frontiermath}
}