ARC-AGI v2
ARC-AGI-V2 · General reasoning
ARC-AGI v2 is a general reasoning benchmark, published in 2024, that measures abstract reasoning over visual grids. Each task requires inferring a transformation rule from 2-3 examples. Contamination risk: low.
What this benchmark measures
Abstract reasoning over visual grids. Each task requires inferring the transformation rule from 2-3 examples.
v2 launched in December 2024 with harder tasks designed to resist pure pattern matching. A $1M public prize is offered for scoring above 85% on the private set.
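To make the task format concrete, here is a minimal sketch of inferring a rule from a few examples. It assumes grids are lists of integer colour codes and uses a made-up toy task (mirror the grid left to right). The names `train_pairs`, `flip_horizontal`, and `fits_all_examples` are illustrative only and do not come from the benchmark's own tooling. Real v2 tasks are far harder than this.

```python
from typing import Callable, Dict, List, Tuple

Grid = List[List[int]]  # each cell is an integer colour code (assumed encoding)

# Hypothetical toy task: the hidden rule is "mirror the grid left to right".
train_pairs: List[Tuple[Grid, Grid]] = [
    ([[1, 0, 0], [0, 2, 0]], [[0, 0, 1], [0, 2, 0]]),
    ([[3, 3, 0], [0, 0, 4]], [[0, 3, 3], [4, 0, 0]]),
]
test_input: Grid = [[5, 0], [0, 6]]


def identity(grid: Grid) -> Grid:
    """Candidate rule: return the grid unchanged."""
    return [row[:] for row in grid]


def flip_horizontal(grid: Grid) -> Grid:
    """Candidate rule: reverse every row."""
    return [list(reversed(row)) for row in grid]


def fits_all_examples(rule: Callable[[Grid], Grid],
                      pairs: List[Tuple[Grid, Grid]]) -> bool:
    """A rule is kept only if it reproduces every example output exactly."""
    return all(rule(inp) == out for inp, out in pairs)


candidates: Dict[str, Callable[[Grid], Grid]] = {
    "identity": identity,
    "flip_horizontal": flip_horizontal,
}

# Keep the candidate rules consistent with all demonstration pairs.
consistent = [name for name, rule in candidates.items()
              if fits_all_examples(rule, train_pairs)]

# Apply the surviving rule to the unseen test input.
prediction = candidates[consistent[0]](test_input)
print(consistent, prediction)
```

The sketch checks candidates against a fixed list for brevity. The difficulty of the benchmark lies in the fact that the space of possible rules is open-ended, so a solver cannot simply enumerate them.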
Claimed scores
No claims have been recorded yet for this benchmark in the Policy Window catalog.
Interpretation guidance
Contamination risk: low
Benchmark items are unlikely to appear in training corpora, so scores are more likely to reflect underlying capability than memorization.
How to cite this benchmark
Use the primary methodology source for academic citations; reference the Policy Window article for the cross-model leaderboard.
- Primary methodology: https://arcprize.org/
- Wiki article: https://policywindow.org/wiki/arc-agi-v2
Related benchmarks (general reasoning)
- MMLU · 2020 · high contamination
- MMLU-Pro · 2024 · medium contamination
- GPQA Diamond · 2023 · low contamination