Frontier AI Regulation
v1 vertical — where Policy Window has shipped depth.
Curated view over the catalog rows that map the frontier-AI regulation landscape: foundation-model obligations, compute reporting, agentic-system governance, catastrophic-risk frames, and the canonical debates that govern them. Every row here cites a primary source. No article body is LLM-written. Permanent citations via ?asOf=YYYY-MM-DD.
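As a minimal sketch of how a dated citation link could be built from the `?asOf=YYYY-MM-DD` parameter: the helper name `as_of_permalink` and the `example.org` URL are assumptions made for illustration, not part of the catalog's documented interface. Only the parameter name and date format come from the text above.

```python
from datetime import date
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def as_of_permalink(url: str, as_of: date) -> str:
    """Return `url` with an asOf=YYYY-MM-DD query parameter pinned to `as_of`.

    Hypothetical helper: any existing asOf value is replaced and other
    query parameters are kept in their original order.
    """
    parts = urlsplit(url)
    # Drop any earlier asOf so the link carries exactly one snapshot date.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "asOf"
    ]
    # date.isoformat() yields the YYYY-MM-DD form the parameter expects.
    query.append(("asOf", as_of.isoformat()))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL, used only to show the shape of the result.
print(as_of_permalink("https://example.org/frontier", date(2025, 1, 31)))
# prints https://example.org/frontier?asOf=2025-01-31
```

Replacing an existing `asOf` value rather than appending a second one keeps a copied link unambiguous about which snapshot it cites.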
Topics in this vertical (6)
Foundation Models / GPAI
Obligations specific to general-purpose / foundation models above certain capability thresholds.
Compute-Threshold Reporting
Mandatory reporting based on training-compute or capability thresholds.
Transparency Obligations
Disclosure requirements for training data, model cards, and system cards.
Individual Redress
Right to explanation, appeal mechanisms, complaint channels.
Catastrophic & Existential Risk
Governance of model capabilities that could cause mass casualties or civilisational-scale harms (CBRN uplift, autonomous replication, deceptive alignment). Distinct from the EU AI Act's 'systemic risk', which targets market-scale rather than catastrophic-scale harms.
Agentic AI Governance
Obligations specific to AI systems that take autonomous multi-step actions (browse, transact, plan, recurse). Distinct from foundation_models (capability) and catastrophic_risk (outcome): this is the action-surface frame. It surfaces in EU AI Office GPAI Code drafts, UK AISI agent evaluations, Seoul Frontier AI Safety Commitments §3, and NIST AI 600-1.
Top instruments by frontier-coverage density
Ranked by number of governs-type cells across the frontier topics above. Click to read the full instrument article.
- EU AI Act · EU · governs 4 frontier topics
- Seoul Declaration for Safe, Innovative and Inclusive AI · global · governs 4 frontier topics
- NIST AI RMF Generative AI Profile · US · governs 4 frontier topics
- Brazil AI Bill (PL 2338/2023) · BR · governs 4 frontier topics
- Anthropic Responsible Scaling Policy (RSP) v2 · US · governs 4 frontier topics
- Executive Order 14110 on Safe, Secure, and Trustworthy AI · US · governs 3 frontier topics
- G7 Hiroshima AI Process Code of Conduct · G7 · governs 3 frontier topics
- California SB-1047: Safe and Secure Innovation for Frontier AI Models Act · US · governs 3 frontier topics
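The ranking rule described above (count governs-type cells per instrument across the frontier topics, then sort) might look roughly like the sketch below. The cell layout `(instrument, topic, relation)`, the function name, the placeholder instrument names, and the alphabetical tie-break are all assumptions for illustration. Only the topic ids `foundation_models` and `catastrophic_risk` appear in the text; the sample rows are invented toy data, not catalog content.

```python
from collections import defaultdict
from typing import Iterable

# Hypothetical cell layout: (instrument, topic_id, relation).
Cell = tuple[str, str, str]


def rank_by_frontier_coverage(
    cells: Iterable[Cell], frontier_topics: set[str]
) -> list[tuple[str, int]]:
    """Rank instruments by how many distinct frontier topics they govern."""
    governed: dict[str, set[str]] = defaultdict(set)
    for instrument, topic, relation in cells:
        # Only governs-type cells on frontier topics count toward density.
        if relation == "governs" and topic in frontier_topics:
            governed[instrument].add(topic)
    counts = [(name, len(topics)) for name, topics in governed.items()]
    # Highest count first; ties broken alphabetically (an assumption).
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


# Toy rows with placeholder names, not real catalog data.
SAMPLE_CELLS: list[Cell] = [
    ("Instrument A", "foundation_models", "governs"),
    ("Instrument A", "catastrophic_risk", "governs"),
    ("Instrument B", "foundation_models", "governs"),
    ("Instrument B", "catastrophic_risk", "mentions"),
    ("Instrument C", "privacy", "governs"),
]

for name, count in rank_by_frontier_coverage(
    SAMPLE_CELLS, {"foundation_models", "catastrophic_risk"}
):
    print(f"{name} governs {count} frontier topics")
```

Collecting topics in a set means a duplicated cell cannot inflate an instrument's count, and instruments with no qualifying cells are left out of the ranking entirely.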
Frontier-safety concepts (28)
- Frontier-Tier AI
- AI Safety Level 3 (ASL-3)
- Systemic Risk (AI)
- Designated Systemic-Risk Model
- Compute Threshold (AI Governance)
- Red-Team Evaluation
- AI Alignment
- Deceptive Alignment
- Mesa-Optimization
- Scalable Oversight
- Capability Elicitation
- Dual-Use Research Norms (DURC for AI)
- Provenance & Watermarking
- AI Supply Chain
- Training-Data Attribution
- Prompt Injection
- Agentic AI System
- Tool-Use Safety
- Multi-Turn Evaluation
- Data Poisoning
- Model Distillation Risk
- Jailbreak Resistance
- Model-Merging Risk
- Inference-Time Compute
And 4 more — see the full concept index.
Capability benchmarks (10)
What we measure when we measure frontier capability. Each benchmark page documents methodology, score leaderboard, and contamination risk.
- SWE-bench Verified · agentic
- MMLU · general_reasoning
- MMLU-Pro · general_reasoning
- GPQA Diamond · general_reasoning
- ARC-AGI v2 · general_reasoning
- HumanEval · code
- MATH (Hendrycks) · math
- AIME 2024 · math
- Humanity's Last Exam · knowledge
- FrontierMath · math
Canonical debates (5)
Structured controversies where the frontier-regulation decision space is contested. Each debate page lays out the competing positions with primary-source citations for each side.
- Open-Source vs Closed-Source Frontier Models — Should the most-capable AI models be released under permissive licenses (open weights), or only via API / structured-access agreements? The dispute is foundational to nearly every frontier-AI governance instrument.
- Pause AI vs Accelerate Capabilities — Should the global community impose temporary or capability-conditional pauses on frontier-AI development, or should development accelerate with safety work conducted in parallel?
- Pre-Deployment Red-Team vs Post-Deployment Audit — Should AI capability + safety evaluations happen primarily before deployment (red-team gating release), or primarily after (post-deployment audit + incident response)?
- Risk-Based vs Principles-Based vs Ex-Post Liability Regimes — Should AI governance work via (a) risk-based ex-ante categorisation + obligations (EU), (b) high-level principles delegated to sector regulators (UK / OECD / G7), or (c) ex-post liability + civil litigation (US sectoral)?
- Compute vs Behavioural Capability Thresholds — Should the regulatory trigger for 'frontier' / 'foundation' / 'systemic-risk' status be training-compute thresholds (objective + ex-ante observable), or behavioural-capability evaluation (more semantically meaningful but operationally costly)?