Living evidence map · scoping-review idiom
Italy Law No. 132/2025 on Artificial Intelligence (Legge 23 settembre 2025, n. 132)
IT-AILAW-2025 · IT
Italy's Law No. 132/2025 ("Disposizioni e deleghe al Governo in materia di intelligenza artificiale") is the first organic national AI statute adopted by an EU member state. It was adopted on 23 September 2025, published in Gazzetta Ufficiale Serie Generale n. 223 on 25 September 2025, and entered into force on 10 October 2025. It does not replace the EU AI Act (Reg. (EU) 2024/1689): Art. 1(2) requires the law to be interpreted and applied in conformity with that Regulation, and Art. 2 imports the AI-system/AI-model definitions from it.
The Act is part principles-and-sector statute, part delegation (delega) to the Government. Capo I sets human-centric principles (Arts. 1–6), including an explicit national-security/defence/intelligence/cybersecurity carve-out from the law's scope (Art. 6) and a parental-consent rule for under-14 access (Art. 4(4)). Capo II adds sector rules: healthcare (Art. 7: non-discrimination in access, patient information, human medical decision reserved), labour (Art. 11: transparency and worker-notification duties; Art. 12: workplace-AI Observatory), intellectual professions (Art. 13), public administration (Art. 14), and the judiciary (Art. 15: interpretation, fact/evidence evaluation and adoption of measures reserved exclusively to the magistrate). Capo III governs national strategy and authorities, designating AgID and ACN as the national AI authorities (Art. 20), with Banca d'Italia/CONSOB/IVASS as market-surveillance authorities. Art. 23 funds investment in AI/cybersecurity/quantum; Arts. 16 and 24 delegate organic decrees (incl. training-data rules and EU-AI-Act alignment) within 12 months. Capo IV recognises copyright in AI-assisted works where they result from the author's human intellectual contribution and adds a text-and-data-mining provision (Art. 25; new Art. 70-septies l. 633/1941). Capo V adds criminal provisions, notably a new offence of illicit dissemination of AI-generated/altered content (deepfakes), punishable by 1 to 5 years' imprisonment (Art. 26; new Art. 612-quater c.p.), plus AI aggravating circumstances.
The Italian primary text was read verbatim; English provision excerpts are marked isParaphrase where they render the Italian.
Background & scope
Italy Law No. 132/2025 on Artificial Intelligence (Legge 23 settembre 2025, n. 132) addresses 8 contested AI-governance topics explicitly, 7 via general principles.
Provisions & coverage
- governs · Deepfakes / Synthetic Content · Art. 26(1)(c) → c.p. Art. 612-quater[1]
- governs · AI in Employment · Art. 11(2)-(3)[1]
- governs · AI in Healthcare · Art. 7(2),(3),(5)[1]
- governs · AI in Criminal Justice · Art. 15(1)[1]
- implicit · AI in Education · Art. 24(2)(g),(i)[1]
- governs · Transparency Obligations · Art. 4(3); Art. 13(2)[1]
- implicit · Individual Redress · Art. 4(3); Art. 16(3)(b)[1]
- governs · Training-Data Rights · Art. 25(1)(b) → Art. 70-septies; Art. 16[1]
- implicit · Sovereign AI Doctrine · Art. 5; Art. 19; Art. 23[1]
- governs · Technological Sovereignty · Art. 5(1)(a),(d)[1]
- implicit · International Coordination · Art. 19(3); Art. 20(2)[1]
- implicit · Synthetic Content Provenance · Art. 26(1)(c); Art. 4[1]
- implicit
- governs
- implicit
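The headline count in the Background & scope line can be checked against the status labels in the list above. A minimal sketch, assuming the fifteen labels are read in the order they appear; the variable names are illustrative and not part of the site's data model.

```python
from collections import Counter

# Coverage labels in the order they appear in the list above
# (twelve named topics, then three entries that carry only a label).
statuses = [
    "governs", "governs", "governs", "governs", "implicit",
    "governs", "implicit", "governs", "implicit", "governs",
    "implicit", "implicit", "implicit", "governs", "implicit",
]

tally = Counter(statuses)
print(tally["governs"], tally["implicit"], len(statuses))
```

The tally gives 8 "governs" and 7 "implicit" labels across 15 entries, which matches the stated split between explicit coverage and coverage via general principles.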
Enforcement & impact
Cross-jurisdiction comparison
How peer instruments treat the topics Italy Law No. 132/2025 on Artificial Intelligence (Legge 23 settembre 2025, n. 132) governs.
| Topic | EU-AIA-2024 | US-EO-14110 | US-EO-14179 | UK-WHITEPAPER-2023 | CN-GENAI-2023 | G7-HIROSHIMA | OECD-AI-PRIN | COE-AI-CONV | UN-RES-2024 | NIST-AI-RMF | BLETCHLEY-2023 | SEOUL-2024 | NIST-AI-RMF-GENAI | CA-SB-1047 | IN-DPDP-2023 | BR-AIBILL-2024 | ASEAN-AI-GUIDE-2024 | AU-AI-STRATEGY-2024 | ANTHROPIC-RSP-2024° | OPENAI-PREPAREDNESS-2023° | DEEPMIND-FSF-2024° | META-FRONTIER-2024° | UK-US-AISI-MOU-2024 | WH-VOLUNTARY-2023 | SG-MODEL-AI-2024 | JP-METI-AI-2024 | EU-GDPR-2016 | EU-GPAI-COP-2025 | OMB-M-24-10 | GSA-AI-GUIDE-2024 | DOD-RAI-2022 | FEDRAMP-AI-2024 | DFARS-252-204 | CA-SB-53 | CA-SB-243 | CA-SB-942 | EU-PLD-2024 | UNESCO-AI-ETHICS-2021 | EU-PWD-2024 | CN-DEEPSYN-2022 | NY-RAISE-2025 | US-TAKEITDOWN-2025 | JP-AIPROMO-2025 | UN-GDC-2024 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Deepfakes / Synthetic Content | governs | governs | silent | silent | governs | governs | silent | silent | implicit | implicit | silent | silent | governs | silent | governs | silent | silent | silent | silent | silent | silent | silent | silent | governs | governs | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | implicit | silent | silent | silent | governs | silent | governs | silent | silent |
| AI in Employment | governs | implicit | silent | implicit | silent | silent | silent | implicit | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | implicit | silent | silent | silent | silent | silent | silent | silent | silent | governs | governs | silent | silent | silent | silent | silent |
| AI in Healthcare | governs | implicit | silent | implicit | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | implicit | silent | silent | silent | silent | silent | implicit | silent | silent | governs | silent | silent | silent | silent | silent | silent |
| AI in Criminal Justice | governs | governs | silent | implicit | silent | silent | silent | governs | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | implicit | silent | silent | silent | silent | silent | silent |
| Transparency Obligations | governs | implicit | silent | implicit | conflicts | governs | governs | governs | implicit | governs | implicit | governs | governs | implicit | implicit | governs | governs | silent | governs | implicit | implicit | governs | implicit | governs | governs | governs | governs | governs | governs | governs | governs | governs | silent | governs | governs | governs | implicit | governs | governs | governs | governs | silent | governs | governs |
| Training-Data Rights | implicit | silent | silent | silent | governs | silent | silent | implicit | silent | implicit | silent | silent | governs | silent | governs | implicit | silent | implicit | silent | silent | silent | implicit | silent | silent | silent | implicit | governs | governs | silent | implicit | silent | implicit | governs | silent | silent | silent | silent | governs | silent | governs | silent | silent | implicit | implicit |
| Technological Sovereignty | implicit | governs | silent | implicit | governs | implicit | silent | silent | implicit | silent | silent | silent | silent | silent | silent | silent | implicit | governs | silent | silent | silent | silent | silent | silent | implicit | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | implicit | silent |
| National Security Carveouts in AI Regulation | governs | governs | silent | implicit | silent | silent | silent | governs | silent | silent | silent | silent | silent | silent | implicit | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | implicit | governs | implicit | governs | silent | silent | silent | silent | silent | silent | implicit | silent | silent | implicit | silent |
°= industry self-imposed voluntary framework. Comparing a voluntary code's "governs" tint with a binding regulation's "governs" tint flattens the legal-force distinction; use the instrument-page banner for the operative status of each.
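One way to keep that distinction visible when reading a row is to split its "governs" cells by the ° marker. A minimal sketch using six cells copied from the Transparency Obligations row; the function name and dictionary layout are illustrative assumptions, and an unmarked column is not necessarily binding law (the instrument-page banner remains the reference for operative status).

```python
VOLUNTARY_MARK = "°"  # footnote marker used in the table header

# Six cells copied from the "Transparency Obligations" row above.
transparency_row = {
    "EU-AIA-2024": "governs",
    "CN-GENAI-2023": "conflicts",
    "ANTHROPIC-RSP-2024°": "governs",
    "OPENAI-PREPAREDNESS-2023°": "implicit",
    "DEEPMIND-FSF-2024°": "implicit",
    "META-FRONTIER-2024°": "governs",
}

def split_governs(row):
    """Return (unmarked, voluntary) lists of instruments whose cell is 'governs'."""
    unmarked, voluntary = [], []
    for instrument, status in row.items():
        if status != "governs":
            continue  # only 'governs' cells are compared here
        if instrument.endswith(VOLUNTARY_MARK):
            voluntary.append(instrument.rstrip(VOLUNTARY_MARK))
        else:
            unmarked.append(instrument)
    return unmarked, voluntary

unmarked, voluntary = split_governs(transparency_row)
print("unmarked:", unmarked)
print("voluntary (°):", voluntary)
```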
See also
Per-audience views
- Provisions → Article-by-article obligation breakdown for procurement + RFP authors.
- Disclosure form → Vendor-disclosure questionnaire derived from this instrument's operative obligations.
- Harm narratives → Documented harms relevant to this instrument's topics, for civil-society advocacy.
- Briefing pack → Journalist-ready summary with quotes + dates + primary-source links.
Further reading
198 academic & grey-literature sources on the topics this instrument addresses (not commentary on the instrument itself) — catalogued metadata with a primary link; one-line findings are ✦ AI-generated summaries, labeled as such (charter §7.9). Browse the full literature index.
- Machines of justice: A systematic review of AI applications in policing and criminal justice · Peer-reviewed · ✦ AI: Synthesises a decade of AI-in-criminal-justice research, flagging "algorithmic bias, opacity, and due process" and recommending safeguards for equity and accountability.
- Missing the Mark: Adoption of Watermarking for Generative AI Systems in Practice and Implications Under the New EU AI Act · Peer-reviewed · ✦ AI: Empirical audit finds only 38% of AI image generators implement adequate watermarking and 18% deepfake labelling, exposing a compliance gap under EU AI Act Article 50.
- Current state of Food and Drug Administration-approved artificial intelligence/machine learning medical devices: pathways, transparency, and evidence gaps · Peer-reviewed · ✦ AI: Documents that most FDA AI/ML devices clear via the 510(k) pathway with limited clinical validation and poor transparency, exposing regulatory evidence gaps.
- Open Foundation Models and TDM Exceptions to Copyright – Building Blocks for an AI Ecosystem · Peer-reviewed · ✦ AI: Argues Art. 3 CDSM Directive's scientific-research TDM exception 'does not grant rightsholders any control' and can be a 'safe harbor' for training openly released foundation models without licensing data.
- Geopolitical ecologies of cloud capitalism: Territorial restructuring and the making of national computing power in the U.S. and China · Peer-reviewed · ✦ AI: US and Chinese drives for sovereign AI/cloud dominance depend on reorganizing land, energy and regulatory systems to sustain large-scale national computing power.
- European ambitions captured by American clouds: digital sovereignty through Gaia-X? · Peer-reviewed · ✦ AI: Shows Gaia-X paradoxically incorporates dominant US cloud providers, undermining the very European digital sovereignty it was meant to advance.
- Predictive policing and predictive justice: Ethics, data protection, and the AI act · Peer-reviewed · ✦ AI: Examines how predictive-policing and predictive-justice systems interact with data-protection law and the AI Act's law-enforcement provisions, exposing accountability and oversight shortfalls.
- AI, Climate, and Regulation: From Data Centers to the AI Act · Peer-reviewed · ✦ AI: Analyses the legal levers (AI Act energy-reporting duties, Energy Efficiency Directive data-centre KPIs, sustainability reporting) for governing AI's climate footprint and their disclosure gaps.
- National Security and New Forms of Surveillance: From the Data Retention Saga to a Data Subject Centred Approach · Peer-reviewed · ✦ AI: Argues the CJEU's controller-based route for applying EU law to national-security surveillance 'creates significant legal uncertainties,' proposing a data-subject-focused scope instead.
- Cop out: security exemptions in the Artificial Intelligence Act (in: Automating Authority — AI in European police and border regimes) · Civil society · ✦ AI: Documents how AI Act security exemptions plus police powers to restrict supervisory information-sharing will make meaningful supervision of policing and migration AI 'extremely difficult.'
- The Current Landscape of Deepfake Legislation in the United States · Peer-reviewed · ✦ AI: Thematic analysis of 319 state deepfake bills (2019-2024) finds a fragmented patchwork concentrated on political and sexually-explicit content.
- Reimagining U.S. Tort Law for Deepfake Harms: Comparative Insights from China and Singapore · Peer-reviewed · ✦ AI: Argues fragmented US tort doctrines (defamation, publicity, IIED) are ill-suited to deepfake harms and draws remedial lessons from Chinese and Singaporean law.
+ 186 more across this instrument's topics — see the literature index.
References
The primary instrument sources behind the article's classifications.
- Legge 23 settembre 2025, n. 132, «Disposizioni e deleghe al Governo in materia di intelligenza artificiale», pubblicata nella Gazzetta Ufficiale della Repubblica Italiana, Serie Generale n. 223 del 25 settembre 2025 (codice redazionale 25G00143); in vigore dal 10 ottobre 2025.
- Art. 26(1)(c) inserts new Criminal Code Art. 612-quater: illicit dissemination of AI-generated or altered images/video/voices, without consent, apt to deceive and causing unjust harm — 1 to 5 years' imprisonment (querela-based; ex officio in aggravated cases).
- Art. 11 — workplace AI must be safe, reliable, transparent, non-discriminatory and not contrary to human dignity; employer must inform the worker of AI use (per Art. 1-bis D.Lgs. 152/1997). Art. 12 establishes a national Observatory on workplace AI.
- Art. 7 — AI must not condition access to healthcare on discriminatory criteria (¶2); patient right to be informed of AI use (¶3); the therapeutic decision is always reserved to the physician (¶5). Arts. 8–10 add research, data-processing and electronic-health-record provisions.
- Art. 15 — in judicial use of AI, decisions on legal interpretation/application, evaluation of facts and evidence, and adoption of measures are always reserved to the magistrate; AI limited to organisational/administrative support. Art. 24(2)(h) delegates a future regime for AI in policing.
- No operative schooling regime in force. Art. 24(2)(g) directs (as a delegation criterion) strengthening STEM/artistic competencies in school curricula; Art. 24(2)(i) requires AI-literacy training in universities/AFAM/ITS; Art. 15(4) promotes AI training for magistrates; Art. 22 supports youth.
- Multiple operative disclosure duties: Art. 4(3) clear-language information on AI data processing + right to object; Art. 7(3) patient information; Art. 11(2) worker notification; Art. 13(2) professional's duty to disclose AI use to the client.
- No general right to contest AI decisions. Art. 4(3) gives a right to object to authorised processing of one's personal data; Art. 16(3)(b) delegates the Government to provide compensatory/injunctive remedies and sanctions for training-data violations; the deepfake offence (Art. 612-quater) is prosecuted on the victim's complaint.
- Art. 25 (new Art. 70-septies l. 633/1941) permits text-and-data-mining reproductions/extractions for AI training from lawfully accessible material (per Arts. 70-ter/70-quater); Art. 16 delegates the Government to enact an organic regime on data, algorithms and mathematical methods for training AI.
- No explicit sovereign-model/sovereign-compute mandate. Supported indirectly by Art. 5 (technological sovereignty + national-data-centre preference), Art. 19 (biennial national AI strategy, dual-use coordination with the Ministry of Defence) and Art. 23 (state investment in AI, cybersecurity and quantum computing).
- Art. 5 — the State must promote AI to raise national competitiveness and the 'technological sovereignty of the Nation' (¶1(a)) and may steer public e-procurement to favour solutions localising strategic data and disaster-recovery/business-continuity in national data centres (¶1(d)).
- Art. 1(2)/Art. 2 align the law with EU Reg. 2024/1689; Art. 19(3) requires the national strategy to take account of international humanitarian law; Art. 20(2) designates ACN as the single contact point with EU institutions under AI-Act Art. 70.
- No standalone watermarking/provenance-marking duty in the law itself; provenance is reached only indirectly — Art. 612-quater criminalises deceptive AI-altered media (turning on whether content is apt to deceive as to genuineness) and the general transparency principle (Art. 4). Content-marking duties are left to the EU AI Act (Art. 1(2)).
- Art. 3(1) lists 'sostenibilità' (sustainability) among the binding general principles governing AI development and use, alongside transparency, proportionality, security and non-discrimination. No operative environmental-reporting or training-footprint duty.
- Art. 6 — activities for national-security purposes by the intelligence services, ACN cybersecurity/resilience, national-defence by the Armed Forces, and certain national-security policing are excluded from the law's scope (subject to fundamental-rights respect; further rules by regulation under l. 124/2007 art. 43).
- Art. 12 establishes a national Observatory on the adoption of AI in the workplace charged with study, monitoring and technical support on the occupational, organisational and training effects of AI; Art. 11(1) frames AI as improving working conditions and productivity. Monitoring, not displacement protection.
How to cite this article
Persistent identifier: https://policywindow.org/wiki/italy-ai-law-2025 (committed-stable URL with content-versioning via ?asOf=; rollout pending per methodology §7). DOIs via Zenodo are on the roadmap.
Does this instrument’s approach work? — the social-science evidence
Aggregated over the 15 topics this instrument governs: whether each harm is empirically real, and whether the peer-reviewed evidence shows governance reduces it. The badge is the epistemic status of the evidence: “thin” or “absent” efficacy evidence is itself a finding (the “second silence”). Each epistemic-status label is Policy Window's editorial assessment of the cited evidence base (a structured classification), not a verdict any single source issues.
Of the 15 governed topics with a social-science evidence review, evidence that governance reduces the harm is established for 0, contested for 0, thin for 5, and absent for 10 — for most, no replicated study yet shows this instrument's approach works (the "second silence").
AI-Driven Worker Displacement
AI-driven labour displacement is demonstrably real but localized rather than economy-wide as of 2025-2026. Causal microdata find measurable harm in directly exposed segments: a difference-in-differences study of the Upwork freelance market found that after ChatGPT's release, freelancers in more AI-exposed occupations (e.g. writing) saw ~2% fewer contracts and ~5% lower monthly earnings, with larger losses among previously high-skilled workers (Hui, Reshef & Zhou 2024). Effects concentrate in entry-level and highly-automatable roles while aggregate US employment and wages show little disruption through 2024-2025 — so macro-level harm remains genuinely contested even as targeted-segment harm is established; much deployment to date augments rather than substitutes, raising novice productivity ~34% in call-center work (Brynjolfsson, Li & Raymond 2025).
Sources: Hui, Reshef & Zhou 2024 ('The Short-Term Effects of Generative AI on Employment', Organization Science); Brynjolfsson, Li & Raymond 2025 ('Generative AI at Work', Quarterly Journal of Economics 140(2):889); Acemoglu 2024 ('The Simple Macroeconomics of AI', NBER WP 32487); Autor 2024 ('Applying AI to Rebuild Middle Class Jobs', NBER WP 32140)
There are essentially no impact evaluations of governance specifically targeting AI-driven displacement; current responses (OECD/GPAI guidance, reskilling initiatives, safety-net proposals) are at the recommendation stage, so 'does AI-displacement policy work' is answered only by extrapolation from the broader displaced-worker literature. That analogue base is robust but shows modest, mixed results: Card, Kluve & Weber's (2018) meta-analysis of 200+ active-labour-market evaluations finds training has small/insignificant short-run effects that improve only over the medium-to-long run, US Trade Adjustment Assistance evaluations find largely neutral-to-negative earnings effects (Schochet et al. 2012), and the JTPA randomized evaluation found weak earnings effects for the dislocated-worker stream. Recent syntheses note retraining yields smaller gains precisely when workers move into high-AI-exposure occupations — so the evidence that standard tools reduce AI-displacement harm is thin and early.
Sources: Card, Kluve & Weber 2018 ('What Works? A Meta-Analysis of ... Active Labor Market Program Evaluations', JEEA 16(3):894); Schochet et al. 2012 (Trade Adjustment Assistance Program impacts, Mathematica/USDOL); Bloom et al. 1997 (National JTPA Study, Journal of Human Resources); Brookings 2025 ('AI Labor Displacement and the Limits of Worker Retraining'); OECD 2023-2025 Employment Outlook
AI in Criminal Justice
Whether algorithmic risk assessment reproduces racial disparity is a genuine, partly mathematically irreducible dispute rather than merely an unresolved measurement question. ProPublica's analysis of COMPAS in Broward County found Black defendants who did not reoffend were nearly twice as likely to be flagged high-risk as comparable white defendants (44.9% vs 23.5% false-positive rate; Angwin et al. 2016), and Dressel & Farid (2018) showed COMPAS is no more accurate (65.2%) than untrained laypeople (67.0%); the developer's reanalysis (Flores, Bechtel & Lowenkamp 2016) found the same tool satisfies predictive parity and calibration across race. Honest caveat: Chouldechova (2017) proved both sides can be correct simultaneously — when recidivism base rates differ across groups, equal calibration and equal error rates cannot both hold, so the disagreement is partly definitional, not merely a data dispute to be settled.
Sources: Angwin, Larson, Mattu & Kirchner 2016 (ProPublica, 'Machine Bias'); Dressel & Farid 2018 (Science Advances 4:eaao5580); Flores, Bechtel & Lowenkamp 2016 (Federal Probation 80(2):38); Chouldechova 2017 (Big Data 5(2):153)
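The incompatibility can be made concrete with the identity Chouldechova (2017) derives from the confusion matrix, FPR = p/(1−p) · (1−PPV)/PPV · (1−FNR), where p is a group's recidivism base rate. A minimal sketch, assuming two groups that share the same PPV and FNR; the base rates, PPV and FNR below are hypothetical values chosen for illustration and are not COMPAS estimates.

```python
def false_positive_rate(base_rate, ppv, fnr):
    """FPR implied by a group's base rate, PPV and FNR (Chouldechova 2017 identity)."""
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

# Hypothetical values for illustration only, not COMPAS estimates.
PPV, FNR = 0.6, 0.3
fpr_a = false_positive_rate(0.5, PPV, FNR)  # group with the higher base rate
fpr_b = false_positive_rate(0.4, PPV, FNR)  # group with the lower base rate
print(round(fpr_a, 3), round(fpr_b, 3))
```

With predictive value and miss rate held equal, the group with the higher base rate necessarily ends up with the higher false-positive rate, which is why both readings of the same tool can be arithmetically correct.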
Rigorous evidence that governing criminal-justice algorithms — mandating, auditing, or adopting risk tools — reduces the racial-disparity harm that motivates the rules is essentially absent. The leading real-world impact evaluation, Stevenson's (2018) study of Kentucky's mandatory pretrial risk-assessment law (>1M cases), found only a small increase in pretrial release that eroded as judges reverted to prior habits, with no reduction in racial disparities in pretrial detention. The closest analogue evaluations measure operational crime outcomes, not equity, and are largely null: Chicago's Strategic Subjects List had no effect on victimization (Saunders, Hunt & Hollywood 2016) and the only randomized predictive-policing trials tested crime reduction, not disparate impact (Mohler et al. 2015) — so the evidence that any governance regime measurably reduces algorithmic racial disparity is itself missing.
Sources: Stevenson 2018 (Minnesota Law Review 103:303); Saunders, Hunt & Hollywood 2016 (Journal of Experimental Criminology 12(3):347); Mohler et al. 2015 (JASA 110(512):1399)
Deepfakes / Synthetic Content
The flagship harm — non-consensual sexual deepfakes — is empirically real and sharply gendered: content audits find ~96-98% of deepfake videos online are non-consensual pornography overwhelmingly depicting women, and a pre-registered 10-country survey (>16,000 people) found 2.2% reporting victimization and 1.8% perpetration of synthetic intimate imagery, with documented mental-health, career, and participation harms. By contrast, the parallel claim that political/informational deepfakes UNIQUELY deceive is contested-to-refuted: experiments find deepfakes about as (not more) credible than equivalent text/audio fakes, and a 56-paper meta-analysis (k=137, N=86,155) puts unaided human detection near chance — implying a detection problem more than an exceptional-persuasion one.
Sources: Umbach, Henry, Beard & Berryessa 2024 (CHI '24, 'Non-Consensual Synthetic Intimate Imagery ... in 10 Countries'); Diel et al. 2024 (Computers in Human Behavior Reports 16:100538, deepfake-detection meta-analysis of 56 papers); Barari, Lucas & Munger 2025 (Journal of Politics 87(2), 'Political Deepfakes Are as Credible as Other Fake Media'); Flynn et al. 2022 (British Journal of Criminology, multi-country image-based sexual abuse study)
Direct impact evidence that deepfake governance reduces the targeted harm is sparse and, where it exists, discouraging: the one quasi-experimental evaluation (Cuevas & Horta Ribeiro 2025, synthetic-control across three platforms) found the U.S. TAKE IT DOWN Act's passage plus the MrDeepfakes shutdown did NOT suppress synthetic non-consensual imagery — posting rose above counterfactual baselines and displaced elsewhere. Technical enforcement is likewise unreliable: detectors fail to generalize to unseen generators (notably diffusion models) and are vulnerable to adversarial evasion, with in-the-wild accuracy well below benchmark figures. No rigorous evaluation yet shows a deepfake-specific law, takedown mandate, or watermarking scheme producing a sustained reduction in prevalence or harm.
Sources: Cuevas & Horta Ribeiro 2025 ('Deepfake Pornography is Resilient to Regulatory and Platform Shocks', arXiv:2602.02754); 'Adversarial Reality for Evading Deepfake Image Detectors' (ICCVW 2025); TAKE IT DOWN Act, S.146 / Pub. L. 119-12 (2025); CRS Legal Sidebar LSB11314
AI in Education
The documented harms of educational AI are empirically real and, for proctoring, replicated: a controlled audit of a proctoring tool used by at least ~1,500 institutions found significantly higher facial-detection failure (the trigger for 'suspicious' flags) for darker-skinned and female test-takers (Yoder-Himes et al. 2022), and a technical audit of 164 government-endorsed pandemic learning products found 89% engaged in data practices that risk or infringe children's rights, with most monitoring happening without the child's knowledge or consent (Human Rights Watch 2022). Honest caveat: the benefit side is genuine but highly sensitive to how outcomes are measured rather than uniform — Kulik & Fletcher's meta-analysis of 50 intelligent-tutoring evaluations found an overall median effect of 0.66 SD, but the average effect was 0.73 SD on locally-developed tests versus only 0.13 SD on standardized tests, so much of AI education's apparent value depends on the outcome measure used.
Sources: Yoder-Himes et al. 2022, 'Racial, skin tone, and sex disparities in automated proctoring software', Frontiers in Education 7:881449; Human Rights Watch 2022, 'How Dare They Peep into My Private Life?' (164 EdTech products endorsed by 49 governments; 89% risked/infringed children's rights); Kulik & Fletcher 2016, 'Effectiveness of Intelligent Tutoring Systems: A Meta-Analytic Review', Review of Educational Research 86(1):42-78
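To read those effect sizes on a more familiar scale, a standardized mean difference can be converted to the percentile of the control distribution that the average treated student would reach. A minimal sketch, assuming normally distributed outcomes with equal variance in both groups; that assumption belongs to this illustration, not to the cited meta-analysis, and the function name is hypothetical.

```python
from statistics import NormalDist

def percentile_of_average_treated(effect_size_sd):
    """Percentile of the control distribution reached by the average treated student."""
    return 100 * NormalDist().cdf(effect_size_sd)

for label, d in [("overall median", 0.66),
                 ("locally developed tests", 0.73),
                 ("standardized tests", 0.13)]:
    print(f"{label}: {percentile_of_average_treated(d):.0f}th percentile")
```

Under that assumption the average tutored student moves from the 50th percentile to roughly the 75th on the overall figure, but only to about the 55th when outcomes are measured with standardized tests.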
There are essentially no rigorous impact evaluations showing that purpose-built governance of educational AI reduces the documented harms. The student-specific regime — California's SOPIPA (SB 1177, 2014, a model that more than 20 states adopted and ~33 considered) and the FTC's May 2022 COPPA ed-tech policy statement (which the agency itself said did not change existing requirements) — has near-zero documented enforcement and no published before/after evaluation of whether it changed vendor data practices or bias outcomes. The only documented remedies came not from education-specific rules but from generic legal levers: a $6.25M biometric-privacy class settlement under Illinois BIPA (Veiga v. Respondus, 2023) and a constitutional ruling that proctoring room-scans are an unreasonable search (Ogletree v. Cleveland State University, N.D. Ohio 2022, Calabrese J.) — neither of which is a replicable evaluation, and both reach private/state actors rather than the underlying demographic-bias harm.
Sources: California SOPIPA (SB 1177, 2014); FTC Policy Statement on Education Technology and COPPA (adopted May 19, 2022); Veiga v. Respondus, Inc. ($6.25M BIPA class settlement, 2023; covers Illinois Respondus Monitor users Nov. 2015–June 2023); Ogletree v. Cleveland State University (N.D. Ohio 2022, Calabrese J., room-scan Fourth Amendment ruling)
AI in Employment
Discrimination and adverse outcomes in employment decisions are empirically well-established, and AI systems demonstrably reproduce them. The foundational field-experiment literature shows robust human baseline discrimination (Bertrand & Mullainathan 2004 found White-sounding names received 50% more callbacks), and AI-specific audits confirm the pattern: Amazon scrapped a recruiting tool that penalized resumes containing 'women's' (Dastin 2018), and a controlled resume-screening audit of language-model retrieval found systems favored White-associated names ~85% of the time and never preferred Black male-associated over White male-associated names (Wilson & Caliskan 2024). On the monitoring side, a meta-analysis (k=94, N≈23,461) found electronic performance monitoring reliably raises worker stress with no evidence of improved performance (Ravid et al. 2023). Honest caveat: measured disparities are highly model-, prompt-, and context-dependent, and most evidence comes from controlled audits and one firm's internal test rather than measured outcomes in live, at-scale hiring pipelines.
Sources: Bertrand & Mullainathan 2004 (American Economic Review 94(4):991-1013); Wilson & Caliskan 2024 (AAAI/ACM AIES; 'Gender, Race, and Intersectional Bias in Resume Screening via Language Model Retrieval'); Dastin 2018 (Reuters, 'Amazon scraps secret AI recruiting tool that showed bias against women'); Ravid, White, Tomczak & Behrend 2023 (Personnel Psychology 76:5-40)
There is no rigorous evidence that governing AI in employment reduces the documented harms; the central evaluated regime appears to fail at the compliance stage before any impact on bias can occur. NYC Local Law 144 — the first jurisdiction worldwide to mandate independent bias audits and public posting for automated employment decision tools — was directly studied across 391 employers and found to produce 'null compliance': the law's discretion makes it impossible to tell whether firms comply, with very few posting the required audits (Wright et al. 2024). Parallel qualitative work shows the audits themselves are undermined by missing demographic data, opaque aggregation, and 'test data' that does not reflect real use (Groves et al. 2024). No study links any AI-employment rule to a measured reduction in discriminatory hiring outcomes — the evidence that the rule works is itself missing, largely because mandated transparency artifacts (audit reports) are sparse, non-standardized, and unenforced.
Sources: Wright, Muenster, Vecchione, Metcalf, Matias et al. 2024 ('Null Compliance: NYC Local Law 144 and the Challenges of Algorithm Accountability', ACM FAccT '24); Groves, Metcalf, Kennedy, Vecchione & Strait 2024 ('Auditing Work: Exploring the New York City algorithmic bias audit regime', ACM FAccT '24); Ravid, White, Tomczak & Behrend 2023 (Personnel Psychology 76:5-40, on monitoring outcomes as the closest analogue evaluation evidence)
Environmental Impact of AI Training
The resource demands of AI compute are empirically documented at the model level: Strubell et al. (2019) quantified large-NLP training energy/carbon, Luccioni et al. (2023) estimated BLOOM's training at ~24.7 tCO2eq (dynamic power) rising to ~50.5 tCO2eq with manufacturing and deployment, Li et al. (2023) estimated GPT-3-scale training in US datacenters can evaporate on the order of hundreds of thousands of litres of freshwater (their central figure ~700,000 L), and Luccioni, Jernite & Strubell (2024) showed generative inference is markedly more energy-intensive per query than task-specific models; at the macro scale the IEA (2024) and de Vries (2023) document rapidly rising datacenter electricity demand. Honest caveat: absolute estimates vary by up to orders of magnitude with grid carbon intensity, hardware, utilisation and accounting boundaries, and cleanly attributing the AI-specific increment (versus general datacenter and crypto growth) remains genuinely contested — the IEA itself bundles AI with datacenters and crypto — so the existence of the footprint is established while its magnitude and trajectory are not.
Sources: Strubell, Ganesh & McCallum 2019 (ACL Anthology P19-1355; 'Energy and Policy Considerations for Deep Learning in NLP'); Luccioni, Viguier & Ligozat 2023 (JMLR 24; BLOOM 176B carbon footprint, 24.7/50.5 tCO2eq; arXiv:2211.02001); Li, Yang, Islam & Ren 2023 (arXiv:2304.03271, 'Making AI Less Thirsty', later Comm. ACM 2025); Luccioni, Jernite & Strubell 2024 (ACM FAccT '24, 'Power Hungry Processing', DOI 10.1145/3630106.3658542); de Vries 2023 (Joule 7(10):2191-2194, DOI 10.1016/j.joule.2023.09.004); IEA 2024 (Electricity 2024)
There is no impact evaluation showing that any AI-specific environmental-governance instrument reduces energy, water or carbon use, because every named instrument is voluntary or non-binding and very recent: EU AI Act Art. 95 codes of conduct are explicitly optional with no sanctions, and NIST AI 600-1 and the G7 Hiroshima Code are guidance, not enforceable caps. The closest analogue evaluation literature is divided in a way that disfavours the voluntary form chosen here: rigorous reviews find voluntary environmental programs generally fail to produce significant abatement beyond business-as-usual (Koehler 2007; Morgenstern & Pizer 2007), whereas the one form with credible positive evidence is mandatory disclosure (Downar et al. 2021 found a UK carbon-reporting mandate cut emissions ~8% versus a control group), which the AI instruments do not yet impose. The proposition that AI environmental governance works is therefore essentially untested.
Sources: EU AI Act Art. 95 / Recital 142 (Reg. (EU) 2024/1689); NIST AI 600-1 (2024, GenAI Profile); G7 Hiroshima Process International Code of Conduct (30 Oct 2023); Koehler 2007 (Policy Studies Journal 35(4):689-722); Morgenstern & Pizer (eds.) 2007 (Reality Check, RFF Press); Downar, Ernstberger, Reichelstein, Schwenen & Zaklan 2021 (Review of Accounting Studies 26(3):1137-1175)
AI in Healthcare
Both the benefit and the harm of clinical AI are empirically real and well-documented, but outcomes are highly deployment-dependent. Rigorous prospective studies show genuine clinical value in narrow tasks — the MASAI RCT (>100,000 women) found AI-supported mammography detected ~20% more cancers (6.1 vs 5.1 per 1000 screened) at comparable recall rates (Lang et al. 2023, Lancet Oncology), and IDx-DR's pivotal trial achieved 87.2% sensitivity / 90.7% specificity for diabetic retinopathy (Abramoff et al. 2018, npj Digital Medicine) — yet widely deployed models can fail or harm: the Epic Sepsis Model, live at hundreds of US hospitals, scored AUC 0.63 with 33% sensitivity on external validation (Wong et al. 2021, JAMA Internal Medicine), and a population-health algorithm covering ~200M people understated Black patients' illness because it predicted cost not need (Obermeyer et al. 2019, Science). Honest caveat: there is no single 'AI in healthcare' effect — performance ranges from life-saving to dangerous depending on task, calibration, and whether the model was prospectively validated.
Sources: Lang K, Josefsson V, Larsson A-M, et al. 2023 (Lancet Oncology 24(8):936-944, MASAI trial clinical safety analysis; AI-supported screening detected 6.1 vs 5.1 cancers per 1000, ~20% higher, similar recall rates); Abramoff MD, Lavin PT, Birch M, Shah N, Folk JC. 2018 (npj Digital Medicine 1:39, IDx-DR pivotal trial; 87.2% sensitivity / 90.7% specificity); Wong A, Otles E, Donnelly JP, et al. 2021 (JAMA Internal Medicine 181(8):1065-1070, Epic Sepsis Model external validation; AUC 0.63, 33% sensitivity); Obermeyer Z, Powers B, Vogeli C, Mullainathan S. 2019 (Science 366(6464):447-453, racial bias from cost-as-proxy)
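As a quick arithmetic check on the MASAI figures quoted above, the sketch below derives the relative and absolute differences from the two detection rates (6.1 vs 5.1 cancers per 1000 screened). The variable names are illustrative only, and the calculation uses the rounded rates as reported in this map, not the trial's underlying counts.

```python
# Detection rates per 1000 women screened, as quoted in the paragraph above.
AI_SUPPORTED_PER_1000 = 6.1
STANDARD_PER_1000 = 5.1

# Absolute difference: extra cancers detected per 1000 screened.
absolute_diff_per_1000 = AI_SUPPORTED_PER_1000 - STANDARD_PER_1000

# Relative increase, in percent, against the standard-screening rate.
relative_increase_pct = 100.0 * absolute_diff_per_1000 / STANDARD_PER_1000

print(f"Absolute difference: {absolute_diff_per_1000:.1f} per 1000 screened")
print(f"Relative increase:   {relative_increase_pct:.1f}%")
```

The relative figure rounds to roughly 20%, matching the "~20% more cancers" summary, while the absolute difference is about one additional detection per 1000 women screened.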
There is essentially no impact-evaluation evidence that the prevailing governance regime for medical AI (FDA authorization, predominantly via the 510(k) substantial-equivalence pathway) measurably reduces patient harm or improves outcomes. Analyses of authorized AI devices find that clinical validation is frequently absent or non-prospective (of 521 FDA-authorized AI devices, ~43% had no published clinical-validation data and only ~28% were prospectively validated; Chouffani El Fassi & Henderson et al. 2024) and that demographic performance is almost never reported (of 692 FDA-approved AI devices, only 3.6% reported race/ethnicity and only 9.0% carried a prospective post-market-surveillance study; Muralidharan et al. 2024). Earlier analysis of 130 cleared devices likewise found 97% were evaluated only retrospectively (Wu et al. 2021). The closest analogue evidence on the pathway itself is discouraging: the Institute of Medicine (2011) concluded the 510(k) process was not designed to assess safety and effectiveness. In short, no direct study establishes that the rule, as written, prevents the harms it targets. Caveat: this is an absence of impact evaluation plus reporting-gap and design-critique evidence, not a study showing the regime fails to reduce harm.
Sources: Chouffani El Fassi S, Abdullah A, Fang Y, ... Henderson GE, et al. 2024 (Nature Medicine, 'Not all AI health tools with regulatory authorization are clinically validated', s41591-024-03203-3; 521 devices, ~43% no clinical validation, ~28% prospectively validated); Muralidharan V, Adewale BA, Huang CJ, et al. 2024 (npj Digital Medicine 7:273, scoping review of reporting gaps in 692 FDA-approved AI medical devices; race/ethnicity 3.6%, prospective post-market surveillance 9.0%); Wu E, Wu K, Daneshjou R, Ouyang D, Ho DE, Zou J. 2021 (Nature Medicine 27:582-584, analysis of 130 FDA approvals; 97% retrospective-only evaluation); Institute of Medicine 2011 (Medical Devices and the Public's Health: The FDA 510(k) Clearance Process at 35 Years)
International Coordination
The DESCRIPTIVE premise is well-established: IR scholarship now treats global AI governance as a fragmented 'regime complex' of partially overlapping G7/G20/OECD/GPAI/UN/standards-body arrangements with no central hierarchy (Tallberg et al. 2023 — verified verbatim: 'the emerging governance architecture for AI can be described as a regime complex'; Cihon, Maas & Kemp 2020). But the implied HARM — that forum-shopping and regulatory arbitrage cause a measurable race-to-the-bottom or relocate AI development to lax jurisdictions — is largely theorized/anticipated rather than empirically demonstrated for AI; Tallberg et al. explicitly flag forum-shopping as a dynamic whose presence in the AI regime complex is an open empirical question ('Establishing whether these patterns and dynamics are key features also of the AI regime complex stand out as important priorities in future research'). Honest caveat: the strongest empirical arbitrage evidence comes from analogue footloose digital markets (e.g., ICO reallocation after US securities enforcement) — itself a mixed/contested literature — not from AI firms, so the magnitude of coordination-failure harm in AI specifically remains contested and under-measured.
Sources: Tallberg, Erman, Furendal, Geith, Klamberg & Lundgren 2023 (International Studies Review 25(3): viad040); Cihon, Maas & Kemp 2020 (Should AI Governance be Centralised?, AIES '20: 228-234); Lancieri, Edelson & Bechtold 2025 (AI Regulation: Competition, Arbitrage & Regulatory Capture, Theoretical Inquiries in Law 26(1): 239-262)
There are essentially no impact evaluations showing that the negotiated-coordination mode (AI Safety Institute network MoUs, forum-shifting, multilateral declarations) actually produces regulatory convergence or reduces arbitrage — the AISI Network began only as a statement of intent at the Seoul Summit (Seoul Statement of Intent, 21 May 2024) and held its first operational meeting in November 2024, with no defined metrics or outcome studies, so these soft-law instruments are too new to have measurable effects. The closest analogue evidence is mixed and works through DIFFERENT mechanisms than this topic describes: Bradford's Brussels Effect documents de-facto convergence driven by market access rather than negotiated coordination, and the FATF transgovernmental-network literature shows peer-review mutual evaluation can drive AML convergence — but neither evaluates voluntary AI MoU networks, and FATF's effects come with well-documented unintended consequences (de-risking, financial exclusion). The plain finding: the evidence that AI-governance coordination 'works' is itself missing.
Sources: Bradford 2020 (The Brussels Effect: How the European Union Rules the World, Oxford University Press); Nance 2018 (The regime that FATF built: an introduction to the Financial Action Task Force, Crime, Law and Social Change 69(2): 109-129; cf. Slaughter 2004, A New World Order, Princeton University Press); International Network of AI Safety Institutes — Seoul Statement of Intent toward International Cooperation on AI Safety Science (21 May 2024; network's first meeting San Francisco, Nov 2024)
National Security Carveouts in AI Regulation
That civilian AI-governance instruments carve out national-security uses is black-letter and undisputed (EU AIA Art. 2(3); CoE Framework Convention Art. 3(2) on national-security activities, distinct from Art. 3(4) on national defence; US NSM-25 (Oct. 2024) as the national-security-track instrument fulfilling §4.8 of EO 14110); civil-society legal analysis argues a blanket exclusion is harder to square with a necessity-and-proportionality approach than a qualified one (Korff/ECNL 2022; Vogiatzoglou 2024). But whether the carveout itself produces concrete unredressed harm is empirically under-observed almost by construction — the secrecy it confers suppresses the very evidence needed to measure it. The closest analogue, national-security deference in the courts, shows the mechanism is real (the FISC granted all but eleven of 33,900 applications 1979-2012, a 99.97% approval rate; Sinnar 2022 documents downstream harms to securitized communities), yet Clarke (2014) shows that lopsided ex parte approval rates alone do not prove rubber-stamping, because rational case selection and pre-vetting produce similar rates in ordinary Title III wiretaps (99.93%) and delayed-notice warrants (99.6-99.8%) — so the magnitude of harm attributable to the carveout, as opposed to the legitimate secrecy of the domain, remains genuinely contested.
Sources: Korff 2022 (ECNL Opinion on the implications of the exclusion of national security from AI legislation, Oct. 2022); Sinnar 2022 (Harvard Law Review Forum 136:59, 'A Label Covering a "Multitude of Sins": The Harm of National Security Deference'); Clarke 2014 (Stanford Law Review Online 66:125, 'Is the Foreign Intelligence Surveillance Court Really a Rubber Stamp?'); EPIC FISC statistics 1979-2012
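The approval-rate comparison above is simple arithmetic, sketched below using only the figures quoted in this map (33,900 applications, eleven not granted; Clarke's comparator rates for Title III wiretaps and delayed-notice warrants). The function and constant names are illustrative.

```python
def approval_rate_pct(applications: int, not_granted: int) -> float:
    """Share of applications granted, in percent."""
    return 100.0 * (applications - not_granted) / applications

# FISC figures for 1979-2012 as quoted above.
FISC_APPLICATIONS = 33_900
FISC_NOT_GRANTED = 11
fisc_rate = approval_rate_pct(FISC_APPLICATIONS, FISC_NOT_GRANTED)

# Comparator rates cited from Clarke (2014), in percent.
TITLE_III_RATE = 99.93
DELAYED_NOTICE_RANGE = (99.6, 99.8)

print(f"FISC approval rate: {fisc_rate:.2f}%")
print(f"Gap to Title III wiretaps: {fisc_rate - TITLE_III_RATE:.2f} points")
print(f"Gap to delayed-notice warrants: "
      f"{fisc_rate - DELAYED_NOTICE_RANGE[1]:.2f} to "
      f"{fisc_rate - DELAYED_NOTICE_RANGE[0]:.2f} points")
```

The gaps come out at well under half a percentage point, which is the basis for Clarke's point that the headline rate alone cannot distinguish rubber-stamping from pre-vetted case selection.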
There is no impact evaluation showing that any specific design of the national-security carveout — categorical exclusion versus parallel governance track versus civilian-compliance-with-override — measurably improves oversight or reduces harm relative to the alternatives; the question is argued doctrinally (Vogiatzoglou 2024; Korff/ECNL 2022) but has never been tested empirically. The closest analogue evaluation literature is on the parallel-track model already in use for intelligence surveillance (the FISC / FISA oversight regime), and even there the evidence that the mechanism delivers effective scrutiny is itself contested rather than established (Clarke 2014; Sinnar 2022). No direct evaluation exists because the carveouts are recent (EU AIA 2024, CoE Framework Convention 2024, US NSM-25 2024), enforcement actions are by design non-public, and private parties typically lack standing to challenge a specific exempt deployment — the structural features that make the harm hard to observe also make the governance impossible to evaluate.
Sources: Vogiatzoglou 2024 (Verfassungsblog, 'The AI Act National Security Exception: room for manoeuvres?', 9 Dec. 2024); Korff 2022 (ECNL Opinion, exclusion of national security from AI legislation); Clarke 2014 (Stanford Law Review Online 66:125); Sinnar 2022 (Harvard Law Review Forum 136:59)
Individual Redress
The premise behind redress — that affected people lack meaningful recourse against automated decisions — is real, but the flagship instrument is weaker than commonly assumed. Wachter, Mittelstadt & Floridi (2017) show GDPR creates only a limited 'right to be informed,' not a binding 'right to explanation' of specific decisions; and controlled work finds the explanations actually delivered do not measurably improve lay decision accuracy over showing the bare AI prediction (Alufaisan et al. 2021; and a 2022 meta-analysis by Schemmer et al. — screening 393 articles down to 9 in the final analysis — reports 'no effect of explanations on users' performance compared to sole AI predictions,' even though XAI overall had a positive effect). Honest caveat: the legitimacy/dignity value of being heard is empirically well established in the procedural-justice tradition even where outcome accuracy is unchanged, so 'redress fails' depends on which aim is measured.
Sources: Wachter, Mittelstadt & Floridi 2017 (International Data Privacy Law 7(2):76); Alufaisan, Marusich, Bakdash, Zhou & Kantarcioglu 2021 (Proceedings of the AAAI Conference on AI 35(8):6618); Schemmer, Hemmer, Nitsche, Kühl & Vössing 2022 (AAAI/ACM AIES '22, meta-analysis)
There is no rigorous impact evaluation showing that mandated redress mechanisms (right-to-explanation, appeal, human-in-the-loop review) actually reduce erroneous or unfair automated decisions — the evidence that the rule works is itself missing. The closest experimental analogues are discouraging: explanations increase humans' acceptance of AI recommendations regardless of correctness (Bansal et al. 2021), and algorithm-in-the-loop oversight can introduce racial disparities and exhibit automation bias rather than reliably catching model errors (Green & Chen 2019). The procedural-justice literature (Tyler 1990; Lind & Tyler 1988) robustly supports a legitimacy and compliance benefit of fair process, but it measures perceived fairness, not reduction of the substantive decision harm redress is meant to cure.
Sources: Bansal, Wu, Zhou, Fok, Nushi, Kamar, Ribeiro & Weld 2021 (CHI '21); Green & Chen 2019 (Disparate Interactions, ACM FAT* '19); Tyler 1990 (Why People Obey the Law, Yale Univ. Press); Lind & Tyler 1988 (The Social Psychology of Procedural Justice, Plenum Press)
Sovereign AI Doctrine
Sovereign-AI doctrine is post-2023 and largely aspirational, so its core empirical premise — that frontier model deployment can be meaningfully bound to a national jurisdiction — is only just beginning to be tested. What IS measurable is the underlying compute geography the doctrine reacts to: an audit of 775 non-U.S. data-center projects estimates U.S. companies operate ~48% of them when weighted by investment value (a proxy for compute capacity, and explicitly an initial public-data approximation), implying 'in-territory' hardware is frequently still subject to foreign corporate/legal control (Richardson et al. 2025). Honest caveat: there is no peer-reviewed evidence base establishing whether jurisdiction-bound frontier deployment is technically feasible at scale — the descriptive dependency (foreign operation of locally-sited hardware) is documented, but the doctrine's central feasibility claim is thin and early.
Sources: Richardson et al. 2025 (arXiv:2508.00932, 'How Sovereign Is Sovereign Compute? A Review of 775 Non-U.S. Data Centers'); Gupta, Walker & Reddie 2024 (arXiv:2411.14425, 'Whack-a-Chip: The Futility of Hardware-Centric Export Controls', UC Berkeley Risk & Security Lab)
There is no rigorous impact evaluation showing that sovereign-AI governance achieves its stated aim of secure, contained national AI capability. The two main levers have measurable but mostly adverse or contested evidence. Ex-ante simulations of data-localization mandates, the closest analogue, project GDP losses (EU GDP −0.4% under proposed/GDPR-style measures rising to −1.1% under economy-wide localization; Bauer, Lee-Makiyama, van der Marel & Verschelde 2014, ECIPE Occasional Paper No. 3/2014) yet quantify no realized sovereignty benefit. Chip export controls, the other main instrument, show contested efficacy: one cross-firm study finds no innovation harm to 30 leading semiconductor firms (Schumacher 2024, CSIS) while case evidence documents systematic circumvention via software/efficiency gains and chip exfiltration/smuggling (Gupta, Walker & Reddie 2024). No replicated study demonstrates that any sovereign-AI regime measurably delivers the jurisdictional control it asserts.
Sources: Bauer, Lee-Makiyama, van der Marel & Verschelde 2014 (ECIPE Occasional Paper No. 3/2014, 'The Costs of Data Localisation: Friendly Fire on Economic Recovery'); Schumacher 2024 (CSIS, 'Did U.S. Semiconductor Export Controls Harm Innovation?'); Gupta, Walker & Reddie 2024 (arXiv:2411.14425, 'Whack-a-Chip: The Futility of Hardware-Centric Export Controls')
Synthetic Content Provenance
The harm provenance targets is real but concentrated, and the technical premise that the mandated signal survives is itself empirically shaky. Synthetic-media harm is well documented in two domains: non-consensual intimate imagery (Ajder et al.'s 2019 Deeptrace audit found 96% of deepfake videos were pornographic and effectively 100% targeted women) and impersonation fraud (the Arup case, ~US$25.6M / HK$200M lost via a deepfake video call). The honest caveat is twofold: the feared broad political-misinformation harm is not yet demonstrated at scale, and CS work shows invisible watermarks are removable in practice (Jiang, Zhang & Gong 2023 show with WEvade that adversarial perturbations evade watermark-based detection; Zhao et al. 2024 prove that pixel-level invisible watermarks can be removed via regeneration attacks), so the provenance signal a rule would mandate is itself contested.
Sources: Ajder, Patrini, Cavalli & Cullen 2019 (Deeptrace, 'The State of Deepfakes: Landscape, Threats, and Impact'); Jiang, Zhang & Gong 2023 ('Evading Watermark based Detection of AI-Generated Content', ACM CCS 2023); Zhao et al. 2024 (NeurIPS, 'Invisible Image Watermarks Are Provably Removable Using Generative AI'); Arup deepfake fraud (CNN Business, 2024-05-16, US$25.6M)
There is no impact evaluation showing that mandated provenance/labeling reduces synthetic-media harm; the major mandates (China's GenAI labeling Measures, effective 2025-09-01; EU AIA Art. 50, machine-readable marking) are too new and unevaluated, and the delivery layer is leaky: the C2PA spec's own Security Considerations document the strip-and-repost threat, and platform audits report C2PA/Content-Credentials metadata is stripped by essentially all major social platforms on upload (consistent with Imatag's 2018 finding that ~80% of uploaded images lose metadata, only ~15% retaining it). The closest analogue evaluation literature — Pennycook, Bear, Collins & Rand (2020), the 'implied truth effect' — gives reason for caution rather than confidence: labeling only some content can make unlabeled false content seem more credible, so a partial-coverage provenance regime could backfire.
Sources: Pennycook, Bear, Collins & Rand 2020 (Management Science 66(11):4944-4957, 'The Implied Truth Effect'); China Measures for Labeling AI-Generated Synthetic Content (eff. 2025-09-01); EU AI Act Art. 50; Imatag 2018 metadata-stripping study (~80%); C2PA Security Considerations (spec.c2pa.org) on manifest removal
Technological Sovereignty
The structural fact that compute capacity is geographically concentrated is well-measured: Lehdonvirta, Wú & Hawkins (2024) find only ~33 countries host facilities with AI-accelerator hardware and roughly 24 have the capacity to train full-scale foundation models, the Stanford AI Index 2026 reports low-income countries collectively hold ~0.1% of global data-centre compute (the US hosting >10x any other nation), and Cottier et al. (2024) document amortized frontier-training cost rising 2.4x/year (95% CI 2.0-3.1x) toward $1B+ models by 2027. But this is a political-economy FRAME, not a documented harm, and the evidence on the topic's core contested claim, that the cost curve locks mid-sized economies OUT of capability, cuts both ways: a feasibility study of Brazil and Mexico (Malagon et al. 2025) estimates usable (non-frontier) 10-trillion-token sovereign models are fiscally viable at roughly $8-14M on H100 hardware, and DeepSeek-style efficiency gains (V3 trained for ~$5.5M, ~11x less compute than Llama 3 405B) show frontier-adjacent performance at a fraction of prior compute, so whether domestic frontier-tier capability is foreclosed for middle powers remains genuinely unsettled.
Sources: Lehdonvirta, Wú & Hawkins 2024 (Compute North vs. Compute South, Proceedings of the 2024 AAAI/ACM Conference on AI, Ethics & Society 7:828-838); Cottier, Rahman, Fattorini, Maslej & Owen 2024 (The Rising Costs of Training Frontier AI Models, arXiv:2405.21015); Stanford AI Index 2026 (Maslej et al., Stanford HAI); Malagon, Ulloa Ruiz, Sandoval Plaza, Rosario Bolívar, García Mesa & Alvarado Morales 2025 (The Feasibility of Training Sovereign Language Models in the Global South: A Study of Brazil and Mexico, arXiv:2510.19801)
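To make the growth-rate reasoning concrete, the sketch below shows how sensitive a "time to $1B" projection is to the 2.0-3.1x confidence interval quoted above. The starting cost of $100M is a purely hypothetical round number chosen for illustration; it is not taken from Cottier et al. or any other source in this map, and the constant-multiplier growth model is likewise a simplifying assumption.

```python
import math

def years_to_reach(start_cost_usd: float, target_cost_usd: float,
                   growth_per_year: float) -> float:
    """Years for a cost growing by a constant yearly multiplier to hit a target."""
    return math.log(target_cost_usd / start_cost_usd) / math.log(growth_per_year)

# HYPOTHETICAL starting point, for illustration only (not a sourced figure).
HYPOTHETICAL_START_USD = 100e6
TARGET_USD = 1e9

# Growth multipliers quoted above: central estimate and 95% CI bounds.
GROWTH_RATES = {"low": 2.0, "central": 2.4, "high": 3.1}

YEARS = {
    label: years_to_reach(HYPOTHETICAL_START_USD, TARGET_USD, rate)
    for label, rate in GROWTH_RATES.items()
}

for label, years in YEARS.items():
    print(f"{label:>7} ({GROWTH_RATES[label]}x/year): {years:.2f} years to $1B")
```

Under these assumptions the interval bounds shift the crossing date by more than a year, which is one reason the trajectory claim should be read as indicative, not precise.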
There is no rigorous impact evaluation showing that technological-sovereignty policies (on-shore compute mandates, national foundation-model champions and talent-retention schemes, with programs such as EuroHPC AI Factories and India's IndiaAI Mission as current examples) actually deliver sustained domestic capability or strategic autonomy; these programs are recent, utilization and cost-per-GPU-hour are largely unpublished, and no counterfactual study exists. The closest analogue evidence base, the industrial-policy literature synthesized by Juhász, Lane & Rodrik (2024), finds that properly-identified studies are more favorable than older correlational work suggested but that outcomes depend heavily on instrument design and structural context, and the older national-champion record warns of subsidized 'zombie' firms and government capture. The analogue evidence is therefore mixed, and the direct evidence that the sovereignty rule works is simply missing.
Sources: Juhász, Lane & Rodrik 2024 (The New Economics of Industrial Policy, Annual Review of Economics 16:213-242); Ahmed & Wahed 2020 (The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research, arXiv:2010.15581); IndiaAI Mission (Indian Cabinet, March 2024); EuroHPC Joint Undertaking AI Factories (2024 regulation amendment; no published impact evaluation)
Training-Data Rights
That foundation models ingest copyrighted and personal works without consent is undisputed; whether that ingestion produces legally cognizable reproduction harm is genuinely contested. The CS evidence that models can memorize and emit verbatim training text is robust and replicated — Carlini et al. (2021) extracted hundreds of verbatim sequences (including PII) from GPT-2, and follow-up work (Carlini et al., Quantifying Memorization, ICLR 2023) showed extraction scales log-linearly with model size and with example duplication. Honest caveat: verbatim reproduction is the exception, not the norm — the UK High Court held that Stable Diffusion's model weights never stored copies of the training images (defeating the secondary-infringement theory), and Getty abandoned its primary training-infringement claim at trial for lack of evidence, so whether the empirical phenomenon amounts to actionable harm (rather than transient, non-expressive use) remains the open question driving NYT v. OpenAI and parallel regimes.
Sources: Carlini, Tramèr, Wallace, Jagielski, Herbert-Voss, Lee, Roberts, Brown, Song, Erlingsson, Oprea & Raffel 2021 (Extracting Training Data from Large Language Models, 30th USENIX Security Symposium); Carlini, Ippolito, Jagielski, Lee, Tramèr & Zhang 2023 (Quantifying Memorization Across Neural Language Models, ICLR 2023; arXiv:2202.07646); Getty Images (US) Inc & ors v Stability AI Ltd [2025] EWHC 2863 (Ch) (UK High Court, 4 Nov 2025 — no secondary infringement; primary training claim abandoned at trial); The New York Times Co. v. Microsoft Corp. & OpenAI (S.D.N.Y., No. 1:23-cv-11195; consolidated In re OpenAI Copyright Infringement Litigation, Apr. 2025; ongoing 2025-2026)
There is no impact evaluation showing that the CDSM Directive Article 4 TDM exception plus its Article 4(3) opt-out reservation regime actually reduces unlicensed ingestion or channels compensation to rightsholders — the evidence that the rule works as designed is itself missing. The only available evidence is early case law and doctrinal scholarship, which document the mechanism's contested operation rather than its success: in Kneschke v. LAION the Hamburg Higher Regional Court (on appeal, 10 Dec 2025) held that a rights reservation in natural language did NOT satisfy Article 4(3)'s machine-readability requirement, invalidating the opt-out (note: the first-instance Regional Court had left the Article 4 question largely open and the case ultimately turned on the Article 3 scientific-research exception, so this machine-readability holding is appellate and not yet settled — a further appeal to the Federal Court of Justice was permitted). Legal scholars characterize the Article 4 opt-out as practically difficult and unharmonized, with no observed market in TDM licences or systematic enforcement to evaluate.
Sources: Kneschke v. LAION (Hamburg Regional Court, 27 Sept 2024, 310 O 227/23; on appeal Hamburg Higher Regional Court, 10 Dec 2025, 5 U 104/24 — opt-out held not machine-readable; further appeal to BGH permitted); Margoni & Kretschmer 2022 (A Deeper Look into the EU Text and Data Mining Exceptions, GRUR International 71(8):685-701); Quintais 2025 (Generative AI, Copyright and the AI Act, Computer Law & Security Review 56:106107)
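The distinction between a natural-language reservation and a machine-readable one can be illustrated with a minimal sketch. It assumes, purely for illustration, that the reservation is expressed through a robots.txt file and read with Python's standard `urllib.robotparser`; Article 4(3) does not name robots.txt, and nothing here describes the format at issue in Kneschke v. LAION. The crawler name `ExampleTDMBot` and the URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A reservation written as prose: a standard parser finds no directives in it.
NATURAL_LANGUAGE_NOTICE = """\
We reserve all rights against text and data mining of this site.
"""

# The same intent as parseable directives (ExampleTDMBot is a made-up crawler).
MACHINE_READABLE_NOTICE = """\
User-agent: ExampleTDMBot
Disallow: /
"""

def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what a robots.txt-respecting crawler would conclude for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

URL = "https://example.com/photos/image-001.jpg"  # hypothetical URL

prose_result = may_fetch(NATURAL_LANGUAGE_NOTICE, "ExampleTDMBot", URL)
directive_result = may_fetch(MACHINE_READABLE_NOTICE, "ExampleTDMBot", URL)

print("Prose notice        -> crawler may fetch:", prose_result)
print("Directive-based file -> crawler may fetch:", directive_result)
```

In this sketch the prose notice is invisible to the parser, so an automated crawler proceeds, while the directive-based file blocks it. Whether any particular format satisfies Article 4(3) is a legal question the code does not answer.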
Transparency Obligations
Documentation artifacts (model cards, datasheets) are well-specified as proposals and are genuinely adopted, but the empirical premise that mandated disclosure produces meaningful transparency is contested. Selbst & Barocas (2018) argue inscrutability and non-intuitiveness are distinct problems and that disclosing rules does not resolve the latter, and large-scale audits find documentation is sparsely and unevenly completed: a systematic analysis of 32,111 Hugging Face model cards (Liang et al. 2024) found environmental-impact, limitations and evaluation sections least often filled, and Bhat et al. (2023, 45 practitioners) found a substantial gap between the documentation proposal and actual practice. Honest caveat: the documentation frameworks themselves are real and adopted, so the dispute is about whether disclosure conveys decision-relevant information, not whether the artifacts exist.
Sources: Selbst & Barocas 2018 (Fordham Law Review 87:1085-1139); Liang et al. 2024 (Nature Machine Intelligence, s42256-024-00857-z, 'Systematic analysis of 32,111 AI model cards'); Bhat et al. 2023 (CHI '23, 'Aspirations and Practice of ML Model Documentation', DOI 10.1145/3544548.3581518); Mitchell et al. 2019 (FAccT, Model Cards for Model Reporting); Gebru et al. 2021 (CACM 64(12):86-92, Datasheets for Datasets)
There is no rigorous impact evaluation showing that AI transparency mandates (model cards, training-data summaries) measurably reduce bias, misuse or accidents — the central regulatory assumption is empirically untested, partly because flagship mandates like EU AI Act Art. 53(1)(d) GPAI training-data summaries are only subject to AI Office enforcement/verification from 2 August 2026 (the obligation itself began 2 August 2025 for new models). The closest analogue, mandated consumer disclosure, shows small and context-dependent effects: Bollinger, Leslie & Sorensen (2011) found mandatory calorie posting cut average calories per transaction by about 6%, while Loewenstein, Sunstein & Golman (2014) review evidence that disclosure effects are frequently diminished or even reversed by limited attention and often change provider rather than recipient behavior. These are analogues, not AI studies; no study demonstrates that AI transparency disclosure achieves its stated downstream safety aims.
Sources: Bollinger, Leslie & Sorensen 2011 (AEJ: Economic Policy 3(1):91-128); Loewenstein, Sunstein & Golman 2014 (Annual Review of Economics 6:391-419, 'Disclosure: Psychology Changes Everything'); EU AI Act Art. 53(1)(d) GPAI training-data summary (obligation from 2 Aug 2025; AI Office enforcement from 2 Aug 2026)