Living evidence map · scoping-review idiom
Directive (EU) 2024/2831 on improving working conditions in platform work
EU-PWD-2024 · EU
In force since 2024-12-01. A binding directive from the EU.
Coverage at a glance
Coverage fingerprint — color = verdict, height = confidence. One tick per tracked topic.
Scope and obligations
The EU Platform Work Directive ((EU) 2024/2831) was adopted on 23 October 2024, published in the Official Journal on 11 November 2024, and entered into force on 1 December 2024; Member States must transpose it into national law by 2 December 2026. It applies to digital labour platforms organising platform work performed in the Union, regardless of where the platform is established. Its two pillars are (1) a rebuttable legal presumption of an employment relationship, intended to determine the employment status of platform workers correctly, and (2) Chapter III rules on algorithmic management, which apply to all persons performing platform work, including those without an employment contract.
The algorithmic-management provisions:
- restrict processing of certain personal data (Art. 7 prohibits processing data on emotional or psychological state; private conversations, including those with worker representatives; biometric data used to establish identity by one-to-many comparison against a database, other than for authentication; and data used to infer protected characteristics or to predict the exercise of fundamental rights or trade-union activity);
- require a data protection impact assessment (Art. 8);
- mandate transparency and information to workers and their representatives about automated monitoring and decision-making systems (Art. 9);
- require human oversight, with competent staff able to override automated decisions, and a biennial impact evaluation (Art. 10);
- require human review and a right to explanation and contestation of significant decisions, including that decisions to restrict, suspend or terminate a person's account or contractual relationship may not be taken solely by automated decision-making systems (Art. 11).
The Directive is a labour/data-protection instrument; it is not a general AI law and does not address foundation models, frontier-model compute, or national-security topics.
Verification note: Chapter III article numbering (Art. 7 data processing, Art. 8 DPIA, Art. 9 transparency, Art. 10 human oversight, Art. 11 human review) was verified across the official Better Regulation document index, the consolidated EUR-Lex text, and analyses by CMS, LexisNexis, CXC, Freshfields and EU-OSHA. The EUR-Lex ELI permalink is the canonical official source and resolves (HTTP 202 anti-bot challenge), though its JS-rendered body could not be machine-extracted via fetch.
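The Art. 11 rule summarised above (account-level decisions may not be taken solely by automated systems) can be pictured as a gate in a platform's decision pipeline. The following is a minimal sketch under stated assumptions, not a compliance implementation: the `Action` values, the `Decision` record and the `may_execute` helper are all hypothetical names chosen for illustration and do not come from the Directive.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    ALLOCATE_TASK = "allocate_task"   # hypothetical example of a routine decision
    RESTRICT = "restrict"
    SUSPEND = "suspend"
    TERMINATE = "terminate"


# Actions that, per the Art. 11 summary in this article, may not be
# decided solely by an automated decision-making system.
HUMAN_DECISION_REQUIRED = {Action.RESTRICT, Action.SUSPEND, Action.TERMINATE}


@dataclass
class Decision:
    worker_id: str
    action: Action
    human_reviewer: Optional[str] = None  # hypothetical: ID of the person who took the decision


def may_execute(decision: Decision) -> bool:
    """Return True if the decision may take effect as recorded.

    Assumption for illustration: a non-empty `human_reviewer` means a
    human actually took or confirmed the decision.
    """
    if decision.action in HUMAN_DECISION_REQUIRED:
        return decision.human_reviewer is not None
    return True
```

In this sketch, `may_execute(Decision("w1", Action.SUSPEND))` is blocked until a reviewer is recorded, while a routine task allocation passes through. How a real system would evidence meaningful human involvement is outside what the sketch models.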
Directive (EU) 2024/2831 on improving working conditions in platform work addresses 4 contested AI-governance topics explicitly and 1 via general principles.
Topics governed
- governs · Biometric Identification: Directive (EU) 2024/2831, Article 7
  Article 7 (paraphrase): Article 7 prohibits digital labour platforms from processing biometric data of persons performing platform work to establish identity by one-to-many comparison against a database, while permitting one…
- governs · AI in Employment: Directive (EU) 2024/2831, Chapter III (esp. Arts. 7-11) and Chapter II (employment-status presumption)
  Article 10 (paraphrase): The Directive's core subject is AI in employment: it regulates automated monitoring and decision-making systems used to manage platform workers, requiring human oversight (Art. 10), human review of si…
- governs · Transparency Obligations: Directive (EU) 2024/2831, Article 9 (with Arts. 7-8)
  Article 9 (paraphrase): Article 9 requires digital labour platforms to inform persons performing platform work and their representatives about the use, categories, parameters and effects of automated monitoring systems and a…
- governs · Individual Redress: Directive (EU) 2024/2831, Article 11
  Article 11 (paraphrase): Article 11 gives platform workers a right to a written explanation of significant automated decisions and to human review and contestation, and provides that decisions to restrict, suspend or terminat…
- implicit · Agentic AI Governance: Directive (EU) 2024/2831, Articles 9-11
  Article 10 (paraphrase): Automated decision-making systems that autonomously allocate tasks, set pay, monitor and discipline platform workers function as agentic management tools; the Directive subjects them to operative tran…
Cross-jurisdiction comparison
How peer instruments treat the topics Directive (EU) 2024/2831 on improving working conditions in platform work governs.
| Topic | EU-AIA-2024 | US-EO-14110 | US-EO-14179 | UK-WHITEPAPER-2023 | CN-GENAI-2023 | G7-HIROSHIMA | OECD-AI-PRIN | COE-AI-CONV | UN-RES-2024 | NIST-AI-RMF | BLETCHLEY-2023 | SEOUL-2024 | NIST-AI-RMF-GENAI | CA-SB-1047 | IN-DPDP-2023 | BR-AIBILL-2024 | ASEAN-AI-GUIDE-2024 | AU-AI-STRATEGY-2024 | ANTHROPIC-RSP-2024° | OPENAI-PREPAREDNESS-2023° | DEEPMIND-FSF-2024° | META-FRONTIER-2024° | UK-US-AISI-MOU-2024 | WH-VOLUNTARY-2023 | SG-MODEL-AI-2024 | JP-METI-AI-2024 | EU-GDPR-2016 | EU-GPAI-COP-2025 | OMB-M-24-10 | GSA-AI-GUIDE-2024 | DOD-RAI-2022 | FEDRAMP-AI-2024 | DFARS-252-204 | CA-SB-53 | CA-SB-243 | CA-SB-942 | EU-PLD-2024 | UNESCO-AI-ETHICS-2021 | CN-DEEPSYN-2022 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Biometric Identification | governs | implicit | silent | implicit | silent | silent | silent | implicit | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | governs | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | implicit | governs |
| AI in Employment | governs | implicit | silent | implicit | silent | silent | silent | implicit | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | silent | implicit | silent | silent | silent | silent | silent | silent | silent | silent | governs | silent |
| Transparency Obligations | governs | implicit | silent | implicit | conflicts | governs | governs | governs | implicit | governs | implicit | governs | governs | implicit | implicit | governs | governs | silent | governs | implicit | implicit | governs | implicit | governs | governs | governs | governs | governs | governs | governs | governs | governs | silent | governs | governs | governs | implicit | governs | governs |
| Individual Redress | governs | silent | silent | implicit | governs | silent | governs | governs | silent | implicit | silent | silent | implicit | implicit | governs | governs | silent | silent | silent | silent | silent | silent | silent | silent | implicit | implicit | governs | silent | governs | implicit | implicit | implicit | silent | implicit | governs | silent | governs | governs | governs |
°= industry self-imposed voluntary framework. Comparing a voluntary code's "governs" tint with a binding regulation's "governs" tint flattens the legal-force distinction; use the instrument-page banner for the operative status of each.
Evidence & methods — how this article was reviewed
Source appraisal — 74 sources across 6 types
| Source type | Authority | Count |
|---|---|---|
| Peer-reviewed (✦ 48 AI) | Primary / peer-reviewed | 49 |
| Preprint (✦ 6 AI) | Institutional | 20 |
| Research institute (✦ 2 AI) | Institutional | 2 |
| Working paper (✦ 1 AI) | Institutional | 1 |
| Civil society (✦ 1 AI) | Contextual | 1 |
| Think tank (✦ 1 AI) | Contextual | 1 |
Authority is an editorial classification by source type — not a quality score for any individual work, and not external peer review. ✦ AI-generated summaries are labelled, never dropped.
Review methods
- Review question: How does Directive (EU) 2024/2831 on improving working conditions in platform work govern AI across the tracked governance topics, and what cited evidence supports each classification?
- Review model: Living evidence mapping (scoping-review idiom), continuously updated and source-grounded. Not a registered systematic review and not externally peer-reviewed.
- Updated through: 2026-06-22
- Source base: Primary legal/regulatory and standards sources; peer-reviewed and preprint academic literature (via DOI/arXiv); institutional and civil-society reports. Source types are classified in the source-appraisal table on this page.
- Search & selection: Sources are identified by continuous monitoring of the primary regulators and standards bodies in the catalog, plus a literature sweep over open scholarly indexes (arXiv, Crossref) seeded from the core papers and extended by citation snowballing, refreshed to the "Updated through" date. Candidates are screened for topical relevance and source verifiability; items with broken or unverifiable links, or that do not support the claim they are attached to, are excluded. No registered protocol or PRISMA flow diagram is maintained: this is a living, continuously updated evidence map, not a one-time, date-bounded screened review.
- Provenance (this article): This article charts 74 literature sources drawn from Policy Window's continuously screened literature corpus (the full corpus is at /wiki/literature). Each was relevance-tagged to the article's topics and verifiability-checked at intake, under the same exclusion rules as above. Coverage is charted per instrument×topic cell, each verdict anchored to a named provision. A one-time identified→screened→excluded tally is not maintained: this is a living map, refreshed to the "Updated through" date, not a date-bounded one-pass screen.
- Inclusion: A claim is included only when it traces to a cited primary or published source; coverage classifications are anchored to a named provision or document.
- Exclusion: Unsourced assertions, broken or unverifiable links, and sources that do not support the claim they are attached to are excluded.
- Appraisal: Sources are classified by source-type authority (see the source-appraisal table). This is structured editorial self-classification, not external peer review.
- Synthesis: Descriptive mapping of the instrument's coverage across topics, plus its cited literature base.
- Limitations: English-language and editorial-capacity coverage asymmetries; reliance on official sources for legal status; where AI-drafted or AI-assisted prose is present it is labelled inline with its drafting provenance and reviewer (charter §7.9/§7.10). This is not externally peer-reviewed scholarship.
- Funding & competing interests: No external funding; produced by Policy Window editorial. No competing interests declared. AI-assisted drafting, where present, is disclosed per charter §7.9/§7.10.
How to cite this article
Persistent identifier: https://policywindow.org/wiki/eu-platform-work-directive (a committed-stable URL with content-versioning via ?asOf=; rollout pending per methodology §7). DOIs via Zenodo are on the roadmap.
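A version-pinned citation link can be assembled by adding the `asOf` query parameter to the persistent identifier. The sketch below is illustrative only: the page states that the `?asOf=` rollout is still pending, and the value format used here (an ISO date) is an assumption, as is the helper name `versioned_url`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE = "https://policywindow.org/wiki/eu-platform-work-directive"


def versioned_url(base: str, as_of: str) -> str:
    """Return `base` with its asOf query parameter set to `as_of`.

    Assumption: the asOf value is an ISO date string (YYYY-MM-DD);
    the article does not specify the accepted format.
    """
    parts = urlsplit(base)
    query = dict(parse_qsl(parts.query))
    query["asOf"] = as_of  # overwrite any existing asOf value
    return urlunsplit(parts._replace(query=urlencode(query)))


print(versioned_url(BASE, "2026-06-22"))
```

Pinning the citation to the "Updated through" date is one plausible convention; whether the site resolves such links depends on the pending rollout.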
Evidence base
74 academic & grey-literature sources on the topics this instrument addresses (not commentary on the instrument itself) — catalogued metadata with a primary link; one-line findings are ✦ AI-generated summaries, labeled as such (charter §7.9). Browse the full literature index.
- Facial recognition technology in law enforcement: a scoping review of existing empirical studies · Peer-reviewed · ✦ AI: Scoping review mapping the empirical evidence base on law-enforcement FRT, identifying gaps in research on real-world identification use and its governance.
- Governing AI Agents · Preprint · ✦ AI: Uses "agency law and theory to identify and characterize problems arising from AI agents" and proposes governance infrastructure built on inclusivity, visibility, and liability.
- Infrastructure for AI Agents · Peer-reviewed · ✦ AI: Proposes "agent infrastructure": external technical systems for attributing actions "to specific agents, their users, or other actors," shaping interactions, and remediating harms.
- Multi-Agent Risks from Advanced AI · Research institute · ✦ AI: Identifies three failure modes of advanced multi-agent systems ("miscoordination, conflict, and collusion") plus seven risk factors, posing challenges distinct from single-agent AI.
- Global perspectives on regulating facial recognition technology utilization for criminal justice arrests · Peer-reviewed · ✦ AI: Comparative study of facial-recognition regulation for arrests across democracies finds frameworks are inconsistent and unclear, raising privacy and civil-liberties risks.
- Identifying Algorithmic Decision Subjects' Needs for Meaningful Contestability · Peer-reviewed · ✦ AI: Empirically elicits what decision subjects need for contestation to be 'meaningful', informing the design of effective remedies and appeal mechanisms for ADM.
- Two Means to an End Goal: Connecting Explainability and Contestability in the Regulation of Public Sector AI · Preprint · ✦ AI: Interview study with 14 regulation experts distinguishes judicial vs non-judicial and individual vs collective contestation channels for public-sector AI remedies.
- Fair Work for Platform Workers: Lessons from the EU Directive and Beyond · Peer-reviewed · ✦ AI: Analyzes the 2024 EU Platform Work Directive through Fairwork evidence, assessing its employment-status and algorithmic-management provisions and charting a path toward a proposed ILO platform-work Convention.
- Algorithm-facilitated discrimination: a socio-legal study of the use by employers of artificial intelligence hiring systems · Peer-reviewed · ✦ AI: Empirical socio-legal study of employer AI hiring systems showing how design and deployment choices generate discrimination that current anti-discrimination law struggles to reach.
- Authenticated Delegation and Authorized AI Agents · Preprint · ✦ AI: Introduces a framework for authenticated, authorized, and auditable delegation to AI agents by extending OAuth 2.0/OpenID Connect, maintaining accountability chains for agent actions.
- AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents · Peer-reviewed · ✦ AI: Provides a 440-task benchmark across 11 harm categories measuring whether LLM agents resist or comply with harmful multi-step tool-use tasks, grounding safety-evaluation regimes for agents.
- Better together? Human oversight as means to achieve fairness in the European AI Act governance · Peer-reviewed · ✦ AI: Examines whether Article-14 human oversight of high-risk/autonomous AI can actually deliver fairness, probing the limits of human-in-the-loop as a governance mechanism.
+ 62 more across this instrument's topics; see the literature index.
References
- Directive (EU) 2024/2831 of the European Parliament and of the Council of 23 October 2024 on improving working conditions in platform work, OJ L, 2024/2831, 11.11.2024
- Directive (EU) 2024/2831, Article 7
- Directive (EU) 2024/2831, Chapter III (esp. Arts. 7-11) and Chapter II (employment-status presumption)
- Directive (EU) 2024/2831, Article 9 (with Arts. 7-8)
- Directive (EU) 2024/2831, Article 11
- Directive (EU) 2024/2831, Articles 9-11
Does this instrument’s approach work? — the social-science evidence
Aggregated over the 5 topics this instrument addresses (4 explicitly, 1 implicitly): whether each harm is empirically real, and whether the peer-reviewed evidence shows governance reduces it. The badge is the epistemic status of the evidence; “thin” or “absent” efficacy evidence is itself a finding (the “second silence”). Each epistemic-status label is Policy Window's editorial assessment of the cited evidence base (a structured classification), not a verdict any single source issues.
Of the 5 governed topics with a social-science evidence review, evidence that governance reduces the harm is established for 0, contested for 0, thin for 1, and absent for 4 — for most, no replicated study yet shows this instrument's approach works (the "second silence").
Agentic AI Governance
The capability that agentic governance targets, autonomous multi-step action, is real and advancing rapidly and measurably: METR finds that the task length AI agents complete at 50% reliability has doubled roughly every seven months for the past six years (reaching about 50 minutes for frontier 2025 models), and the UK AI Security Institute's first Frontier AI Trends Report (Dec 2025, >30 systems) reports that models now finish hour-long software tasks >40% of the time, versus <5% in late 2023. The distinct realized harm from agency (as opposed to the underlying model) is, however, thinly documented: on consequential real-world tasks, agents still fail the majority of the time. Gemini 2.5 Pro completed only 30.3% of TheAgentCompany's 175 professional tasks (OpenHands scaffold, project leaderboard), so the agency-specific harm magnitude is early and context-dependent rather than established at scale.
Sources: Kwa, West, Becker et al. 2025 (METR; arXiv:2503.14499, 'Measuring AI Ability to Complete Long Tasks'); UK AI Security Institute 2025 (Frontier AI Trends Report, Dec 2025); Xu, Song, Zhou et al. 2024 (TheAgentCompany, arXiv:2412.14161); 30.3% figure per TheAgentCompany leaderboard (OpenHands)
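The doubling claim above reduces to compound-growth arithmetic. A minimal sketch, assuming a clean exponential with a fixed seven-month doubling time (a simplification of the reported trend; the function and constant names are illustrative):

```python
DOUBLING_MONTHS = 7.0  # doubling time quoted in the paragraph above


def growth_factor(months: float) -> float:
    """Multiplicative growth in task length after `months`,
    assuming a constant doubling time (a simplifying assumption)."""
    return 2.0 ** (months / DOUBLING_MONTHS)


six_years = growth_factor(6 * 12)
print(f"Doublings in six years: {6 * 12 / DOUBLING_MONTHS:.1f}")
print(f"Growth factor over six years: {six_years:,.0f}x")
```

Under that assumption, six years is a little over ten doublings, or a growth factor on the order of 1,250. A 50-minute horizon today would then correspond to a horizon of a few seconds six years earlier. This is arithmetic on the quoted trend, not a forecast.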
There is no impact-evaluation evidence that agent-specific governance reduces agentic harm: the operative regimes — the EU GPAI Code of Practice (published July 2025, voluntary/non-binding), the Seoul Frontier AI Safety Commitments (2024, voluntary), and AISI agent evaluations — are 2024-25 vintage and have never been measured against an outcome. The scholarship itself has not settled the contested unit of regulation: Kolt (2025) argues for governing the agentic relationship via principal-agent and agency-law tools, while Chan, Ezell, Kaufmann et al. (2024) propose agent-specific visibility mechanisms (identifiers, real-time monitoring, activity logging) that remain proposal-stage and unevaluated — meaning the field has design proposals but, as with most frontier-AI rules, the evidence that any of them works is absent rather than merely thin.
Sources: Kolt 2025 ('Governing AI Agents', 101 Notre Dame L. Rev., forthcoming; arXiv:2501.07913); Chan, Ezell, Kaufmann et al. 2024 ('Visibility into AI Agents', ACM FAccT 2024, pp. 958-973; DOI 10.1145/3630106.3658948); EU AI Office 2025 (GPAI Code of Practice, July 2025); Seoul Frontier AI Safety Commitments 2024
Biometric Identification
Demographic accuracy disparities in facial recognition are robust and replicated. NIST's Face Recognition Vendor Test (189 algorithms, 18.27M images) found one-to-one false-positive rates for Asian and African-American faces elevated 10-100x over white males, with the highest one-to-many false positives for African-American women; Buolamwini & Gebru's Gender Shades found commercial gender-classification error up to 34.7% for darker-skinned women vs 0.8% for lighter-skinned men. Documented downstream harm includes at least 8-15 US wrongful arrests, nearly all of Black people. Honest caveat: magnitude is highly algorithm-dependent — the most accurate algorithms show small or statistically undetectable differentials — so the harm is real but not uniform across systems.
Sources: Grother, Ngan & Hanaoka 2019 (NISTIR 8280, FRVT Part 3: Demographic Effects); Buolamwini & Gebru 2018 (Gender Shades, PMLR 81); Hill 2020 / Williams v. City of Detroit (ACLU 2021)
Rigorous evidence that GOVERNANCE of biometric ID reduces the documented harms is sparse. The one quantitative impact evaluation of police facial-recognition policy (Johnson et al. 2024, difference-in-differences across 268 US cities) studies effects on violent crime — a crime-control outcome, not misidentification harm — from a single research group, and does not establish that any safeguard regime curbs wrongful identification. Direct evidence on procedural safeguards points the other way: in the known wrongful-arrest cases police are reported to have bypassed required corroboration/probable-cause standards, and the strongest documented enforcement levers are private-sector biometric-privacy laws — Illinois BIPA (e.g. Meta's $650M settlement) and the separate Texas CUBI law (a $1.4B Meta settlement) — which govern private actors, not the law-enforcement context where the arrests occur. No replicated study shows a specific regulatory regime measurably reduces demographic misidentification harm.
Sources: Johnson et al. 2024 (Cities, 'Police facial recognition applications and violent crime control in U.S. cities'); Harwell & Schaffer 2025 (Washington Post, 'Arrested by AI'); Illinois BIPA (Rosenbach v. Six Flags 2019; Meta $650M settlement 2021); Texas CUBI (Meta $1.4B settlement 2024)
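For readers unfamiliar with the design, a difference-in-differences estimate compares the change in an outcome for units that adopted a policy with the change for units that did not. The sketch below uses hypothetical numbers that are not taken from Johnson et al. 2024, and the function name is illustrative. It shows why the choice of outcome variable matters: an estimate computed on violent-crime rates says nothing about misidentification.

```python
def did_estimate(treated_pre: float, treated_post: float,
                 control_pre: float, control_post: float) -> float:
    """Two-group, two-period difference-in-differences:
    (change in treated group) minus (change in control group)."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical violent-crime rates per 100,000 residents (illustrative only).
effect = did_estimate(treated_pre=100.0, treated_post=90.0,
                      control_pre=100.0, control_post=96.0)
print(effect)  # -6.0: treated cities fell 6 points more than controls
```

An analogous estimate for wrongful identifications would need misidentification counts as the outcome, which is the evidence the paragraph above reports as missing.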
AI in Employment
Discrimination and adverse outcomes in employment decisions are empirically well-established, and AI systems demonstrably reproduce them. The foundational field-experiment literature shows robust human baseline discrimination (Bertrand & Mullainathan 2004 found White-sounding names received 50% more callbacks), and AI-specific audits confirm the pattern: Amazon scrapped a recruiting tool that penalized resumes containing 'women's' (Dastin 2018), and a controlled resume-screening audit of language-model retrieval found systems favored White-associated names ~85% of the time and never preferred Black male-associated over White male-associated names (Wilson & Caliskan 2024). On the monitoring side, a meta-analysis (k=94, N≈23,461) found electronic performance monitoring reliably raises worker stress with no evidence of improved performance (Ravid et al. 2023). Honest caveat: measured disparities are highly model-, prompt-, and context-dependent, and most evidence comes from controlled audits and one firm's internal test rather than measured outcomes in live, at-scale hiring pipelines.
Sources: Bertrand & Mullainathan 2004 (American Economic Review 94(4):991-1013); Wilson & Caliskan 2024 (AAAI/ACM AIES; 'Gender, Race, and Intersectional Bias in Resume Screening via Language Model Retrieval'); Dastin 2018 (Reuters, 'Amazon scraps secret AI recruiting tool that showed bias against women'); Ravid, White, Tomczak & Behrend 2023 (Personnel Psychology 76:5-40)
There is no rigorous evidence that governing AI in employment reduces the documented harms; the central evaluated regime appears to fail at the compliance stage before any impact on bias can occur. NYC Local Law 144, which made New York City the first jurisdiction worldwide to mandate independent bias audits and public posting for automated employment decision tools, was directly studied across 391 employers and found to produce 'null compliance': the law's discretion makes it impossible to tell whether firms comply, with very few posting the required audits (Wright et al. 2024). Parallel qualitative work shows the audits themselves are undermined by missing demographic data, opaque aggregation, and 'test data' that does not reflect real use (Groves et al. 2024). No study links any AI-employment rule to a measured reduction in discriminatory hiring outcomes. The evidence that the rule works is itself missing, largely because mandated transparency artifacts (audit reports) are sparse, non-standardized, and unenforced.
Sources: Wright, Muenster, Vecchione, Metcalf, Matias et al. 2024 ('Null Compliance: NYC Local Law 144 and the Challenges of Algorithm Accountability', ACM FAccT '24); Groves, Metcalf, Kennedy, Vecchione & Strait 2024 ('Auditing Work: Exploring the New York City algorithmic bias audit regime', ACM FAccT '24); Ravid, White, Tomczak & Behrend 2023 (Personnel Psychology 76:5-40, on monitoring outcomes as the closest analogue evaluation evidence)
Individual Redress
The premise behind redress — that affected people lack meaningful recourse against automated decisions — is real, but the flagship instrument is weaker than commonly assumed. Wachter, Mittelstadt & Floridi (2017) show GDPR creates only a limited 'right to be informed,' not a binding 'right to explanation' of specific decisions; and controlled work finds the explanations actually delivered do not measurably improve lay decision accuracy over showing the bare AI prediction (Alufaisan et al. 2021; and a 2022 meta-analysis by Schemmer et al. — screening 393 articles down to 9 in the final analysis — reports 'no effect of explanations on users' performance compared to sole AI predictions,' even though XAI overall had a positive effect). Honest caveat: the legitimacy/dignity value of being heard is empirically well established in the procedural-justice tradition even where outcome accuracy is unchanged, so 'redress fails' depends on which aim is measured.
Sources: Wachter, Mittelstadt & Floridi 2017 (International Data Privacy Law 7(2):76); Alufaisan, Marusich, Bakdash, Zhou & Kantarcioglu 2021 (Proceedings of the AAAI Conference on AI 35(8):6618); Schemmer, Hemmer, Nitsche, Kühl & Vössing 2022 (AAAI/ACM AIES '22, meta-analysis)
There is no rigorous impact evaluation showing that mandated redress mechanisms (right-to-explanation, appeal, human-in-the-loop review) actually reduce erroneous or unfair automated decisions — the evidence that the rule works is itself missing. The closest experimental analogues are discouraging: explanations increase humans' acceptance of AI recommendations regardless of correctness (Bansal et al. 2021), and algorithm-in-the-loop oversight can introduce racial disparities and exhibit automation bias rather than reliably catching model errors (Green & Chen 2019). The procedural-justice literature (Tyler 1990; Lind & Tyler 1988) robustly supports a legitimacy and compliance benefit of fair process, but it measures perceived fairness, not reduction of the substantive decision harm redress is meant to cure.
Sources: Bansal, Wu, Zhou, Fok, Nushi, Kamar, Ribeiro & Weld 2021 (CHI '21); Green & Chen 2019 (Disparate Interactions, ACM FAT* '19); Tyler 1990 (Why People Obey the Law, Yale Univ. Press); Lind & Tyler 1988 (The Social Psychology of Procedural Justice, Plenum Press)
Transparency Obligations
Documentation artifacts (model cards, datasheets) are well-specified as proposals and are genuinely adopted, but the empirical premise that mandated disclosure produces meaningful transparency is contested. Selbst & Barocas (2018) argue inscrutability and non-intuitiveness are distinct problems and that disclosing rules does not resolve the latter, and large-scale audits find documentation is sparsely and unevenly completed: a systematic analysis of 32,111 Hugging Face model cards (Liang et al. 2024) found environmental-impact, limitations and evaluation sections least often filled, and Bhat et al. (2023, 45 practitioners) found a substantial gap between the documentation proposal and actual practice. Honest caveat: the documentation frameworks themselves are real and adopted, so the dispute is about whether disclosure conveys decision-relevant information, not whether the artifacts exist.
Sources: Selbst & Barocas 2018 (Fordham Law Review 87:1085-1139); Liang et al. 2024 (Nature Machine Intelligence, s42256-024-00857-z, 'Systematic analysis of 32,111 AI model cards'); Bhat et al. 2023 (CHI '23, 'Aspirations and Practice of ML Model Documentation', DOI 10.1145/3544548.3581518); Mitchell et al. 2019 (FAccT, Model Cards for Model Reporting); Gebru et al. 2021 (CACM 64(12):86-92, Datasheets for Datasets)
There is no rigorous impact evaluation showing that AI transparency mandates (model cards, training-data summaries) measurably reduce bias, misuse or accidents — the central regulatory assumption is empirically untested, partly because flagship mandates like EU AI Act Art. 53(1)(d) GPAI training-data summaries are only subject to AI Office enforcement/verification from 2 August 2026 (the obligation itself began 2 August 2025 for new models). The closest analogue, mandated consumer disclosure, shows small and context-dependent effects: Bollinger, Leslie & Sorensen (2011) found mandatory calorie posting cut average calories per transaction by about 6%, while Loewenstein, Sunstein & Golman (2014) review evidence that disclosure effects are frequently diminished or even reversed by limited attention and often change provider rather than recipient behavior. These are analogues, not AI studies; no study demonstrates that AI transparency disclosure achieves its stated downstream safety aims.
Sources: Bollinger, Leslie & Sorensen 2011 (AEJ: Economic Policy 3(1):91-128); Loewenstein, Sunstein & Golman 2014 (Annual Review of Economics 6:391-419, 'Disclosure: Psychology Changes Everything'); EU AI Act Art. 53(1)(d) GPAI training-data summary (obligation from 2 Aug 2025; AI Office enforcement from 2 Aug 2026)