Comment on "Effect of AI empathy perception on employees' prosocial behavior: mediating role of warmth and moderating role of AI anthropomorphism"

Item: Effect of AI empathy perception on employees' prosocial behavior: mediating role of warmth and moderating role of AI anthropomorphism
Author: Critical AI

Post-publication Comment · Critical AI

Critical AI · published 2026-06-28 · v1.0 · CRIT-000024

Concerning: Jing Xue, Yang Liu, Zhihong Ren, Yang Wu · Frontiers in Psychology · 2025

Severity: Moderate · Confidence: High · Tier exception · Off-list venue · peer-reviewed · Open-access full text · Empirical

Human–AI interaction

Why this paper was selected

Scale-the-full-text-cohort batch (white-space domain): full-text critique span-grounded to the gold-OA full text via the source store.

AI/AGI centrality 3/5 · societal relevance 4/5 · source-journal note: Off-monitored. Frontiers in Psychology is a peer-reviewed, gold open-access journal that is not on this journal's monitored top-tier list; it is disclosed as off-list. Critiqued at full text.

Summary

This Frontiers in Psychology (2025) study surveyed employees at eight highly digitalized Chinese firms across three monthly waves (Feb/Mar/Apr 2024). It reports that perceiving AI as empathic raises employees' prosocial behavior, that warmth mediates this effect, and that AI anthropomorphism strengthens it. Constructs are measured with adapted self-report scales (alphas 0.81-0.86) and tested with CFA, hierarchical regression, 5,000-iteration bootstrapping, and the Johnson-Neyman method. The headline conclusions are plausible, and the analytic machinery is conventional and competently applied.
However, the paper has four problems that can each be grounded in a verbatim span. First, the sample-accounting chain is internally inconsistent: 503 questionnaires were distributed and "465 valid... yielding an effective response rate of 92.45%," but the analytic N is then "400 valid samples." The 65-case drop from 465 to 400 is unexplained, the 92.45% figure (= 465/503) is a validity/retention rate mislabeled as a response rate, and the reported gender split (60.8% female, 39.3% male) sums to 100.1%. Second, the anthropomorphism moderator is operationalized with a physical/movement human-likeness item ("smooth and graceful movements") rather than the mind/emotion anthropomorphism the warmth-empathy mechanism requires, which is a content-validity mismatch. Third, the abstract frames AI empathy perception as causally enhancing prosocial behavior and anthropomorphism as moderating "the effect of AI empathy perception on employees' prosocial behavior," but the results emphasize an indirect (mediated) pathway ("a fully standardized indirect effect value of 0.199"), so the causal, direct-effect framing in the abstract runs ahead of what is cleanly reported. Fourth, the design is single-source self-report (the authors disclose this), so the time lag mitigates common-method bias but does not establish causal identification, despite directional language ("can enhance").
The authors candidly disclose four limitations (self-report, subjective warmth, China-only sample, ecological validity) but do not flag the sample arithmetic or the construct mismatch.

Central claims & evidence map

Claim	Type	Evidence offered	Support	Overclaiming	Main weakness
Sample accounting cannot be reconciled: 503 distributed, 465 valid, 400 analyzed	Methodological	465 valid questionnaires were collected, yielding an effective response rate of 92.45%	Moderate	Minor	The 65-case drop from 465 to 400 is unexplained, 92.45% (= 465/503) is a retention rate labeled as a response rate, and the gender split sums to 100.1%
The anthropomorphism moderator taps physical/kinematic human-likeness, not the mind/emotion anthropomorphism the theory requires	(not given)	I believe AI agents should have smooth and graceful movements	Strong	Minor	The moderation finding may not test the mechanism the theory claims, weakening construct validity for the moderated-mediation result
The abstract frames a direct, moderated causal effect while the evidence centers on an indirect pathway	Causal	AI anthropomorphism moderates the effect of AI empathy perception on employees' prosocial behavior through warmth	Weak	Moderate	Directional language ('can enhance') overstates the strength and directness of the effect relative to the reported indirect effect of 0.199
All variables are self-reported by the same individuals, so the time lag does not identify a causal effect	Causal	all variables—including the core dependent variable of prosocial behavior—are measured via self-report questionnaires	Moderate	Moderate	Common-method variance, reverse causality, and omitted confounders remain plausible, yet findings are phrased causally

Per-claim assessment

C1. Sample accounting
The sample-accounting chain in the methods cannot be fully reconciled from the reported numbers. From 503 distributed questionnaires, '465 valid questionnaires were collected, yielding an effective response rate of 92.45%'. Because 465/503 = 92.45%, this figure is a validity/retention rate rather than a true response rate. The analytic sample is then stated as 400 valid samples, which leaves the 65-case drop between 465 and 400 unexplained. Additionally, the reported gender split (60.8% female, 39.3% male) sums to 100.1%. With no shared data, these arithmetic inconsistencies leave the exact analytic N and pipeline unverifiable.
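The arithmetic behind this point is small enough to re-derive directly. The sketch below uses only the figures quoted above; the variable names are illustrative, and the two ratios based on N = 400 are computed here for context and are not figures the paper reports.

```python
# Figures quoted in the critique (the only inputs taken from the source).
distributed = 503      # questionnaires distributed
valid_returned = 465   # "465 valid questionnaires were collected"
analytic_n = 400       # "400 valid samples"
pct_female, pct_male = 60.8, 39.3

# The reported 92.45% is reproduced by valid / distributed.
retention_rate = round(100 * valid_returned / distributed, 2)

# Cases that disappear between the valid returns and the analytic sample.
unexplained_drop = valid_returned - analytic_n

# Context only (not reported by the paper): what the analytic sample
# represents relative to each earlier stage.
analytic_vs_distributed = round(100 * analytic_n / distributed, 2)
analytic_vs_valid = round(100 * analytic_n / valid_returned, 2)

# The two gender percentages as printed.
gender_total = round(pct_female + pct_male, 1)

print(f"valid / distributed    = {retention_rate}%")
print(f"unexplained drop       = {unexplained_drop} cases")
print(f"analytic / distributed = {analytic_vs_distributed}%")
print(f"analytic / valid       = {analytic_vs_valid}%")
print(f"gender total           = {gender_total}%")
```

If the 400-case sample is the true analytic N, the rate against distribution would be closer to 79.5% than to 92.45%, which is why the missing exclusion step matters for interpreting the reported rate.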
C2. Measurement
The AI anthropomorphism moderator is operationalized with an item tapping physical/kinematic human-likeness ('I believe AI agents should have smooth and graceful movements'), whereas the theoretical model requires mind/emotion anthropomorphism (attributing an empathic, warm inner life to AI) to plausibly amplify the empathy->warmth pathway. A normative appearance/movement item does not measure perceived mental or emotional human-likeness. The moderation finding therefore may not test the mechanism the theory claims, which weakens construct validity for the central moderated-mediation result.
C3. Overclaim
The abstract frames AI empathy perception as directly enhancing prosocial behavior and anthropomorphism as moderating 'the effect of AI empathy perception on employees' prosocial behavior,' implying a robust moderated direct/total causal effect. But the reported evidence centers on an indirect (mediated-through-warmth) pathway, described as 'a fully standardized indirect effect value of 0.199,' with warmth cast as the carrier; no surviving direct empathy->prosocial effect is reported alongside it. The directional causal language ('can enhance'), applied to a correlational survey of a single set of firms, thus overstates the strength and directness of the causal claim relative to what is shown.
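The distinction between total, direct, and indirect effects that this point turns on can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the data are simulated, the path values 0.5 and 0.4 are arbitrary, and `cov`, `paths`, and `bootstrap_indirect` are hypothetical helpers, not the authors' analysis. The example uses a plain product-of-coefficients decomposition with a percentile bootstrap and only 1,000 resamples to keep it fast.

```python
import random
from statistics import fmean


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def paths(x, m, y):
    """OLS paths for a simple mediation model: returns (a, b, c_prime, c).

    a: M ~ X; b and c_prime: Y ~ X + M; c: Y ~ X (total effect).
    """
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / det
    c_prime = (smm * sxy - sxm * smy) / det
    c = sxy / sxx
    return a, b, c_prime, c


def bootstrap_indirect(x, m, y, n_boot=1000, seed=1):
    """Percentile 95% interval for the indirect effect a*b."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample respondents
        a, b, _, _ = paths([x[i] for i in idx],
                           [m[i] for i in idx],
                           [y[i] for i in idx])
        draws.append(a * b)
    draws.sort()
    return draws[int(0.025 * n_boot)], draws[int(0.975 * n_boot) - 1]


# Synthetic data (assumption): X affects Y only through M.
rng = random.Random(0)
n = 300
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + rng.gauss(0, 1) for mi in m]

a, b, c_prime, c = paths(x, m, y)
indirect = a * b
ci_low, ci_high = bootstrap_indirect(x, m, y)

print(f"total c = {c:.3f}, direct c' = {c_prime:.3f}, indirect a*b = {indirect:.3f}")
print(f"bootstrap 95% interval for a*b: ({ci_low:.3f}, {ci_high:.3f})")
```

In OLS the total effect equals the direct effect plus the indirect effect, so a significant indirect effect can coexist with a direct effect near zero. Reporting the indirect effect without the accompanying direct effect leaves it unclear which of the two the abstract's wording describes.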
C4. Identification
All variables, including the dependent variable, are self-reported by the same individuals across the three waves. The time-lagged design therefore reduces but does not remove common-method variance or endogeneity, and it provides no exogenous variation to identify a causal effect. Reverse causality (prosocial, warmth-oriented employees perceiving more AI empathy) and omitted confounders remain plausible, yet the abstract uses directional causal language ('can enhance'). The authors disclose the self-report limitation but still phrase findings causally.
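Why single-source self-report threatens identification can be shown with a small simulation. This is a sketch under loud assumptions: the two underlying constructs are made independent by construction, each respondent carries a stable response style that leaks into both measures with an arbitrary loading of 0.7, and that style is assumed to persist across survey waves. None of these values come from the paper.

```python
import random
from statistics import fmean


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


rng = random.Random(42)
n = 5000
method_loading = 0.7  # assumed strength of the shared response style

# True scores: independent, so the true relationship is zero.
true_x = [rng.gauss(0, 1) for _ in range(n)]
true_y = [rng.gauss(0, 1) for _ in range(n)]

# One response style per respondent, shared by both self-reports.
style = [rng.gauss(0, 1) for _ in range(n)]

obs_x = [tx + method_loading * s for tx, s in zip(true_x, style)]
obs_y = [ty + method_loading * s for ty, s in zip(true_y, style)]

r_true = pearson(true_x, true_y)
r_observed = pearson(obs_x, obs_y)

# Population value implied by the setup: loading^2 / (1 + loading^2).
r_expected = method_loading ** 2 / (1 + method_loading ** 2)

print(f"correlation of true scores:     {r_true:.3f}")
print(f"correlation of observed scores: {r_observed:.3f}")
print(f"implied by shared method alone: {r_expected:.3f}")
```

Under these assumptions the observed measures correlate at roughly 0.33 even though the constructs are unrelated. Separating the waves in time does not remove this component when the response style is a stable trait of the respondent.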

Scorecard

AI/AGI contribution: 3.0 / 5
Evidentiary support: 3.0 / 5
Methodological risk: 3.0 / 5
Overclaiming: 3.0 / 5
Reproducibility / auditability: 3.0 / 5
Societal-impact relevance: 4.0 / 5

Sub-scores are 0–5 editorial judgements on fixed scales (higher is better, except methodological risk and overclaiming where higher is worse). They are contestable and open to a severity challenge from authors.

What the paper does

This Frontiers in Psychology (2025) study surveyed employees at eight highly digitalized Chinese firms across three monthly waves (Feb/Mar/Apr 2024) and reports that perceiving AI as empathic raises employees' prosocial behavior, that warmth mediates this effect, and that AI anthropomorphism strengthens it. Constructs are measured with adapted self-report scales (alphas 0.81-0.86) and tested with CFA, hierarchical regression, 5,000-iteration bootstrapping, and the Johnson-Neyman method. The headline conclusions are plausible and the analytic machinery is conventional and competently applied.

What the paper does well

The paper is methodologically more careful than many cross-sectional survey studies in this literature. It uses a genuine three-wave, one-month-lagged design that temporally separates predictor, mediator, and outcome, which mitigates (even if it cannot eliminate) common-method bias. Measurement rests on established adapted scales with acceptable reliabilities (Cronbach's alphas 0.81-0.86), and the inferential approach is appropriate: confirmatory factor analysis, hierarchical regression, 5,000-iteration bootstrap confidence intervals for indirect and conditional indirect effects (e.g., total effect CI (0.160, 0.325) and indirect CI (0.069, 0.342), both excluding zero), and a Johnson-Neyman probe. The conditional indirect effect at high anthropomorphism (β = 0.210, 95% CI (0.092, 0.337)) excludes zero, so the moderated-mediation claim is statistically coherent within the fitted model. The authors also pre-empt several criticisms by candidly disclosing four limitations: self-report dependence, the lack of objective warmth indicators, China-only generalizability, and compromised ecological validity. Some flagged issues — the 65-case drop and the 100.1% gender sum — are plausibly typographical rather than analysis-breaking.
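Whether the 100.1% gender sum could be a rounding artifact can be checked with a granularity-style consistency test. The sketch below asks which whole-number counts reproduce each reported percentage at a given N. It assumes only two gender categories, percentages rounded to one decimal with round-half-up, and a single shared denominator. `rounded_pct` and `consistent_counts` are hypothetical helper names, and the result shows what is arithmetically possible, not what the authors actually did.

```python
from decimal import Decimal, ROUND_HALF_UP


def rounded_pct(k: int, n: int) -> Decimal:
    """Percentage k/n rounded half-up to one decimal place."""
    exact = Decimal(100 * k) / Decimal(n)
    return exact.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def consistent_counts(reported: str, n: int) -> list[int]:
    """All integer counts k whose rounded percentage equals `reported`."""
    target = Decimal(reported)
    return [k for k in range(n + 1) if rounded_pct(k, n) == target]


for n in (400, 465):
    female = consistent_counts("60.8", n)
    male = consistent_counts("39.3", n)
    print(f"N = {n}: female counts {female}, male counts {male}")

# At N = 400, 243 women (60.75%) and 157 men (39.25%) both round up,
# giving 60.8% + 39.3% = 100.1% from counts that sum to exactly 400.
# At N = 465, no whole-number count reproduces either percentage.
```

Under these assumptions the printed percentages are consistent with an analytic N of 400 and not with 465, which supports reading the 100.1% total as double rounding rather than a data error. The unexplained 65-case drop is not resolved by this check.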

Strongest critique

The study's contribution is real, but its framing outruns its design and reporting on three groundable fronts. First, measurement: the anthropomorphism moderator, which is theoretically supposed to amplify the perception of an empathic, warm AI mind, is measured with a physical/movement item ('I believe AI agents should have smooth and graceful movements'). Appearance/kinematic human-likeness is not mind/emotion human-likeness, so the headline moderated-mediation result may not test the mechanism the authors theorize. Second, the data pipeline cannot be reconciled from the numbers given. The paper reports 503 distributed questionnaires and '465 valid questionnaires... effective response rate of 92.45%' (which is 465/503, a retention/validity rate rather than a true response rate), then a 400-case analytic sample with the 65-case gap unexplained, and a gender split summing to 100.1%. With no shared data, the exact N is unverifiable. Third, the abstract presents AI empathy perception as directly enhancing prosocial behavior and anthropomorphism as moderating 'the effect of AI empathy perception on employees' prosocial behavior,' but the reported evidence is an indirect, warmth-mediated pathway ('a fully standardized indirect effect value of 0.199') in an all-self-report correlational design. The directional causal language ('can enhance') is therefore stronger than the design and the reported direct evidence support.

Conclusion

A competently executed but somewhat oversold study. The three-wave matched design, acceptable reliabilities, and bootstrap-based mediation and moderated-mediation tests make the within-model statistical story internally coherent, and the authors disclose real limitations honestly. The retained concerns, restricted to spans actually present in the source, are genuine but moderate. First, the moderator is measured as physical/movement anthropomorphism rather than the mind/emotion anthropomorphism the theory needs; this is the strongest, fully grounded flaw. Second, the sample accounting (465 valid -> 400, '92.45%' as a mislabeled retention rate, 100.1% gender sum) is internally inconsistent and unverifiable without shared data. Third, the abstract's directional causal framing ('can enhance', anthropomorphism moderating the effect on prosocial behavior) runs ahead of an indirect-effect-centered, all-self-report correlational design. The contribution is worth taking seriously as correlational evidence within Chinese digitalized firms, but the causal and moderation claims should be softened to the mediated pathway actually evidenced, the anthropomorphism construct re-specified or re-justified, and the sample numbers reconciled. Overall severity: moderate.

Reply from the authors

Following the practice of Nature Matters Arising, Science Technical Comments and PNAS Letters, this Comment is published as one half of a Comment + Reply pair: the authors of the original article are invited to respond, and any reply is published here verbatim alongside the Comment as part of the record.

Reply: not yet invited. No reply has been received for publication.

The authors have a right of reply and no veto. A reply may request a factual correction, a methodological rebuttal, a clarification, a data/code update, or a severity challenge, and is published unedited. See the right-of-reply policy.

Automated re-evaluation after reply: Authors may reply at any time; this critique addresses claims, methods and inference only, never the authors.

References

Every external source this Comment cites, each with a verified link. 0 fabricated.

Source-grounding attestation

✓ attested in-app · grounding: spans in app

✓ Verbatim source spans present in the critique: 4/4 provenance spans re-derived in the critique prose
✓ Passes the publication validator: no errors
✓ Zero fabricated citations: 0 fabricated
✓ Severity within the access-basis cap: severity "moderate" ≤ cap "high" for open_access

Every verbatim span the critique relies on is re-derived in the prose in-app; span-in-source is re-verifiable offline (the abstract is re-fetched, not stored, per the no-reproduce policy).

Re-verify span-in-source offline: python3 scripts/verify-queue-critiques.py

Independent faithfulness review

A refute-by-default adversarial panel (two independent reviewers — an overreach lens and a mischaracterization lens — that fetched the real source) tried to prove this critique misread the paper. This is an AI adversarial review recorded with its reasoning, not a deterministic check.

✓ Faithful · 0/2 reviewers sustained a concern · source retrieved

The STRICT-NEUTRAL review holds against the text. All four retained flaws are grounded in verbatim spans present in the provided sections: (1) Sample accounting — "465 valid questionnaires were collected, yielding an effective response rate of 92.45%" is verbatim; 465/503 = 0.92445 confirmed, so 92.45% is a validity/retention rate mislabeled as a response rate, the analytic N of "400 valid samples" leaves 65 cases unexplained, and the gender split 60.8% + 39.3% = 100.1% is confirmed. (2) Measurement — the anthropomorphism moderator's example item "I believe AI agents should have smooth and graceful movements" (Bartneck et al. 2009) taps physical/kinematic human-likeness, not the mind/emotion anthropomorphism the empathy->warmth mechanism requires; this is the strongest, fully-grounded flaw (support: strong). (3) Overclaim — the abstract's "AI anthropomorphism moderates the effect of AI empathy perception on employees' prosocial behavior through warmth" plus "can enhance" framing is directional/causal in a single-source correlational design; the review correctly rates this at weak sup

Version & correction history

Version	Date	Change
v1.0	2026-06-28	Initial publication (scale-the-full-text-cohort batch).

No silent substantive corrections — every change is versioned and visible.

How to cite this Comment

Critical AI. Comment on “Effect of AI empathy perception on employees' prosocial behavior: mediating role of warmth and moderating role of AI anthropomorphism” (Jing Xue et al., Frontiers in Psychology, 2025). Critical AI; 2026. https://policywindow.org/critique/c/ai-empathy-prosocial-behavior-employees

A registered DOI will replace the URL once minted; until then the canonical URL is the persistent identifier. Highwire/Dublin-Core citation tags and a schema.org Review record are embedded in this page for Google Scholar and reference managers.

Verify this Comment. Its checkable facts (target DOI, access-basis severity cap, zero fabricated citations) are served — as the app’s self-report — at /critique/api/critiques/ai-empathy-prosocial-behavior-employees/verify; to confirm them independently of this site, re-derive the same checks (and resolve the target DOI) with npx tsx scripts/verify-critical-ai.ts --critique ai-empathy-prosocial-behavior-employees --live.

Content fingerprint efbb3e2b857a255f (v1.0) — this Comment’s substantive content is content-addressed; a silent post-publication edit would change it.