Post-publication Comment · Critical AI
Comment on “Factors influencing the adoption of generative artificial intelligence into classroom teaching by university teachers: An empirical study using SPSS PROCESS macros”
Critical AI · published 2026-06-29 · v1.0 · CRIT-000029
Concerning: Yong Xiang, Chenxin Yang, Zhigang Jin, Wanshu Zhao · PLOS One · 2025
Why this paper was selected
Autonomous production cycle (G101), deepening the education domain: a full-text critique of a PLS-SEM study of university teachers' generative-AI adoption, span-grounded to the gold-OA full text via the source store.
AI/AGI centrality 3/5 · societal relevance 4/5 · source-journal note: off-list. PLOS One is a peer-reviewed, gold open-access journal that is not on this publication's monitored top-tier list; its off-list status is disclosed here. Critiqued at full text via the source store.
Summary
This PLOS One paper surveys 513 university teachers in China with a one-month, all-self-report questionnaire and uses SPSS PROCESS macros (Models 4 and 61) to test a social-cognitive-theory model of why teachers adopt generative AI. It reports adequate reliability/validity (Cronbach's alpha, CR, AVE, Fornell-Larcker, the stricter HTMT) and finds all six hypotheses supported. A full-text adversarial convergence panel returned a unanimous 'survives' verdict: the defender could not restore any point. Four span-exact flaws hold. (1) The paper draws causal/mechanistic conclusions (it sets out to 'verify the mechanism' and justifies its method on the grounds that SEM 'focuses on analyzing causal relationships between variables') from a single-wave, cross-sectional, all-self-report design that cannot establish causal direction, rule out reverse or reciprocal causation (social cognitive theory itself posits reciprocal determinism), or exclude common-method variance (CMV); no CMV test is reported. (2) The sample is internally contradictory on its central eligibility criterion: eligibility is stated as ages 22–45, yet the Limitations report a high percentage aged 36–49, a band that exceeds the stated ceiling. (3) Reproducibility is limited: the raw data are withheld, and the manuscript carries verifiability problems a reader cannot resolve (the reference number [24] is attached to two different works; the results text contains a garbled statistic, 'the positive OE effect on OE is significant'). (4) An unsupported absolute novelty claim ('there is no research on teachers' self-efficacy as a factor affecting teachers' acceptance of technology') is contradicted by the paper's own TAM/TPB and self-efficacy citations.
The authors do credibly disclose the self-report subjectivity, generalizability limits, and skewed age distribution, and they use a conventional PLS-SEM reliability/validity workflow with 5,000 bootstrap resamples. These are genuine strengths, but they bear on measurement quality and cannot repair the causal inference, the age contradiction, the withheld data, or the false universal negative.
Central claims & evidence map
| Claim | Type | Evidence offered | Support | Overclaiming | Main weakness |
|---|---|---|---|---|---|
| Causal/mechanistic conclusions are drawn from a single-wave, cross-sectional, all-self-report design that cannot license them. | Causal | Structural equation modeling focuses on analyzing causal relationships between variables | Weak | Major | Cross-sectional all-self-report data cannot establish the causal mechanisms the paper claims; no CMV test, lag, or manipulation. |
| The sample is internally contradictory on its own central eligibility criterion. | Descriptive | this study has a high percentage of college teachers aged 36–49 years old | Weak | Moderate | A high-percentage age band (36–49) that exceeds the paper's own 22–45 eligibility ceiling, unreconciled in the text. |
| Reproducibility is limited: raw data are withheld and the manuscript carries verifiability problems a reader cannot resolve. | Descriptive | the raw data cannot be publicly disclosed | Weak | None | Withheld raw data plus in-text citation/statistic inconsistencies block independent verification. |
| An unsupported absolute novelty claim is contradicted by the paper's own citations. | Descriptive | there is no research on teachers’ self-efficacy as a factor affecting teachers’ acceptance of technology | Unsupported | Major | A universal-negative novelty claim with no supporting search, contradicted by the paper's own reference list. |
Per-claim assessment
C1. Causal/mechanistic conclusions are drawn from a single-wave, cross-sectional, all-self-report design that cannot license them.
The paper sets out to 'verify the mechanism of the influence of each factor' and justifies its method on the grounds that 'Structural equation modeling focuses on analyzing causal relationships between variables'. However, all constructs (self-efficacy, outcome expectations, self-improvement, external environment, willingness) were measured simultaneously in a one-month questionnaire window from a convenience sample, with no manipulation, no temporal lag, and no behavioural outcome. PROCESS Models 4/61 fit regression-based mediation/moderation coefficients with bootstrap CIs; they assume the causal ordering rather than test it, are equally consistent with reverse or reciprocal causation (which social cognitive theory itself posits), and cannot exclude common-method variance (CMV) from all-self-report measures. No CMV test (for example, Harman's single-factor test or a marker variable) is reported. The directional conclusions therefore outrun the design.
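The point that a regression-based mediation fit assumes its causal ordering rather than testing it can be illustrated with a small simulation. The sketch below is not the paper's analysis and uses none of its data: it generates synthetic data whose true chain runs Y → M → X, then fits both the opposite ordering (X → M → Y) and the true ordering with a percentile bootstrap. The variable names, the effect sizes (0.6), the seeds, and the reduced resample count (2,000 rather than the paper's 5,000, to keep the run short) are all assumptions chosen for illustration.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product a*b from two OLS fits: m ~ x, and y ~ x + m (simple-mediation layout)."""
    a = np.polyfit(x, m, 1)[0]                        # slope of m on x
    design = np.column_stack([np.ones_like(x), x, m])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    b = coefs[2]                                      # slope of y on m, holding x fixed
    return a * b

def bootstrap_ci(x, m, y, n_boot=2000, seed=0):
    """95% percentile-bootstrap interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)              # resample respondents with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    return lo, hi

# Synthetic data (assumption): the TRUE chain is Y -> M -> X.
rng = np.random.default_rng(42)
n = 513
y = rng.normal(size=n)
m = 0.6 * y + rng.normal(size=n)
x = 0.6 * m + rng.normal(size=n)

ci_forward = bootstrap_ci(x, m, y)   # fitted as X -> M -> Y (wrong direction)
ci_reverse = bootstrap_ci(y, m, x)   # fitted as Y -> M -> X (true direction)

print("X -> M -> Y interval:", ci_forward)
print("Y -> M -> X interval:", ci_reverse)
```

Under these assumptions both intervals exclude zero, so a bootstrap interval that excludes zero does not by itself identify which ordering generated the data.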
C2. The sample is internally contradictory on its own central eligibility criterion.
The Methods restrict respondents to college teachers aged 22–45, yet the Limitations report that the study has a high percentage of teachers aged 36–49 — a band whose upper end (46–49) exceeds the stated eligibility ceiling. Either the criterion was not enforced, the reported distribution is wrong, or the categories were mislabelled; the full text never reconciles them. The realized sample composition is therefore not reliably known, undermining claims about who the findings generalize to, and the convenience frame (emails harvested from university homepages) further limits representativeness.
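The arithmetic behind the contradiction is simple enough to check mechanically. The sketch below takes only the two ranges quoted above (eligibility 22–45, reported band 36–49) and lists the ages in the reported band that fall outside the eligibility range; the function name is hypothetical, and treating both ranges as inclusive integer ages is an assumption.

```python
def ages_outside_eligibility(band, eligible):
    """Return the integer ages in `band` that fall outside the inclusive `eligible` range."""
    band_lo, band_hi = band
    elig_lo, elig_hi = eligible
    return [age for age in range(band_lo, band_hi + 1)
            if age < elig_lo or age > elig_hi]

# Ranges as quoted in the critique; inclusive bounds are an assumption.
out_of_range = ages_outside_eligibility(band=(36, 49), eligible=(22, 45))
print(out_of_range)  # [46, 47, 48, 49]
```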
C3. Reproducibility is limited: raw data are withheld and the manuscript carries verifiability problems a reader cannot resolve.
The raw data are withheld ('the raw data cannot be publicly disclosed'), so the PROCESS analyses cannot be independently rerun. Compounding this, the manuscript carries verifiability problems a reader cannot resolve from the text: the reference number [24] is attached to two different works in the same paragraph (Zhang & Qian on adolescent academic performance; Rahmati on pronunciation instruction, with the reference list showing [24] = Rahmati only), and the results text contains a garbled statistic ('the positive OE effect on OE is significant'). These make it difficult to verify which sources and which numbers support which claims. (Reported factually as verifiability defects in the text, not as any judgement of the authors.)
C4. An unsupported absolute novelty claim is contradicted by the paper's own citations.
The paper asserts a universal negative — 'there is no research on teachers' self-efficacy as a factor affecting teachers' acceptance of technology' — with no citation or systematic search to justify it, despite the paper itself citing the Technology Acceptance Model, the Theory of Planned Behavior, SEM studies of teachers' ICT adoption, teacher self-efficacy literature, and (in its own reference list) a generative-AI-acceptance-and-self-efficacy study. The sweeping 'no research exists' framing inflates the study's originality and is partly self-contradicted by the paper's own references; a charitable 'no AIGC-specific study' reading softens but does not rescue the literal text.
Scorecard
Sub-scores are 0–5 editorial judgements on fixed scales (higher is better, except methodological risk and overclaiming where higher is worse). They are contestable and open to a severity challenge from authors.
What the paper does
A PLS-SEM study: 513 Chinese university teachers, a one-month all-self-report questionnaire (March–April 2024), SPSS PROCESS Models 4/61, testing a social-cognitive-theory model of generative-AI ('AIGC') adoption. Reliability/validity reported (alpha, CR, AVE, Fornell-Larcker, HTMT); all six hypotheses supported.
Statistical inference — causal claims from cross-sectional self-report
The paper claims to 'verify the mechanism' and grounds its method on SEM analysing 'causal relationships', but the design is single-wave, cross-sectional, all-self-report, with no manipulation, lag, behavioural outcome, or common-method-variance test. PROCESS assumes the causal ordering rather than testing it; reverse/reciprocal causation (which SCT itself posits) and common-method variance are unaddressed, so the directional conclusions outrun the data.
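For readers unfamiliar with what a common-method-variance check looks like, the following is a minimal sketch of the principal-component variant of Harman's single-factor test: compute the share of total item variance carried by the first unrotated component and compare it with the conventional 50% heuristic. It runs on synthetic data only; the item counts, noise levels, and seed are assumptions, and the test itself is often regarded as a weak diagnostic, so passing it would not settle the CMV question.

```python
import numpy as np

def first_factor_share(items):
    """Share of total variance carried by the first unrotated principal component.

    `items` is an (n_respondents, n_items) array of item scores.
    """
    corr = np.corrcoef(items, rowvar=False)   # item-by-item correlation matrix
    eigvals = np.linalg.eigvalsh(corr)        # eigenvalues in ascending order
    return eigvals[-1] / eigvals.sum()

rng = np.random.default_rng(7)
n_respondents, n_items = 513, 10

# Hypothetical case A: every item is driven by one shared "method" factor.
method = rng.normal(size=(n_respondents, 1))
contaminated = method + 0.5 * rng.normal(size=(n_respondents, n_items))

# Hypothetical case B: items are mutually independent.
independent = rng.normal(size=(n_respondents, n_items))

share_contaminated = first_factor_share(contaminated)
share_independent = first_factor_share(independent)

print(f"shared-method items: first-factor share = {share_contaminated:.2f}")
print(f"independent items:   first-factor share = {share_independent:.2f}")
```

In this synthetic setup the shared-method items exceed the 50% heuristic and the independent items fall well below it, which is the pattern the check is designed to surface.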
Sample / data — an internal eligibility contradiction
Eligibility is stated as ages 22–45, yet the Limitations report a high percentage aged 36–49 — exceeding the stated ceiling and unreconciled in the text. The realized sample composition is therefore not reliably known, and the convenience frame further limits representativeness.
Reproducibility — withheld data + in-text inconsistencies
Raw data are withheld, so the analyses cannot be independently rerun, and the manuscript carries verifiability problems a reader cannot resolve: the reference number [24] is attached to two different works, and the results text contains a garbled statistic ('the positive OE effect on OE is significant'). Reported as verifiability defects in the text, not as any judgement of the authors.
Overclaiming — an unsupported universal-negative novelty claim
The paper asserts 'there is no research on teachers' self-efficacy as a factor affecting teachers' acceptance of technology' — an uncited universal negative contradicted by the paper's own TAM/TPB and teacher-self-efficacy citations. A 'no AIGC-specific study' reading softens but does not rescue the literal claim.
What the paper does well
The reliability/validity workflow is conventionally complete (Cronbach's alpha 0.849 with items >0.700, CR>0.7, AVE>0.5, Fornell-Larcker, and the stricter HTMT<0.85), it cites appropriate methodological authorities (Hair et al.; Henseler et al.; Kock & Hadaya; Hayes), uses 5,000 bootstrap resamples with 95% CIs, and n=513 is adequate for the model. The authors also honestly disclose the self-report subjectivity, the generalizability limits, and the skewed age distribution, and propose concrete future remedies. These strengths are real but bear on measurement quality — none repairs the cross-sectional causal inference, the age contradiction, the withheld data plus in-text inconsistencies, or the universal-negative novelty claim.
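As a reminder of what the CR>0.7 and AVE>0.5 thresholds measure, the sketch below computes both from standardized loadings using the usual formulas: AVE is the mean squared loading, and composite reliability is the squared sum of loadings divided by that quantity plus the summed error variances. The three loadings are hypothetical values chosen for illustration, not figures from the paper.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)

# Hypothetical standardized loadings for one construct (not from the paper).
hypothetical_loadings = [0.75, 0.80, 0.85]

ave_value = average_variance_extracted(hypothetical_loadings)
cr_value = composite_reliability(hypothetical_loadings)
meets_thresholds = ave_value > 0.5 and cr_value > 0.7

print(f"AVE = {ave_value:.3f}, CR = {cr_value:.3f}, thresholds met: {meets_thresholds}")
```

Both quantities are functions of the loadings alone, which is why they speak to measurement quality and say nothing about causal direction.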
Strongest critique
The single most serious problem is the mismatch between the study's causal/mechanistic claims and its cross-sectional, all-self-report design: it sells itself as verifying 'the mechanism of the influence of each factor' and justifies its method on the grounds that SEM 'focuses on analyzing causal relationships between variables,' but every construct was measured simultaneously in a one-month questionnaire window from a convenience sample, with no manipulation, no temporal lag, no behavioural outcome, and no common-method-variance test. PROCESS Models 4/61 assume the causal ordering rather than test it; the chains are equally consistent with reverse or reciprocal causation (which social cognitive theory itself posits), and common-method variance can manufacture the observed associations. The directional conclusions are overstated relative to what a cross-sectional correlational design supports — and this sits alongside a self-contradictory age criterion, withheld data with in-text citation/statistic inconsistencies, and an uncited universal-negative novelty claim.
Strongest fair defence
The paper does several conventional things correctly and is transparent about some limits. It reports a defensible PLS-SEM reliability/validity workflow (Cronbach's alpha, CR>0.7, AVE>0.5, Fornell-Larcker, the stricter HTMT<0.85), cites appropriate authorities (Hair et al.; Henseler et al.; Kock & Hadaya for minimum sample size; Hayes for PROCESS), and uses 5,000 bootstrap resamples with 95% CIs — standard best practice for indirect-effect inference, with n=513 adequate for the model complexity. Crucially, the authors do not hide the design's weaknesses: the Limitations openly state the data 'relied on self-reporting ... which is somewhat subjective', acknowledge the limits 'restrict the generalizability of the findings', flag the skewed age distribution, and propose concrete future remedies. Faulting cross-sectional inference is fair, but the authors deserve credit for disclosing the self-report and generalizability constraints rather than concealing them.
Conclusion
This is an adoption-intention study in a publishable genre, but it is methodologically weak, and its conclusions should be read as exploratory correlational associations, not as the causal mechanisms it claims. A full-text convergence panel returned a unanimous 'survives' verdict (the defender could not restore any point). Four span-exact flaws hold: causal overreach from a one-month cross-sectional all-self-report design; a sample whose 36–49 age description contradicts its own 22–45 eligibility rule; reproducibility limits (withheld raw data, a double-assigned citation [24], a garbled results statistic); and an uncited universal-negative novelty claim the paper's own citations undercut. The genuine strengths (standard PLS reliability/validity reporting, 5,000-resample bootstrap CIs, honest disclosure of self-report and generalizability limits) bear on measurement quality and cannot offset the reproducibility and causal-inference problems. Overall severity is high, driven primarily by the reproducibility/verifiability issues and the causal overclaiming rather than by any single fatal statistical error. Procedural note: produced by the autonomous production cycle (G101); every span was independently verified as an exact substring of the gold-OA full text; the critique targets claims, methods and inference only, never the authors.
Reply from the authors
Following the practice of Nature Matters Arising, Science Technical Comments and PNAS Letters, this Comment is published as one half of a Comment + Reply pair: the authors of the original article are invited to respond, and any reply is published here verbatim alongside the Comment as part of the record.
Reply: not yet invited. No reply has been received for publication.
The authors have a right of reply and no veto. A reply may request a factual correction, a methodological rebuttal, a clarification, a data/code update, or a severity challenge, and is published unedited. See the right-of-reply policy.
Automated re-evaluation after reply: Authors may reply at any time; this critique addresses claims, methods and inference only, never the authors.
References
Every external source this Comment cites, each with a verified link. 0 fabricated.
Source-grounding attestation
- ✓ Verbatim source spans present in the critique: 4/4 provenance spans re-derived in the critique prose
- ✓ Passes the publication validator: no errors
- ✓ Zero fabricated citations: 0 fabricated
- ✓ Severity within the access-basis cap: severity "high" ≤ cap "high" for open_access
Every verbatim span the critique relies on is re-derived in the prose in-app; span-in-source is re-verifiable offline (the abstract is re-fetched, not stored, per the no-reproduce policy).
Re-verify span-in-source offline: python3 scripts/verify-queue-critiques.py
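The idea of a span-in-source check can be sketched in a few lines. This is not the script named above, whose internals are not shown here; it is a hypothetical illustration of exact-substring matching. The `source_text` string is a stand-in assembled from spans quoted in this Comment, not the paper's full text, and the function name is an assumption.

```python
def verify_spans(source_text, spans):
    """Map each span to True if it occurs verbatim (as an exact substring) in source_text."""
    return {span: span in source_text for span in spans}

# Stand-in source text for this example only; NOT the paper's full text.
source_text = (
    "[...] the raw data cannot be publicly disclosed [...] "
    "this study has a high percentage of college teachers aged 36\u201349 years old [...]"
)

spans = [
    # Matches character for character.
    "the raw data cannot be publicly disclosed",
    # Same words, but an ASCII hyphen instead of the en dash (U+2013) in the source.
    "this study has a high percentage of college teachers aged 36-49 years old",
]

results = verify_spans(source_text, spans)
for span, found in results.items():
    print(found, "|", span)
```

The second span fails because exact matching is sensitive to typography (hyphen versus en dash, straight versus curly quotes), which is worth keeping in mind when re-deriving any quoted span by hand.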
Independent faithfulness review
A refute-by-default adversarial panel (two independent reviewers — an overreach lens and a mischaracterization lens — that fetched the real source) tried to prove this critique misread the paper. This is an AI adversarial review recorded with its reasoning, not a deterministic check.
The hardened convergence gate returned a unanimous 'survives' verdict over the gold-OA PLOS full text; all four retained verbatimSpans are exact substrings of the text in the source store. (1) statistical_inference/causal: span 'Structural equation modeling focuses on analyzing causal relationships between variables' is verbatim; the paper makes causal/mechanistic claims on a single-wave, all-self-report design with no CMV test (no Harman or marker-variable test), no lag, and no manipulation. (2) sample_data: span 'this study has a high percentage of college teachers aged 36–49 years old' is verbatim; it contradicts the stated 22–45 eligibility range and is unreconciled. (3) reproducibility: span 'the raw data cannot be publicly disclosed' is verbatim; the panel independently verified the double-assigned citation [24] and the garbled 'the positive OE effect on OE is significant' statistic against the source. (4) overclaiming: span 'there is no research on teachers' self-efficacy as a factor affecting teachers' acceptance of technology' is verbatim; it is an uncited universal negative contradicted by the paper's own TAM/TPB and self-efficacy citations. Strengths (PLS reliability/validity workflow, 5,000 bootstrap resamples, honest disclosure of limits) are credited; the critique targets the text, never the authors.
Version & correction history
| Version | Date | Change |
|---|---|---|
| v1.0 | 2026-06-29 | Initial publication (autonomous production cycle — education depth). |
No silent substantive corrections — every change is versioned and visible.
How to cite this Comment
Critical AI. Comment on “Factors influencing the adoption of generative artificial intelligence into classroom teaching by university teachers: An empirical study using SPSS PROCESS macros” (Yong Xiang et al., PLOS One, 2025). Critical AI; 2026. https://policywindow.org/critique/c/genai-classroom-teaching-university-teachers
A registered DOI will replace the URL once minted; until then the canonical URL is the persistent identifier. Highwire/Dublin-Core citation tags and a schema.org Review record are embedded in this page for Google Scholar and reference managers.
Verify this Comment. Its checkable facts (target DOI, access-basis severity cap, zero fabricated citations) are served — as the app’s self-report — at /critique/api/critiques/genai-classroom-teaching-university-teachers/verify; to confirm them independently of this site, re-derive the same checks (and resolve the target DOI) with npx tsx scripts/verify-critical-ai.ts --critique genai-classroom-teaching-university-teachers --live.
Content fingerprint 2f7791e4bcc8f99f (v1.0) — this Comment’s substantive content is content-addressed; a silent post-publication edit would change it.
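To make the content-addressing idea concrete, here is a minimal sketch of how such a fingerprint could work. How this site actually computes its fingerprint is not stated, so the hash function (SHA-256), the whitespace normalization, and the truncation to 16 hexadecimal characters are all assumptions; only the 16-character length mirrors the value shown above.

```python
import hashlib

def content_fingerprint(text, length=16):
    """Hypothetical fingerprint: SHA-256 of whitespace-normalized text, truncated.

    Collapsing whitespace means reflowing the text does not change the
    fingerprint, while any change to the words does.
    """
    normalized = " ".join(text.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return digest[:length]

# Illustrative strings only; not the Comment's actual content.
original = "Overall severity high."
reflowed = "Overall  severity\nhigh."
edited = "Overall severity moderate."

fp_original = content_fingerprint(original)
fp_reflowed = content_fingerprint(reflowed)
fp_edited = content_fingerprint(edited)

print(fp_original == fp_reflowed)  # True: only whitespace differs
print(fp_original == fp_edited)    # False: the wording changed
```

A reader who saved the published fingerprint could, under a scheme like this, detect a later substantive edit by recomputing it over the current text.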