Study Finds AI Warning Labels Can Make Fake Health Posts Seem More Credible
In A Nutshell
- A new study found that AI warning labels made false health posts seem more credible to readers, while making accurate science posts seem less trustworthy.
- Researchers call this a “truth-falsity crossover effect,” a pattern that contradicted their own hypotheses.
- People who already distrusted AI were less swayed by the effect, but not immune to it.
- Scientists suggest pairing AI labels with a second verification cue may help correct the problem, though that idea remains untested.
AI labels on social media posts are meant to encourage readers to scrutinize content more carefully. New research suggests they may actually be making health misinformation seem more credible, while making accurate science posts seem less so.
That’s the core finding of a new study published in the Journal of Science Communication, and it should give regulators, platforms, and public health communicators serious pause. Several governments and platforms have introduced AI disclosure rules in recent years. China mandates labeling for AI-generated content, and the EU’s AI Act includes disclosure requirements. Meta has introduced AI labeling on Facebook and Instagram. But what if those labels are quietly working against their own purpose?
Researchers at the University of Chinese Academy of Social Sciences recruited 433 participants to evaluate science-related social media posts, some accurate and some deliberately false, covering food safety and disease prevention. Some posts carried a bold red warning: “Attention: The content was detected as being generated by AI.” Others did not. Participants rated how credible each post seemed to them. The result ran counter to the researchers’ own expectations going in.
How AI Disclosure Labels Can Make Misinformation More Believable
When a false health post carried the AI label, participants rated it as more credible than the same false post without it. Accurate posts showed the reverse: the label made them seem less trustworthy. Researchers described this as a “truth-falsity crossover effect.”
To understand why, consider how most people actually read social media. Hardly anyone fact-checks claims as they scroll. Instead, readers rely on quick mental shortcuts to decide what feels believable. An AI label appears to function as one of those shortcuts, but not in the way anyone intended.
The authors suggest several possible explanations. One is that the label may tap into a widespread perception that AI is objective, unbiased, and data-driven. Misinformation, typically written to sound authoritative and factual, could benefit from that association. Accurate science posts, which tend to involve qualified claims and layered reasoning, may get hurt by it. A label saying “a machine made this” might inadvertently signal cold, mechanical precision, even when the content is flat-out wrong.
Another possible factor, also proposed by the researchers, involves how people commonly perceive AI: as highly competent but lacking in warmth. For a post that explains and contextualizes, that cold-machine association may undercut credibility. For a post that simply asserts a false fact with confidence, perceived competence could be all the boost it needs. These are interpretive frameworks the authors put forward to explain their findings, not directly measured mechanisms, but they draw on established research in psychology and communication.

Who Is Most Affected by AI Warning Labels?
Not everyone responded the same way. Among readers with stronger negative attitudes toward AI, the credibility boost the label gave to misinformation was weaker in some topic areas, though not eliminated. Those same skeptical readers did penalize accurately labeled posts more heavily. The moderation was real but uneven and topic-dependent, not a reliable corrective.
Reader involvement, meaning how much someone personally cared about the topic, showed limited and inconsistent influence on the label’s effect. The intuition that deeply invested readers would catch false claims more easily does not hold up consistently here.
How the Study Was Conducted
Researchers pulled source material from China’s official Science Rumour Debunking Platform, a government-backed database of expert-reviewed health misinformation. GPT-4 then rewrote selected articles into social media-style posts for China’s Sina Weibo platform, producing accurate and deliberately misleading versions on topics like pesticide residue in produce and eye disease prevention. Eight posts made the final cut: four accurate, four false.
Between March and May 2024, 433 participants rated each post on a reliability scale immediately after reading it. The sample skewed female and college-educated, broadly matching Weibo’s general user base. Age, gender, and education level had no significant effect on how strongly any given participant was swayed by the label.
One important caveat for American readers: some research suggests Chinese users tend to hold more favorable attitudes toward AI than people in many Western countries, shaped in part by a national culture that frames AI as a symbol of progress. That context may have influenced the strength of the effects seen here. Whether the same crossover pattern would appear in the U.S. or Europe remains an open question.
Rethinking How AI Disclosure Labels Should Work
For health communication, this matters because health posts are not simply claims. They are explanations. Readers must trust not just the facts but the reasoning behind them. An AI label may short-circuit that trust in accurate content while lending false authority to misinformation written to sound like settled fact.
The researchers proposed one potential fix worth testing: pairing the AI disclosure with a second label, something along the lines of a caution that the content has not been independently verified. Rather than letting the AI label carry all the interpretive weight, a dual cue might prompt more careful reading across the board. It’s an untested idea, but a logical direction given what the study found.
These findings come from a controlled experiment rather than a live social media environment, so the real-world magnitude of the effect is still unknown. But if they hold in naturalistic settings, the warning label that platforms and governments have bet on as a misinformation fix could carry unintended consequences worth taking seriously.
Paper Notes
Limitations
This study focused exclusively on text-based posts and only two health topics within a Chinese social media context, which limits how broadly the findings apply. The experimental setup removed social cues like likes, reposts, and follower counts to control for outside variables, but that also made the experience less like real-world browsing. The number of test posts was small, and prior knowledge was measured with a single survey question rather than a fuller assessment.
Funding and Disclosures
No external funding was received for this work. GPT-4 was used to generate the experimental materials, which the authors disclosed in their acknowledgments. No conflicts of interest were reported.
Publication Details
Authors: Teng Lin (Ph.D. candidate, School of Journalism and Communication, University of Chinese Academy of Social Sciences, Beijing, China) and Yiqing Zhang (Master’s student, School of Journalism and Communication, University of Chinese Academy of Social Sciences, Beijing, China). | Journal: Journal of Science Communication, Vol. 25, Issue 01 (2026), Article A09. | Paper Title: “Visible sources and invisible risks: exploring the impact of AI disclosure on perceived credibility of AI-generated content.” | DOI: https://doi.org/10.22323/358020260107085703 | Published: March 9, 2026. Received October 6, 2025; Accepted January 7, 2026. Published by SISSA Medialab under a Creative Commons Attribution 4.0 license.