AI Image Labels Work, Until They Don't: Study Finds Dangerous Loophole


AI ‘Fake’ Labels Help With Some Misinformation, But May Leave People Vulnerable to Photo-Based Lies

In A Nutshell

A study of 1,354 people found that AI image labels reduced belief in fake stories paired with AI-generated photos.
But the labels backfired: people became more likely to believe false stories paired with real, unlabeled photos.
Participants used the presence or absence of a label as a shortcut for deciding whether a story was true, rather than thinking critically about the content.
Mislabeling, where real photos get flagged as AI or fake images go unlabeled, could erode public trust in the entire labeling system.

When lawmakers decided that slapping an “AI-Generated” label on synthetic images would help the public spot misinformation, it sounded like a reasonable fix. A new study suggests the solution has a serious blind spot, and it may actually be making people easier to deceive, not harder.

Fabricated AI images spread widely during the 2024 U.S. presidential election, and in 2023 a generated image of an explosion near the Pentagon briefly rattled the stock market. As tools like Midjourney and ChatGPT make it easier for anyone to produce convincing fakes, governments have been racing to require disclosure. The European Union now mandates that large platforms and generative-AI providers label AI-generated content. But a team of researchers publishing in the Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems found that labels do something lawmakers didn’t anticipate: they make people more trusting of everything that isn’t labeled, including false stories paired with ordinary, real photographs.

In a controlled survey of 1,354 U.S. and EU participants, researchers showed people simulated social media posts drawn from real fact-checking websites. Participants were randomly assigned to one of three groups: one that saw posts with all AI-generated images properly labeled, one that experienced a “mislabeling” scenario where some tags were wrong or missing, and a control group that saw no labels at all. Topics ranged from politics to celebrity news, and for each post, participants answered whether they believed the claim was true.

AI Labels Made People Trust Real Photos More

On the surface, the labels did part of their job. When AI-generated images were correctly labeled, participants were less likely to believe false claims attached to them. That’s the outcome lawmakers were hoping for.

But the study uncovered two unintended consequences that complicate that good news. First, participants who saw AI labels started using the presence or absence of a label as a mental shortcut for judging a story’s credibility. When a false claim came paired with a real, unlabeled photo, those same participants were more likely to believe it than people who had seen no labels at all. Labels inadvertently gave a free pass to old-fashioned photo-based lies.

Second, when true claims were illustrated with properly labeled AI-generated images, participants were more hesitant to believe them, even when the underlying facts were accurate. A label, regardless of whether the story was true or false, cast a shadow of doubt over the entire post. In other words, the labels weren’t helping people think more carefully about what was true. People were simply using them as a proxy for truth itself.

AI transparency infographic: Slapping ‘AI-Generated’ on fake images sounds like a fix. New research says it may actually make misinformation easier to spread. (Image by StudyFinds)

People Like the Idea, But They Don’t Fully Trust It

Focus group conversations, conducted separately with 18 adults from the U.S. and EU, revealed that people’s relationship with AI labels is tangled up in deeper anxieties about trust and power. Most participants initially warmed to the concept. As one put it: “I think they are great. So you don’t have to question yourself whether something is real or not. Especially if you’re not like very tech-savvy.”

That goodwill came loaded with reservations. A recurring worry was whether labels could be gamed by bad actors who simply strip tags before sharing. Others were uncomfortable with how much authority a labeling system would hand to platforms: “I don’t know if I necessarily trust a platform to do the right thing because I’ve heard of many instances where they’re like, oh we’re gonna try to do the right thing and […] they don’t.” Some participants also raised the question of who gets to decide which images even qualify for a label in the first place.

Mislabeling Could Undermine the Whole System

Very few participants knew that several major platforms already embed hidden technical markers in AI-generated files to track their origin. Most assumed detecting AI images meant running them through a separate checker. That gap in awareness made the concept of mislabeling, when researchers introduced it, land especially hard. One participant’s reaction ended up lending the paper its title: “That’s another doom I haven’t thought about until now. Unraveling the implication on history books or politics. Well, that’s [a] huge mess.”

Many participants treated unlabeled AI-generated images as the bigger danger, though the survey response was mixed: the largest share said both types of mistakes were equally bad. A mislabeled real photo can potentially be verified against other sources. An unlabeled AI image, though, could spread unchecked before anyone caught it, and by then the damage might already be done.

AI labels are not the clean, simple fix that regulations have implied. They reduce one specific type of deception, but appear to create a new vulnerability in the process. As AI image tools become cheaper and more accessible, relying on a disclosure tag alone, without accounting for the human tendency to treat labeled content as the only threat, may end up doing as much harm as good.

Disclaimer: The findings discussed in this article are based on a study using simulated social media posts and may not fully reflect how people behave on real platforms. AI labeling policies and technologies are rapidly evolving.

Paper Notes

Limitations

The study’s authors note several factors to consider when interpreting the results. The survey used simulated social media posts rather than real platform environments, which may not fully capture how people behave on actual sites where they can like, share, or comment. Posts were static, so none of the real-time social dynamics of online platforms were present. Participants came from the U.S. and EU only, so findings may not apply to other regions or cultural contexts. The survey was conducted in English, which may have disadvantaged some EU participants. Additionally, the study focused specifically on images, not video, audio, or text, so conclusions should not be broadly extended to all forms of AI-generated content. The label design used in the survey (simple text reading “AI-Generated” in the top-right corner of an image) represents just one of many possible formats, and different designs might produce different results.

Funding and Disclosures

The research was partially funded by VolkswagenStiftung Niedersächsisches Vorab, the Deutsche Forschungsgemeinschaft under Germany’s Excellence Strategy, the Daimler and Benz Foundation, and the German Federal Ministry of Education and Research. No conflicts of interest were identified in the paper.

Publication Details

Authors: Sandra Höltervennhoff, Jonas Ricker, Maike M. Raphael, Charlotte Schwedes, Rebecca Weil, Asja Fischer, Thorsten Holz, Lea Schönherr, and Sascha Fahl | Institutional Affiliations: CISPA Helmholtz Center for Information Security; Ruhr University Bochum; Leibniz University Hannover; Max Planck Institute for Security and Privacy | Paper Title: “That’s another doom I haven’t thought about”: A User Study on AI Labels as a Safeguard Against Image-Based Misinformation | Published In: Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI ’26), April 13–17, 2026, Barcelona, Spain | DOI: https://doi.org/10.1145/3772318.3791006