AI search is being used as a shortcut to find and summarize news, but a study from Columbia Journalism Review’s Tow Center for Digital Journalism found a serious weakness at the center of that experience: attribution. When generative AI search tools were asked to identify the source details of real news excerpts, they often gave users the wrong answer.
The researchers tested eight AI-driven search tools with direct excerpts from real news articles. They asked each model to identify the article’s original headline, publisher, publication date, and URL. Across the test, the tools incorrectly cited sources in more than 60 percent of queries.
What the study tested
The test was focused on a basic task: whether AI search engines could connect a news excerpt to the correct original article. That matters because generative search tools do not just list links. They often answer in a confident, finished format that can make their responses feel authoritative.
Researchers Klaudia Jaźwińska and Aisvarya Chandrasekar reported that roughly 1 in 4 Americans now use AI models as alternatives to traditional search engines. If those tools cannot reliably identify a source when the prompt is built around source attribution, the problem is not limited to a narrow technical edge case.
In total, researchers ran 1,600 queries across the eight different generative search tools. The results varied by platform, but the broader pattern was consistent: AI search systems regularly failed to return correct citation information for news content.
Error rates varied widely
Some tools performed better than others, but none of the reported results removes the larger concern. Perplexity provided incorrect information in 37 percent of tested queries. ChatGPT Search incorrectly identified 67 percent (134 out of 200) of articles queried. Grok 3 had the highest reported error rate, at 94 percent.
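The reported figures can be checked against each other with simple arithmetic. The sketch below assumes the 1,600 queries were split evenly across the eight tools, which matches the 200-query denominator given for ChatGPT Search; the variable and function names are illustrative only.

```python
# Figures as reported in the study summary above.
TOTAL_QUERIES = 1600
TOOL_COUNT = 8


def percent(part: int, whole: int) -> int:
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


# Assumption: queries were divided evenly among the tools.
queries_per_tool = TOTAL_QUERIES // TOOL_COUNT

# ChatGPT Search: 134 incorrect answers out of its share of queries.
chatgpt_error_rate = percent(134, queries_per_tool)

print(queries_per_tool, chatgpt_error_rate)
```

Under that assumption, each tool received 200 queries, and 134 wrong answers works out to the 67 percent error rate cited for ChatGPT Search.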
The study also found a behavior that is especially difficult for readers to detect. Instead of declining to answer when reliable information was not available, the tools frequently produced plausible but incorrect or speculative responses. The report refers to this kind of behavior as confabulation.
That distinction matters. A search tool that says it cannot verify an answer creates a moment of caution. A tool that presents the wrong headline, publisher, date, or URL as if it were known can send readers away with a false sense of certainty.
Paid tools did not remove the risk
The study also looked at premium versions of AI search tools, and the results were not simply better because the tools were paid. Perplexity Pro ($20/month) and Grok 3’s premium service ($40/month) correctly answered a higher number of prompts, but they also confidently delivered incorrect responses more often than their free counterparts.
The issue was not only whether a model could sometimes find the right answer. It was whether the model knew when to stop. A tool that answers more often can still create a worse experience if it is reluctant to decline uncertain prompts.
For users, that makes AI search reliability harder to judge. A polished response, a paid plan, or a confident tone does not necessarily mean the citation is correct. The study’s findings point to a practical rule: source details from generative search still need verification, especially when the answer concerns news attribution.
Publisher control remains unsettled
The report also raised questions about publisher control. The CJR researchers found evidence suggesting some AI tools ignored Robots Exclusion Protocol settings, a widely accepted voluntary standard publishers use to ask web crawlers not to access specific content.
One example involved Perplexity’s free version. It correctly identified all 10 excerpts from paywalled National Geographic content, even though National Geographic explicitly disallowed Perplexity’s web crawlers.
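For readers unfamiliar with how such a request is expressed, the sketch below shows how a well-behaved crawler might consult a robots.txt file before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleAIBot` and `OtherBot` crawler names, and the URL are hypothetical and are not taken from any publisher's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked from the whole
# site, while every other crawler is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/premium/story"
print(is_allowed(ROBOTS_TXT, "ExampleAIBot", url))  # blocked crawler
print(is_allowed(ROBOTS_TXT, "OtherBot", url))      # any other crawler
```

The check is purely advisory: nothing in the protocol stops a crawler from skipping it, which is why the standard depends on voluntary compliance.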
Citations also did not always send users to the original publisher. In some cases, AI search tools directed users to syndicated versions of content on platforms like Yahoo News rather than the original publisher sites. The report says this happened even in cases where publishers had formal licensing agreements with AI companies.
Broken or fabricated URLs were another problem. More than half of citations from Google’s Gemini and Grok 3 led to fabricated or broken URLs that produced error pages. Of 200 citations tested from Grok 3, 154 (77 percent) resulted in broken links.
Why this matters for news
For publishers, the findings describe a difficult tradeoff. Blocking AI crawlers might mean losing attribution entirely. Allowing AI tools to access content can mean reuse without sending readers back to the publisher’s own website.
Mark Howard, chief operating officer at Time magazine, told CJR he was concerned about transparency and control over how Time’s content appears in AI-generated searches. He also saw room for improvement, saying, “Today is the worst that the product will ever be,” while pointing to investment and engineering work aimed at improving the tools.
Howard also warned readers not to assume free AI products are fully accurate: “If anybody as a consumer is right now believing that any of these free products are going to be 100 percent accurate, then shame on them.”
OpenAI and Microsoft provided statements to CJR acknowledging receipt of the findings but did not directly address the specific issues. OpenAI noted its promise to support publishers through summaries, quotes, clear links, and attribution. Microsoft stated it follows the Robots Exclusion Protocol and publisher directives.
The latest report builds on previous findings published by the Tow Center in November 2024, which identified similar accuracy problems in how ChatGPT handled news-related content. Taken together, the findings show that generative AI search still has a basic news problem: it must become better at saying where information came from, and better at admitting when it does not know.