Seven AI agents help Data2Story turn CSV files into articles

Researchers from Oxford and Stanford built Data2Story, a Claude Code skill that turns a CSV file into an interactive online article. Its strongest claim is not that it is always right, but that visible claims, charts, and interactive elements can be traced back to evidence.

Data journalism can demand weeks of work from a newsroom. Data2Story, built by researchers from Oxford and Stanford, aims to compress much of that process into an automated pipeline while keeping the result open to verification.

The system, formally called "Data Journalist Agent", is a Claude Code skill that turns a CSV file into a full interactive online article. Its output can include research context, statistics, graphics, and links between visible claims and the evidence behind them.

What Data2Story Produces

Data2Story starts with structured data and builds a readable article around it. The source describes the system as a predefined task set that Claude Code loads and runs on command, coordinating several specialized agent roles.

The article it creates is not only a written narrative. It can include charts, interactive elements, research context, and statistics. The key distinction is that each visible statement or asset is meant to be tied to supporting material, such as code, data sources, or external URLs.

The authors demo the system using the 2026 FIFA World Cup schedule, a dataset the source says has received little coverage so far. From that schedule and host city information, Data2Story creates a climate-focused article with an interactive map.

In that example, about four in ten matches are slated for locations that the players' union FIFPRO classifies as extremely high heat risk. The source says humidity, rather than air temperature, is the main driver. The authors also stress that these are typical climate conditions, not a forecast for the actual tournament.
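A figure like "about four in ten" is a share of matches grouped by the risk class of their host city. The sketch below shows that kind of calculation in Python. It is only an illustration: the rows are invented placeholders, not the real schedule or FIFPRO ratings, and the column names, risk labels, and the function name `share_at_risk` are assumptions rather than anything taken from Data2Story.

```python
import csv
import io

# Placeholder schedule, invented for illustration only; not the real fixture list.
SCHEDULE_CSV = """match_id,host_city
1,City A
2,City B
3,City A
4,City C
5,City B
"""

# Hypothetical risk label per host city (label names are assumed).
RISK_BY_CITY = {
    "City A": "extremely high",
    "City B": "moderate",
    "City C": "low",
}


def share_at_risk(schedule_csv: str, risk_by_city: dict[str, str], level: str) -> float:
    """Return the fraction of matches whose host city carries the given risk level."""
    rows = list(csv.DictReader(io.StringIO(schedule_csv)))
    if not rows:
        return 0.0
    flagged = sum(1 for row in rows if risk_by_city.get(row["host_city"]) == level)
    return flagged / len(rows)


share = share_at_risk(SCHEDULE_CSV, RISK_BY_CITY, "extremely high")
print(f"{share:.0%} of placeholder matches are in extremely-high-risk cities")
```

The point of computing the share in code, rather than stating it, is that the number can be rerun against the same file by anyone who doubts it.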

The Inspector Makes Claims Traceable

The most important feature is the "Inspector", a panel that exposes structured evidence for each sentence and asset. Every annotated sentence, chart, and interactive element gets its own index card.

Those cards can show the exact line of code and the data file behind a figure. For claims based on outside information, they can show an external URL. This changes the reader's relationship with the article: a disputed figure is not just something to trust or reject, but something that can be checked.

The researchers say 93 percent of all visible statements can be traced to their origin this way. They also make an important distinction: traceable does not automatically mean correct. It means a reader can inspect where the claim came from.

The source compares that with a 25 percent baseline for human-written articles, partly because journalists rarely publish analysis code. That gap points to a weakness in common newsroom practice and to Data2Story's central advantage: it builds verification into the article interface.
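One way to picture the Inspector's index cards is as small records that point back to code, data, or a URL. The Python sketch below is a guess at such a structure, not the project's actual schema: the class name `EvidenceCard`, its fields, and the file names are all hypothetical. It also shows how a traceability share could be computed from such cards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvidenceCard:
    """One Inspector-style card. All field names here are hypothetical."""
    element_id: str                   # sentence, chart, or interactive element
    text: str                         # the visible claim or asset label
    code_ref: Optional[str] = None    # e.g. "analysis.py:42" (made-up path)
    data_file: Optional[str] = None   # e.g. "matches.csv" (made-up name)
    url: Optional[str] = None         # external source for outside information

    def is_traceable(self) -> bool:
        """Traceable means an origin is recorded, not that the claim is correct."""
        return bool((self.code_ref and self.data_file) or self.url)


def traceable_share(cards: list[EvidenceCard]) -> float:
    """Fraction of cards that point back to some origin."""
    if not cards:
        return 0.0
    return sum(card.is_traceable() for card in cards) / len(cards)


# Invented example cards, one per kind of evidence plus one with none.
cards = [
    EvidenceCard("s1", "Computed figure", code_ref="analysis.py:42", data_file="matches.csv"),
    EvidenceCard("s2", "Outside context", url="https://example.org/source"),
    EvidenceCard("s3", "Unsupported remark"),
]
print(f"traceable: {traceable_share(cards):.0%}")
```

Note that `is_traceable` only checks whether an origin is recorded. It says nothing about whether the code or the linked page actually supports the claim, which mirrors the researchers' own caveat.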

How The Seven-Agent Workflow Works

Behind each Data2Story article is a chain of seven specialized agents, described by the team as a "virtual newsroom". Each role handles a separate part of the editorial process, from research to page construction.

  • The "Detective" runs web searches for context, because a table alone often cannot explain the story.
  • The "Analyst" runs code instead of guessing numbers.
  • The "Editor" decides which findings should shape the narrative.
  • The "Designer" chooses a suitable medium, such as a map for geography or an audio clip for music.
  • The "Programmer" builds the HTML page.
  • The "Auditor" checks the layout for errors.
  • The "Inspector" connects the article back to its sources.
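
The roles above can be sketched as a chain of functions that pass a shared state along. This Python sketch is a simplification under stated assumptions: the stand-in roles only log what each agent would do, the state layout and the helper `make_role` are invented, and the source does not say whether the real roles run strictly in this order.

```python
from typing import Callable

# Shared state handed from role to role; this layout is an assumption.
State = dict[str, list[str]]
Role = Callable[[State], State]


def make_role(name: str, task: str) -> Role:
    """Build a stand-in role that only records what the real agent would do."""
    def run(state: State) -> State:
        return {**state, "log": state["log"] + [f"{name}: {task}"]}
    return run


# Role names follow the list above; the task strings paraphrase it.
NEWSROOM: list[Role] = [
    make_role("Detective", "search the web for context"),
    make_role("Analyst", "run code to compute figures"),
    make_role("Editor", "pick findings for the narrative"),
    make_role("Designer", "choose a suitable medium"),
    make_role("Programmer", "build the HTML page"),
    make_role("Auditor", "check the layout for errors"),
    make_role("Inspector", "link the article back to its sources"),
]


def run_pipeline(csv_path: str) -> State:
    """Pass one shared state through every role in sequence."""
    state: State = {"input": [csv_path], "log": []}
    for role in NEWSROOM:
        state = role(state)
    return state


result = run_pipeline("matches.csv")  # made-up file name
print("\n".join(result["log"]))
```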

For the World Cup example, the "Detective" links host cities to FIFPRO heat risk ratings and Open-Meteo climate data. That illustrates how the system tries to move beyond a raw CSV file without leaving the evidence trail behind.

The base model is Claude Opus 4.7 running on Claude Code. For images, video, and audio, the system uses OpenRouter models including gpt-5.4-image-2, seedance-2.0, and lyria-3-pro-preview.

How Readers Rated The Output

The researchers tested Data2Story against human-written work by pairing 18 public datasets with matching human-written originals from three distinct sources: The Economist, The Pudding, and TidyTuesday.

They recruited 53 readers to rate both versions across five categories: visual design, narrative rhythm, data transparency, verifiability of claims, and insight gained. Data2Story won all five categories.

Its largest advantage was in transparency, with a +1.49 lead on a seven-point scale. Overall, 74 percent preferred the agent article, 25 percent preferred the human version, and 2 percent called it a draw. The shares sum to 101 percent, presumably because of rounding.

The results were not uniform across every type of article. Data2Story won clearly in data-heavy Economist briefings and TidyTuesday pieces. Against The Pudding reports, where design teams often spend weeks crafting presentation, the result was a statistical tie.

The comparison also shows that Data2Story does not simply reproduce human editorial judgment. Measured by how many statements from the human-written article also appear in the agent-generated one, Data2Story covers about half. In the other direction, only 35 percent of the agent's statements are found in the human text.

That means the system adds its own angles, but only partly captures the editorial core of the human articles. In short, it can generate a lot of useful material, but it does not automatically understand which human choices matter most.
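
The two overlap figures are directional: each divides the shared statements by a different total. The Python sketch below illustrates the idea with invented statement IDs. Exact set matching is a simplifying assumption, since deciding whether two real sentences make the same statement would need fuzzy or semantic comparison, and the source does not describe the matching method here.

```python
def coverage(source: set[str], target: set[str]) -> float:
    """Fraction of statements in `source` that also appear in `target`."""
    if not source:
        return 0.0
    return len(source & target) / len(source)


# Invented statement IDs; real statements would not match this cleanly.
human = {"h1", "h2", "shared1", "shared2"}
agent = {"a1", "a2", "a3", "a4", "shared1", "shared2"}

human_covered = coverage(human, agent)  # human statements found in the agent article
agent_covered = coverage(agent, human)  # agent statements found in the human text
print(f"human covered: {human_covered:.0%}, agent covered: {agent_covered:.0%}")
```

With the same overlap, the side with more statements gets the lower percentage. That is one way both reported numbers can hold at once: the agent article may simply contain more statements than the human one.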

Where Human Journalists Still Matter

The researchers identify three areas where human authors remain stronger: editorial perspective, creative design, and dense single graphics.

Editorial perspective is the clearest limitation. The source gives a Repair Cafe report as an example. The human article traces low repair rates to manufacturers of phones, cars, and tractors that deliberately block access to diagnostic tools and parts. That explanation is grounded in reporting, not just data. Data2Story can show what breaks, but the reason can remain hidden.

Creative design is another gap. A Pudding piece on stand-up comedy turns the full transcript of an Ali Wong show into a user interface, with a circle beside each line sized to the length of the laugh. For the same content, the agent embeds a static YouTube thumbnail.

Dense graphics can also lose force when broken apart. In the source's Economist example, a visualization on the space race combines government and commercial providers, success rates, and annotations in one image. The agent spreads the same data across several charts, so the main point gets lost.

That is why the authors frame Data2Story as a collaborator, not a replacement. Humans bring reporting and perspective. Agents handle computation, graphics, and machine-verifiable sourcing.

The tool may be most useful for niche datasets that newsrooms cannot cover because of limited capacity. One current limitation is that Data2Story runs on full autopilot. A human-in-the-loop version is left for future work. The site is live at data2story.github.io, and the code is on GitHub.

The larger issue is attribution. The source notes that a recent Peking University benchmark found leading models often provide the right answer in document analysis while citing the wrong sources, a problem the researchers call "attribution hallucination". Another study suggests AI search agents often confirm what they already know from training rather than truly researching.

Data2Story tries to address that problem by making the "Analyst" calculate figures with runnable code and by having the "Inspector" link statements to their sources. For data journalism, that may be the real future-facing idea: not automated articles alone, but automated articles that show their work.