AI startups test peer review and spark ICLR backlash

At least three AI labs say they used AI to generate studies accepted to ICLR workshops. The dispute centers on consent, unpaid reviewer labor and whether peer-reviewed venues are being used as informal AI benchmarks.

AI-generated research submissions risk degrading peer review quality and exploiting human reviewers as informal evaluation labor.

A dispute around AI-generated research has put a familiar academic process under new pressure: peer review. At least three AI labs, Sakana, Intology and Autoscience, claim they used AI to generate studies that were accepted to workshops at ICLR, a long-running academic conference focused on AI.

The controversy is not only about whether AI can produce papers that pass through a workshop review process. It is also about who gets to decide when human reviewers become part of an AI experiment, and whether unpaid academic labor is being used to evaluate and promote startup technology.

What happened at ICLR

At conferences like ICLR, workshop organizers typically review studies for publication in the conference's workshop track. That review process became the center of the dispute after Sakana, Intology and Autoscience said AI-generated studies had been accepted to ICLR workshops.

According to an ICLR spokesperson cited by TechCrunch, Sakana told ICLR leaders before submitting its AI-generated papers and obtained consent from peer reviewers. Intology and Autoscience did not, the spokesperson confirmed.

That difference matters because the criticism from academics is focused less on the mere existence of AI-written submissions and more on disclosure and consent. If reviewers do not know that a paper is part of an AI system's test, they cannot choose whether to participate in that test.

Why academics objected

Several AI academics criticized Intology and Autoscience on social media, arguing that their submissions treated peer-reviewed venues as a way to evaluate AI systems. Prithviraj Ammanabrolu, an assistant computer science professor at UC San Diego, wrote in an X post: "All these AI scientist papers are using peer-reviewed venues as their human evals, but no one consented to providing this free labor."

His criticism points to a basic tension. Peer review is already time-consuming, labor-intensive and mostly volunteer work. A recent Nature survey found that 40% of academics spend two to four hours reviewing a single study.

The load is also growing in AI. Submissions to NeurIPS, the largest AI conference, rose to 17,491 in 2024, up 41% from 12,345 in 2023. Against that backdrop, reviewers may see undisclosed AI-generated submissions as another demand on a system that is already under strain.

Ashwinee Panda, a postdoctoral fellow at the University of Maryland, said in an X post that submitting AI-generated papers without giving workshop organizers the right to refuse them showed a "lack of respect for human reviewers' time." Panda added that Sakana had asked whether the organizers of the ICLR workshop Panda was running would be willing to participate in its experiment, and that they said no.

The publicity problem

The debate also involves how startups talk about the results afterward. Intology wrote in a post on X that its papers received "unanimously positive reviews" and said reviewers praised one AI-generated study's "clever idea[s]."

For critics, that framing turns peer review into a promotional asset. A workshop acceptance or positive review can be used to suggest that an AI system performed well, even if the people providing that evaluation did not knowingly agree to test the system.

The core concerns are straightforward:

  • Consent: reviewers and workshop organizers may not know they are part of an AI evaluation.
  • Labor: academic peer review is mostly volunteer work and can take hours for a single study.
  • Transparency: an AI-generated submission may be presented publicly as evidence of system quality.
  • Trust: undisclosed experiments can weaken confidence in the review process.

Academia was already dealing with AI-generated copy in research submissions. One analysis found that between 6.5% and 16.9% of papers submitted to AI conferences in 2023 likely contained synthetic text. But AI companies using peer review to benchmark and advertise their technology is a newer development.

Even supporters face quality questions

The issue is not limited to disclosure. Some researchers are skeptical that AI-generated papers are worth the review effort in the first place.

Sakana itself admitted that its AI made "embarrassing" citation errors. The company also said that only one of the three AI-generated papers it chose to submit would have met the bar for conference acceptance.

Sakana withdrew its ICLR paper before publication, saying it did so in the interest of transparency and respect for ICLR convention. That step separated its handling of the process from the undisclosed submissions criticized by academics, but it also showed that AI-generated research can still contain flaws serious enough to matter in a review setting.

What this debate suggests next

Alexander Doria, the co-founder of AI startup Pleias, argued that the wave of surreptitious synthetic ICLR submissions points to the need for a "regulated company/public agency" to run "high-quality" AI-generated study evaluations for a price.

Doria's proposed direction is based on compensation and clearer responsibility. In a series of posts on X, Doria said: "Evals [should be] done by researchers fully compensated for their time." He also wrote: "Academia is not there to outsource free [AI] evals."

The dispute around ICLR shows a practical problem for AI research culture. If companies want to test systems that produce academic studies, they need evaluation. But when that evaluation uses peer review, the process involves people, time and institutional trust.

The central question is therefore not whether AI-generated papers can enter academic spaces. It is whether they can do so with disclosure, consent and respect for the labor that makes peer review work.