WIRED AI August 21, 2024

AI Scientist Pushes Machine Learning Toward Self-Run Research

A University of British Columbia lab has built an AI scientist that can propose machine learning experiments, choose promising ideas with help from an LLM, then write and run code. Its early papers are not breakthroughs, but the project points toward open-ended AI systems that learn by exploring new ideas.

A research system from a University of British Columbia lab is testing a provocative idea: what happens when an AI program does more than answer questions, and starts proposing, coding, and running experiments of its own?

The work, developed at the UBC lab with researchers from the University of Oxford and Sakana AI, has produced papers on machine learning techniques. The results are modest. The larger question is not whether the first batch is brilliant, but whether an AI scientist can become a useful engine for open-ended discovery.

What the AI scientist actually does

The system is designed to generate machine learning experiments, judge which ones seem worth pursuing with help from a large language model, then write and run the code needed to test them. After that, it repeats the process.
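The cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's implementation: the names `research_loop`, `toy_propose`, `toy_judge`, and `toy_run` are hypothetical, and the toy functions stand in for what would really be LLM calls and the execution of generated code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    idea: str
    score: float   # the judge's rating of how promising the idea looked
    metric: float  # the outcome of actually running the experiment


def research_loop(
    propose: Callable[[List[Result]], List[str]],
    judge: Callable[[str], float],
    run: Callable[[str], float],
    rounds: int = 3,
    keep: int = 1,
) -> List[Result]:
    """Propose ideas, keep the top-rated ones, run them, then repeat."""
    history: List[Result] = []
    for _ in range(rounds):
        ideas = propose(history)  # new ideas may build on earlier results
        scored = [(judge(idea), idea) for idea in ideas]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for score, idea in scored[:keep]:
            history.append(Result(idea, score, run(idea)))
    return history


# Toy stand-ins (assumptions for illustration only).
def toy_propose(history: List[Result]) -> List[str]:
    n = len(history)
    return [f"idea-{n}-a", f"idea-{n}-bb", f"idea-{n}-ccc"]


def toy_judge(idea: str) -> float:
    return float(len(idea))  # pretend longer ideas look more promising


def toy_run(idea: str) -> float:
    return len(idea) / 10.0  # pretend experiment outcome


history = research_loop(toy_propose, toy_judge, toy_run, rounds=3, keep=1)
```

Passing the judge in as a function mirrors the article's point that the selection step is where a large language model comes in. Everything else in the loop stays the same whichever judge is used.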

The papers it produced focus on technical improvements rather than headline-grabbing discoveries. Several describe changes meant to improve diffusion modeling, an image-generating technique. Another explores a way to speed up learning in deep neural networks.

Jeff Clune, the professor who leads the UBC lab, is direct about the current level of originality.

“These are not breakthrough ideas. They’re not wildly creative,” he says. “But they seem like pretty cool ideas that somebody might try.”

That framing matters. The project is not being presented as proof that AI has already become an independent scientific genius. It is an early demonstration of a loop that could matter: generate ideas, evaluate them, run experiments, and build from the results.

Why open-ended learning matters

Current AI systems depend heavily on human-created training data. They can be powerful, but their abilities are shaped by what people have already written, labeled, coded, or demonstrated.

Open-ended learning aims at something different. Instead of only absorbing existing examples, an AI system tries ideas, preserves the ones that seem interesting, and iterates. If that approach becomes more capable, AI programs may learn in ways that are not limited to direct imitation of human-generated material.

Clune’s lab had already explored this direction before the AI scientist. One earlier program, Omni, tried to create behavior for virtual characters across video-game-like environments. It kept behaviors that seemed interesting and then developed new designs from them.

Those earlier systems needed hand-coded instructions to define what counted as interesting. Large language models change that setup because they can mimic human reasoning well enough to help rank or select intriguing possibilities. Another recent project from Clune’s lab used this approach to let AI programs create code for virtual characters in a Roblox-like world.
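To make the difference concrete, here is a toy sketch of an archive that keeps only "interesting" variants, with the judge passed in as a function. It is not the code behind Omni or the Roblox-like project. The names `grow_archive`, `toy_mutate`, and `hand_coded_judge` are hypothetical, and behaviors are reduced to plain integers so the example stays runnable.

```python
from typing import Callable, List

Judge = Callable[[int, List[int]], bool]


def grow_archive(
    seed: int,
    mutate: Callable[[int, int], int],
    is_interesting: Judge,
    steps: int,
) -> List[int]:
    """Vary an archived item and keep the variant only if it is judged interesting."""
    archive = [seed]
    for step in range(steps):
        parent = archive[step % len(archive)]  # cycle through what has been kept
        child = mutate(parent, step)
        if is_interesting(child, archive):
            archive.append(child)
    return archive


def toy_mutate(parent: int, step: int) -> int:
    return parent + step + 1


def hand_coded_judge(child: int, archive: List[int]) -> bool:
    # Hand-written rule: "interesting" means at least 2 away from everything kept.
    return all(abs(child - kept) >= 2 for kept in archive)


archive = grow_archive(seed=0, mutate=toy_mutate, is_interesting=hand_coded_judge, steps=5)
```

In this sketch, the hand-written rule is the part an LLM-based judge would replace. A function that asks a model whether a candidate seems intriguing could be passed as `is_interesting` without changing the loop.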

For Clune, LLMs have opened a broad space of experiments.

“It feels like exploring a new continent or a new planet,” he says. “We don't know what we're going to discover, but everywhere we turn, there's something new.”

The limits are still obvious

The strongest case for the AI scientist is also the reason to be cautious. A system that can invent and run experiments sounds powerful, but the first results are derivative and the components are not yet dependable.

Tom Hope, an assistant professor at the Hebrew University of Jerusalem and a research scientist at the Allen Institute for AI (AI2), says the system, like LLMs, appears highly derivative and cannot be treated as reliable. His assessment is blunt: “None of the components are trustworthy right now,” he says.

Hope also places the project in a longer history. Attempts to automate parts of scientific discovery go back decades, including work by AI pioneers Allen Newell and Herbert Simon in the 1970s, and later work by Pat Langley at the Institute for the Study of Learning and Expertise.

More recently, other research groups, including a team at AI2, have used LLMs to help generate hypotheses, write papers, and review research. Hope says the UBC team has captured the current moment around this idea.

“They captured the zeitgeist,” he says. “The direction is, of course, incredibly valuable, potentially.”

The unresolved issue is whether systems built around LLMs can ever produce genuinely novel or breakthrough ideas. “That’s the trillion-dollar question,” Clune says, and the answer is still unknown.

From AI research to AI agents

Even if AI scientist systems do not immediately deliver major discoveries, open-ended learning could still become important for making AI programs more useful in everyday computer tasks.

A report posted this month by Air Street Capital highlights Clune’s work as relevant to the development of stronger and more reliable AI agents. In this context, agents are programs that autonomously perform useful tasks on computers. Big AI companies appear to view agents as the next major direction.

This week, Clune’s lab revealed another open-ended learning project: an AI program that invents and builds AI agents. The agents designed by AI outperform human-designed agents in some tasks, including math and reading comprehension.

That progress introduces a safety problem. The next step is finding ways to prevent such a system from creating agents that misbehave. Clune acknowledges the stakes: “It's potentially dangerous,” he says. “We need to get it right, but I think it's possible.”

What to watch next

The AI scientist is not a finished scientific collaborator. Its ideas are not described as breakthroughs, and critics are clear that its parts are not yet reliable. But it shows a path that researchers are now actively exploring.

The key shift is from AI as a tool that responds to human prompts toward AI as a system that can search through possible experiments. That search may remain incremental. Clune suggests it may also become more capable as more computing power is applied.

For now, the significance is practical rather than magical. The UBC project shows how an AI scientist can connect idea generation, experiment selection, code writing, and execution into one cycle. Whether that cycle eventually produces deeper scientific advances remains open.