MIT Tech Review, AI, July 12, 2024

How GPT-4 helps writers while narrowing creative range

New research in Science Advances suggests GPT-4 can improve short-story writing for people who score lower on a creativity task. But the same AI assistance appears to make stories more similar to one another, raising questions about creativity at scale.

Generative AI can make creative work faster and easier, but new research suggests the benefit is uneven. In a study of short-story writing with GPT-4, AI support helped some writers produce stronger work while also making the overall set of stories less varied.

What the study tested

The research, published in Science Advances, examined how people used OpenAI’s large language model GPT-4 to write short stories. The goal was not simply to ask whether AI can produce text that appears creative. The researchers wanted to see whether access to the model changed human creativity.

To do that, the study used two measures: novelty and usefulness. Novelty referred to how original a story was. Usefulness reflected whether the story could potentially be developed into a book or another publishable work.

The researchers first recruited 293 people through Prolific. Each participant completed a task meant to measure inherent creativity: they had to provide 10 words that were as different from one another as possible.
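One way to picture how a list of words could be scored for being "as different as possible" is to average the distance between every pair of words. The sketch below is only an illustration under stated assumptions: the article does not describe the study's actual scoring method, and the names `spread_score` and `TOY_VECTORS`, along with the three-dimensional vectors, are hypothetical stand-ins for real word embeddings.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def spread_score(words, vectors):
    """Average pairwise cosine distance across a list of words.

    Higher values mean the words are, on average, further apart.
    """
    pairs = list(combinations(words, 2))
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return total / len(pairs)


# Hypothetical toy vectors, invented for illustration only.
# A real scorer would use embeddings learned from a large text corpus.
TOY_VECTORS = {
    "cat": (1.0, 0.0, 0.0),
    "dog": (0.9, 0.1, 0.0),
    "galaxy": (0.0, 1.0, 0.0),
    "justice": (0.0, 0.0, 1.0),
}

close_words = spread_score(["cat", "dog"], TOY_VECTORS)
varied_words = spread_score(["cat", "galaxy", "justice"], TOY_VECTORS)
print(f"close words:  {close_words:.3f}")
print(f"varied words: {varied_words:.3f}")
```

With these toy vectors, the list of unrelated words scores much higher than the pair of closely related ones, which is the intuition behind asking participants for words that differ as much as possible.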

After that, participants were asked to write an eight-sentence story for young adults. They wrote on one of three topics: an adventure in the jungle, on open seas, or on a different planet.

How GPT-4 was used

The participants were randomly divided into three groups. One group had to write only from their own ideas. A second group could receive a single story idea from GPT-4. A third group could receive up to five story ideas from the model.

Among the participants who had the option to use AI assistance, 88.4% chose to do so. That high uptake matters because it shows that most people offered help from the model were willing to bring it into the creative process.

After writing, participants rated how creative they thought their own stories were. A separate group of 600 recruits then reviewed the stories. Each reviewer saw six stories and evaluated them for stylistic characteristics, novelty, and usefulness.

The stories rated most creative came from writers who had the greatest access to GPT-4. But that result came with an important qualification: the largest gains were seen among writers who had scored lower on the initial creativity task.

The boost was not equal

The study found a leveling effect. Writers who appeared less creative at the start benefited most from GPT-4 ideas. Writers who were already creative did not see the same kind of improvement in their story quality.

Anil Doshi, an assistant professor at the UCL School of Management in the UK and a coauthor of the paper, described the result directly:

“We see this leveling effect where the least creative writers get the biggest benefit.”

He added that the study did not show the same benefit for people who were already inherently creative. That distinction is central to the findings. GPT-4 did not act as a universal creativity amplifier. It helped most where writers had more room to gain.

Tuhin Chakrabarty, a computer science researcher at Columbia University who specializes in AI and creativity but was not involved in the study, said the findings make sense, because people who are already creative do not need AI in the same way to be creative.

Why similar stories are a problem

The research also found a trade-off. Stories written with AI assistance were more similar to one another than stories produced entirely by humans.

That matters because creativity is not only about making one person’s output better. It is also about the range of ideas produced across a group. If many writers lean on the same model for ideas, the individual story may improve while the collective body of work becomes less distinctive.
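The difference between individual quality and group-level variety can be made concrete with a simple similarity measure. The sketch below is an assumption for illustration, not the method used in the study, which the article does not describe: it averages the Jaccard word overlap between every pair of stories in a set. The function names and the example sentences are invented, not taken from participants.

```python
from itertools import combinations


def word_set(story):
    """Lowercase a story and return its set of words, stripped of punctuation."""
    words = {w.strip(".,;:!?\"'").lower() for w in story.split()}
    return words - {""}


def jaccard(a, b):
    """Share of words two sets have in common, out of all words in either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_pairwise_similarity(stories):
    """Average Jaccard similarity over every pair of stories in the set."""
    sets = [word_set(s) for s in stories]
    pairs = list(combinations(sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Invented example sentences, not data from the study.
overlapping = [
    "The brave explorer found a hidden temple in the jungle.",
    "The brave explorer found a hidden cave in the jungle.",
]
distinct = [
    "A storm tore the sail as we left the harbor.",
    "My sister taught the parrots to count on Mars.",
]

print(f"overlapping set: {mean_pairwise_similarity(overlapping):.2f}")
print(f"distinct set:    {mean_pairwise_similarity(distinct):.2f}")
```

A higher average means the set as a whole is less varied, even if each story reads well on its own. Word overlap is a crude proxy; a measure based on meaning rather than shared vocabulary would capture similarity in content more faithfully.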

Chakrabarty noted that AI-generated stories showed similarity in semantics and content. He also pointed to recognizable patterns in AI-generated writing, including very long, exposition-heavy sentences and the use of many stereotypes.

In his view, these traits can weaken overall creativity. He summarized the issue this way:

“Good writing is all about showing, not telling. AI is always telling.”

The study’s broader concern is that AI models draw from the data they were trained on. In the short-story task, that meant AI-assisted stories were less distinctive than stories created entirely by human participants.

What this means for creative work

The findings point to a practical tension for writers, publishers, and anyone using generative AI for creative production. GPT-4 can be useful, especially when a person is stuck or has fewer initial ideas. It can provide a starting point and help lift weaker drafts.

But if many people use the same kind of system in the same way, creative work may begin to converge. The source of assistance becomes shared, and the resulting stories can begin to resemble one another more than work produced without that assistance.

For the publishing industry, the study raises a clear possibility. If generative AI were widely embraced, books could become more homogeneous because they would be produced with models trained on the same corpus.

That does not mean AI has no place in creative work. The research suggests a more specific conclusion: AI can help individuals, but it may reduce variety when viewed across the whole group.

Oliver Hauser, a professor at the University of Exeter Business School and another coauthor of the study, said the point is to understand both the strengths and limits of the technology. As he put it:

“Just because technology can be transformative, it doesn’t mean it will be.”

The lesson is not that generative AI makes creativity disappear. It is that creative gains depend on who is using it, how they use it, and whether the broader result becomes richer or more uniform.