How two books let AI author mimicry challenge copyright

A study found that fine-tuned AI systems can produce passages in the style of famous authors using limited training material. Readers, including writing experts, often preferred the fine-tuned AI output, which raises questions for copyright law and the market for literary imitation.

A new study puts a sharp point on a growing problem for publishing, AI companies and courts: AI author mimicry may not require a huge archive of an author’s work. According to the research described in the source article, fine-tuning on just two books was enough for AI systems to generate passages that readers preferred over work by professional imitators.

The findings matter because they move the debate beyond whether AI can produce generic prose. The issue is whether targeted training can reproduce a recognizable literary style well enough to compete with human-written imitations and, potentially, with the market around original authors.

What the researchers tested

Researchers at Stony Brook University and Columbia Law School compared human and AI writing in the style of 50 well-known authors. The group included Nobel Prize winner Han Kang and Booker Prize winner Salman Rushdie.

The study involved professional writers and three major AI systems. For in-context prompting, the researchers used GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro with the same instructions and sample texts. For fine-tuning, only GPT-4o had the required API features, so the team bought digital copies of available books by 30 of the authors and trained a separate model for each writer.

A total of 159 participants judged the passages. That group included 28 writing experts and 131 non-experts from the crowdsourcing platform Prolific. Participants did not know whether the passages they were judging came from a human writer or an AI system.

The evaluation was built around side-by-side comparisons. Readers chose which passage they thought was better. When the task involved style, they also saw an excerpt from the original author, giving them a reference point for judging whether the new passage matched the author’s voice.
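A side-by-side design like this reduces to counting which source wins each comparison. The sketch below shows one way such votes could be tallied into preference rates and odds. It is not the researchers' analysis code: the function names `preference_rates` and `odds` are illustrative, and the vote list is hypothetical, not data from the study.

```python
from collections import Counter


def preference_rates(votes):
    """Return the share of comparisons won by each source.

    votes: iterable of labels, one per side-by-side comparison,
    naming the source of the passage the reader preferred.
    """
    counts = Counter(votes)
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()}


def odds(rate):
    """Convert a preference rate into odds (wins per loss)."""
    return rate / (1 - rate)


# Hypothetical votes for illustration only, NOT data from the study.
votes = ["ai"] * 6 + ["human"] * 4
rates = preference_rates(votes)

print(rates)
print(odds(rates["ai"]))
```

Statements such as "chosen X times more often" are usually comparisons of odds like these between two conditions, for example prompting versus fine-tuning.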

Fine-tuning changed the outcome

The results split sharply by method. With basic in-context prompting, experts strongly favored the human-written passages. Non-experts were less decisive, but the expert response showed that standard prompting still left obvious weaknesses for trained readers.

Fine-tuning produced a very different result. After fine-tuning, experts chose the AI-written passages eight times more often for style and twice as often for writing quality. That is the central finding: targeted training made the AI output convincing enough to beat professional human imitation in key comparisons.

The study also tested whether AI detectors could identify the output. Modern AI detectors flagged 97 percent of standard AI outputs as machine-generated. For the fine-tuned outputs, they flagged only 3 percent. That suggests the same process that improved the writing also made the output much harder for detection tools to recognize as AI-generated.

The source article says the researchers found a reason for the improvement. Generic AI outputs are often cliché-filled and unnaturally polite. Targeted training mostly fixed those problems, making the writing more persuasive to readers and experts.

Two books were enough

One of the most striking points is that more source material did not appear to make the difference. Authors with just two published books, like Tony Tulathimutte, were mimicked just as well as prolific writers like Haruki Murakami.

That finding changes the practical stakes. If two books can be enough, then targeted author imitation is not limited to writers with large bodies of work. The source article presents this as a challenge to traditional ideas about literary originality because the system does not need to copy lines word for word to compete on style.

The study also found that experts and non-experts reacted differently at first, then converged after fine-tuning. Experts strongly agreed in their negative reviews of basic AI texts, while non-experts were more forgiving. After fine-tuning, both groups’ ratings lined up, suggesting the improved AI quality convinced even the professionals.

Cost is another pressure point. Training AI on an author’s style cost about $81 per writer. A professional would charge $25,000 for the same amount of text, which makes the AI route roughly 99.7 percent cheaper, even if the AI output still needs some editing. That gap explains why the issue is not only technical but also economic.
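The 99.7 percent figure follows directly from the two prices quoted above. A minimal check, using only the reported numbers (the variable names are illustrative):

```python
# Figures reported in the article, in US dollars per author.
ai_cost = 81
human_cost = 25_000

# Fraction of the human price saved by the fine-tuning route.
reduction = 1 - ai_cost / human_cost

# How many times more expensive the human route is.
ratio = human_cost / ai_cost

print(f"Cost reduction: {reduction:.1%}")
print(f"Human-to-AI cost ratio: about {ratio:.0f}x")
```

This counts only the quoted training cost and leaves out any editing the AI output might still need.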

Why copyright law is now central

The findings arrive while US courts are considering lawsuits over how AI companies acquire and use copyrighted material. In one case against Anthropic, it emerged that the company downloaded at least seven million books from pirate sources like LibGen and Pirate Library Mirror, and also bought print books, scanned them, and discarded the originals.

The study’s authors say their work could become part of the fair use debate. The central question is whether AI imitations harm the market for original works. If readers prefer AI-written imitations, the researchers argue that could be evidence of market harm.

The US Copyright Office has already warned that AI could push original works out of the market, even when it does not copy them word for word. The study adds a concrete mechanism for that concern: a model trained to imitate a specific author could generate substitute texts that readers find compelling.

The researchers suggest drawing a line between general-purpose AI models and systems trained to imitate specific authors. They argue there is little legal basis for targeted imitation and recommend either banning AI from copying individual authors or requiring clear labels for AI-generated texts.

The bigger signal for publishing

The study does not say every AI imitation is equally strong, and the results depended on how the AI was used. Basic prompting was much less convincing to experts. Fine-tuning was the step that shifted the balance.

That distinction matters for publishers, writers and platforms. A broad AI writing tool and a model tuned to reproduce a specific author’s style may raise different risks. The source article frames this as a reason to treat targeted imitation as its own category rather than folding it into the general debate about AI writing.

For readers, the immediate issue is transparency. If fine-tuned outputs can evade detectors and persuade experts, labels may become more important than technical detection alone. For courts, the question is whether style-based imitation can harm a market even without direct copying.

The study’s core message is simple: AI author mimicry is becoming cheaper, more convincing and harder to spot when models are trained on specific writers. That makes literary style a much more contested part of the AI copyright debate.