Microsoft’s proposal to license books from HarperCollins authors puts a concrete price on a question the AI industry has avoided for years: what is a copyrighted book worth when it is used to train an AI system?
The offer is simple on paper. Microsoft would pay $5,000 per book for AI training rights. HarperCollins authors would receive half of that, or $2,500 per book, according to the publisher.
A new price for AI training rights
The proposed agreement gives Microsoft a three-year training license. Authors can accept or decline, which makes the deal different from the more disputed practice of using copyrighted work without direct permission or payment.
Alice Robb, who covered the story for Bloomberg, received the same offer from HarperCollins for her 2018 book Why We Dream. Her case shows why the decision is not just a legal or technical matter. It is also personal, financial, and uncertain.
Robb had one week to decide. The contract had no precedent and no room for negotiation. That left her weighing a definite payment against unclear future consequences, in a market where no one appears to know the right price for this kind of license.
She ultimately accepted the offer, though she remained unsure whether it was the correct choice. Her uncertainty captures the central tension for many writers: refusing the money may not protect a book from AI training, while accepting it may help set a price that feels too low.
Why authors face a difficult choice
The money matters because many authors do not earn large or stable incomes from writing. The Authors Guild reports that full-time authors’ median annual income is just $20,000. In the UK, professional writers earn a median of £7,000 (about €8,400) yearly.
Against that backdrop, $2,500 per book is not an abstract number. For some authors, it may look like meaningful income, especially for older books that are no longer producing steady royalties.
But the same number can also look small when measured against the possible value of AI systems trained on large collections of human writing. That is the core conflict: a single book may be one input among many, but it is still the result of years of creative labor, editing, research, and publishing infrastructure.
For authors, the decision may include several competing questions:
- Does the payment fairly reflect the value of the book?
- Could accepting the deal help establish a market rate for future AI training licenses?
- Does declining the offer provide any practical protection if the work has already been used?
- Should older backlist titles be treated differently from newer or more commercially active books?
Robb’s own situation sharpens that last point. She noted that her eight-year-old book has already been used to train AI systems without permission, likely by Microsoft or OpenAI. That makes the offer feel less like a clean first negotiation and more like a late attempt to formalize rights after the fact.
Backlist books may shape the market
Brown University economist Emily Oster sees the strategy as deliberate. In her view, Microsoft is trying to establish that the rights to train on books are worth $5,000, and backlist authors are a practical place to begin.
That matters because backlist books may be less likely to bring in large current royalties than recent bestsellers. If authors with older titles accept the offer, the industry may start to treat that price as a reference point, even though the value of books can vary widely.
This is why the HarperCollins proposal is bigger than one publisher or one group of authors. It may become part of the early pricing logic for AI training rights. Once a number enters the market, it can influence later deals, expectations, and negotiations.
For publishers, the offer also shows that copyrighted books are being discussed as licensable training material rather than just raw data. That shift could create a clearer business path, but it could also create pressure on authors to accept standardized terms before the market is mature.
The wider fight over fair use
Microsoft is seeking licenses in this case, but other AI companies claim that "fair use" allows them to train AI on copyrighted works without payment. Their argument is that transforming existing works into new products is a use copyright law already permits, so no license is needed.
Authors, publishers, and artists disagree. That disagreement has already led to multiple lawsuits. The HarperCollins offer sits inside that broader conflict: if licensing becomes more common, it may weaken the idea that permission is unnecessary.
The source article also points to Meta as an example of a more aggressive approach to training data. Court documents revealed that, despite internal warnings, the company deliberately used piracy networks to download copyrighted books for AI training and systematically removed copyright notices.
That contrast is important. Microsoft’s offer does not resolve the legal or ethical debate, but it does suggest a different route: pay for access, define the term, and let authors choose. Whether the payment is fair remains contested, but the act of offering a license changes the conversation.
What this signals for AI and publishing
Microsoft’s and OpenAI’s move toward licensing suggests that the big AI labs may be backing away from their stance that using copyrighted content without permission is legal, and moving toward asking for permission and paying for it. Some AI labs are even buying second-tier video content from YouTube creators.
The important point is not only that money is changing hands. It is that creative work is being treated as something that may require permission, contracts, and compensation when used for AI training.
For authors, the immediate question is whether $2,500 is worth accepting for a three-year training license. For publishers and AI companies, the bigger question is whether this becomes a template for future deals.
The answer is still unclear. What is clear is that AI training rights are moving from a legal abstraction into a marketplace, and Microsoft’s $5,000-per-book offer gives that marketplace one of its first visible price tags.