The Decoder, May 3, 2025

Judge Presses Meta on Fair Use for AI Book Training

A federal judge in San Francisco questioned Meta's argument that training Llama on copyrighted books can be defended as fair use. Judge Vince Chhabria said AI training may be transformative, but warned that the resulting technology could threaten the market for original works.

A federal judge in San Francisco is putting pressure on Meta's fair use defense in a copyright case over AI training. The dispute focuses on whether copyrighted books can be used to train AI models without permission from authors.

The case centers on Meta's Llama model and works including those by Sarah Silverman. Meta says its use of the material falls under "fair use," while the plaintiffs argue that the use is copyright infringement.

What the Court Is Questioning

At the center of the hearing is a basic but far-reaching question: can an AI company use copyrighted books to build a model without first getting approval from the people who wrote those books?

Meta's position is that the training of Llama should be treated as fair use. The plaintiffs take the opposite view. They say using copyrighted books in this way is not a permitted use of their work, but an infringement of their rights.

Judge Vince Chhabria did not reject the idea that AI training can be transformative. He acknowledged that using copyrighted data to train AI could be considered transformative. But he also made clear that a transformative use does not automatically settle the fair use question.

Why Transformation May Not Settle It

The judge's concern appears to be that the analysis cannot stop at how the material is used inside the AI system. The next question is what the trained technology can do once it exists.

That distinction matters because Meta's Llama model is not described as a static archive or a private research tool. It is a model trained on material that includes copyrighted works. The plaintiffs' argument is that this training creates a product tied to their protected writing without permission.

Judge Chhabria's comments suggest that even if the training process changes the form or function of the original material, the effect on authors still matters. In his view, the possibility of transformation does not erase the possibility of market harm.

The Market Harm Concern

The strongest concern raised in the hearing was about competition with the original works themselves. Judge Chhabria pointed to the possibility that AI systems trained on copyrighted books could weaken the market for those books or for similar works by the same authors.

He described the issue in unusually direct terms:

"You have companies using copyright-protected material to create a product that is capable of producing an infinite number of competing products. You are dramatically changing, you might even say obliterating, the market for that person's work, and you're saying that you don't even have to pay a license to that person."

The quote captures the central tension in the case. Meta argues that training Llama on the material is fair use. The plaintiffs argue that the same process uses protected books to build a product that can compete with the market for original writing.

For authors, the concern is not only that their books were used. It is also that the resulting AI technology could produce outputs that affect demand for human-created work. The judge's comments focus on that downstream effect, not only on the training step itself.

What This Means for AI Training

The hearing did not resolve the broader question. But it shows that courts are closely examining the fair use arguments that AI companies raise when they train models on copyrighted books.

For Meta, the key issue is whether the use of copyrighted works in Llama training can be defended as fair use. For the plaintiffs, the key issue is whether the use of their writing without permission should be treated as copyright infringement.

The judge's remarks also show why the word "transformative" is not the end of the debate. A model can be built through a process that changes how material is used, while still raising questions about whether that new product damages the market for the original work.

That is why the market question is so important in this dispute. If an AI model trained on copyrighted books can generate large amounts of competing material, the court must consider whether the authors' market is being changed in a way that fair use should not protect.

The Core Question Ahead

The dispute over Meta's Llama model is ultimately about permission, value, and control. The plaintiffs say authors should not have their copyrighted books used in AI training without authorization. Meta says the use is protected by fair use.

Judge Vince Chhabria's comments do not provide a final answer. They do, however, make clear that the court is skeptical of a simple defense that says AI training is transformative and therefore fair.

The case remains focused on a question that is central to the future of AI training: when copyrighted books help create a model capable of producing competing material, does fair use still apply?