The Decoder, January 25, 2025

Indian publishers take OpenAI copyright fight to Delhi

The Federation of Indian Publishers has filed a lawsuit against OpenAI in the Delhi High Court. The case centers on claims that copyrighted books and materials were used to train AI models without permission or licensing fees.

The story mainly concerns AI systems potentially substituting for copyrighted books through detailed summaries, with only mild broader societal risk.

Indian publishers have brought a new copyright challenge against OpenAI, adding another front to the global dispute over how AI companies use creative and editorial work. The Federation of Indian Publishers, which represents major publishing houses including Bloomsbury, Penguin Random House, Cambridge University Press, and Pan Macmillan, has filed a lawsuit in the Delhi High Court, Reuters reports.

The case focuses on a question now facing courts in more than one jurisdiction: whether AI systems can be trained on copyrighted books and materials without permission from the people and companies that own those rights.

What the Indian publishers allege

At the center of the lawsuit is the claim that OpenAI used copyrighted books and materials to train its AI models without first getting approval or paying licensing fees. The Federation of Indian Publishers is not only objecting to the alleged training use. It is also asking for concrete remedies.

The publishers want OpenAI to stop accessing their content entirely or begin paying proper licensing fees. If neither of those paths is taken, they want training datasets containing their copyrighted materials to be deleted.

That demand matters because it goes beyond a complaint about past conduct. It asks the court to address whether copyrighted material can remain inside training datasets if the rights holders say it was used without permission.

Why ChatGPT summaries are part of the dispute

The publishers are particularly concerned about ChatGPT's ability to generate detailed book summaries. According to the source article, testing showed that the AI can produce chapter-by-chapter breakdowns of copyrighted works, although it refuses to provide complete text.

For publishers, that distinction may not settle the issue. A full copy of a book and a detailed summary are different outputs, but the concern is whether users will rely on the AI-generated version instead of buying or reading the original work.

Pranav Gupta, the federation's general secretary, described that concern directly:

"This free tool produces book summaries, extracts, why would people buy books then? This will impact our sales, all members are concerned about this."

That statement captures the commercial risk as publishers see it. Their argument is not only that protected material may have been used in training. It is also that AI-generated summaries and extracts could affect demand for the books themselves.

What the publishers are asking OpenAI to do

The lawsuit presents OpenAI with a set of requested outcomes, each tied to control over copyrighted content. Based on the source article, the publishers are seeking one of two alternatives, with a third remedy as a fallback:

OpenAI stops accessing the publishers' content entirely.
OpenAI begins paying proper licensing fees.
Training datasets containing their copyrighted materials are deleted if neither alternative is pursued.

Each option reflects a different way of resolving the same underlying dispute. Stopping access would place a firm boundary around the publishers' materials. Licensing would turn the disputed use into a paid arrangement. Dataset deletion would address the publishers' concern that their content remains embedded in training materials.

The case also shows how the fight over AI training is shifting from broad public debate into specific legal demands. Publishers are not simply asking whether AI companies should respect copyright in principle. They are asking courts to decide what should happen when copyrighted works are allegedly used to build AI systems.

OpenAI's position

OpenAI denies wrongdoing. According to the source article, the company argues that its AI systems make fair use of publicly available data.

That defense is now part of a wider legal conflict. The source article notes that this argument will likely be tested in courts across multiple jurisdictions in the coming months.

The phrase "publicly available data" is central to the company's position as described in the source. But the publishers' complaint shows the opposing view: that availability does not automatically mean permission, and that copyrighted books and materials should not be used without approval or payment.

Part of a broader wave of AI copyright cases

The Indian lawsuit is not happening in isolation. The source article places it within a growing global pushback against AI companies' use of copyrighted works.

Authors, news organizations, and musicians have launched similar legal challenges. Canadian publishers have also recently filed their own case. Together, those actions point to a broader confrontation between content owners and AI companies over training data, licensing, and the economic value of creative work.

For the publishing industry, the Delhi High Court case is about more than one tool or one company. It raises practical questions about whether AI-generated summaries can substitute for books, whether training datasets should include copyrighted materials without permission, and whether licensing should become part of the AI development process.

For OpenAI, the case adds another venue where its fair use argument may be tested. For publishers, it is an attempt to assert control over how their books and materials are accessed, used, and potentially reproduced through AI systems.