Microsoft supercomputer claim reshapes NYT AI copyright fight

The New York Times wants to amend its copyright complaint against OpenAI and Microsoft after a Supreme Court ruling changed the standard for contributory infringement. Its revised argument focuses on Microsoft’s bespoke supercomputing system and alleged market harms from ChatGPT and Bing Chat outputs.

The New York Times is trying to sharpen its copyright case against OpenAI and Microsoft, with a revised complaint that puts Microsoft’s supercomputing role closer to the center of the dispute.

In a heavily redacted court filing Thursday, the NYT asked to amend its complaint after a Supreme Court ruling changed the legal standard for contributory infringement. The proposed update alleges that Microsoft did more than provide cloud infrastructure: it claims Microsoft actively encouraged OpenAI to steal NYT works by building a bespoke supercomputing system ranked among the most powerful in the world.

Why the complaint is changing now

The timing matters because the Supreme Court sided with Cox Communications in a case brought by Sony. Sony had argued that Cox, as an Internet service provider, contributed to music piracy. The court rejected that claim, and its ruling set a new standard for contributory infringement.

Under that standard, plaintiffs must prove that parties intentionally acted to induce illegal conduct. The NYT says its proposed amendment is meant to align its claim against Microsoft with that new standard.

Graham James, an NYT spokesperson, said, “Today, we asked the court for permission to file an amended complaint that further strengthens our case, clarifying our claim of contributory infringement against Microsoft based on new law and new evidence uncovered during discovery.”

The NYT also agreed to voluntarily dismiss two claims: contributory copyright infringement and trademark dilution against all defendants. That makes the revised complaint narrower in some places while intensifying the focus on Microsoft’s alleged role in enabling the disputed AI training.

Microsoft rejected that framing. A Microsoft spokesperson told Ars that the amended complaint is “a last-ditch effort by the plaintiff to save its claim from unfavorable precedent set in other recent rulings.”

The supercomputer allegation

In the original complaint, the NYT treated Microsoft’s supercomputing systems more like generic cloud computing services. The proposed amended complaint seeks to draw a sharper distinction, alleging that the system was tailor-made to help OpenAI infringe copyrights.

The NYT argues that Microsoft built the machine for the explicit purpose of training AI on copyrighted works without permission. It also alleges that NYT articles were more heavily weighted because Microsoft and OpenAI wanted models trained on high-quality journalism that could be confidently mimicked in outputs.

According to the NYT, Microsoft’s role was not passive. The newspaper alleges that by building an “unusually complex” machine, Microsoft helped select the works that were infringed and provided a way to seize copyrighted works without permission.

“Microsoft specifically designed it for the purpose of using essentially the whole Internet—curated to disproportionately feature Times Works—to train the most capable LLM in history,” the NYT alleged.

The complaint also connects that alleged training pipeline to Microsoft’s business gains. The NYT alleged that “Microsoft’s deployment of Times-trained LLMs throughout its product line helped boost its market capitalization by a trillion dollars in the past year alone.”

What the NYT says the models did

In 2023, the NYT became the first major publisher to sue OpenAI. Its broader claim is that ChatGPT was illegally trained on its articles and that Microsoft and OpenAI used those works to compete with NYT products.

The allegations cover several kinds of harm. The NYT says ChatGPT infringed copyright by outputting articles verbatim, positioned itself as a substitute for a NYT subscription, falsely attributed claims to NYT reporting, and affected Wirecutter writers by summarizing reviews in ways that allegedly reduced clicks on affiliate links.

Discovery appears central to the NYT’s confidence. The source describes outputs shared during discovery, including a huge chunk of users’ ChatGPT sessions, as some of the strongest evidence that OpenAI and Microsoft built tools that allegedly replaced the NYT by producing near-verbatim excerpts of copyrighted works.

The complaint describes two patterns:

  • Some users told ChatGPT they were trying to skirt paywalls and requested the “next paragraph.”
  • In other cases, “models simply spit out several paragraphs” without that kind of workaround.

The NYT also points to hallucinations as a reputational problem. It cited examples including Bing Chat citing fake quotes from Steve Forbes’ daughter Moira Forbes and ChatGPT fabricating an NYT article, never actually published, that supposedly linked non-Hodgkin’s lymphoma to consuming orange juice.

“Users who ask a search engine what The Times has written on a subject should be provided with neither an unauthorized copy nor an inaccurate forgery of a Times article, but a link to the article itself,” the NYT alleged.

The fair use fight remains unresolved

Microsoft and OpenAI are trying to persuade the court that training AI on NYT articles is fair use. OpenAI spokesperson Drew Pusateri reiterated the company’s often-repeated position that AI training on copyrighted works is indisputably fair use.

The NYT, however, appears to be leaning on evidence of substitution. The source notes that one early verdict finding AI training to be fair use turned on the plaintiffs’ failure to prove market harms. It also notes that last June, a federal judge laid out what he thought could be a winning argument against AI training on copyrighted works, suggesting that the fair use question is still unsettled.

OpenAI has argued that “ChatGPT is not a substitute for a Times subscription,” partly because “they transformed the material for a different use.” The NYT’s case depends in part on convincing the court that the use is not meaningfully different from the newspaper’s own market.

If the NYT succeeds, the consequences could be severe. The source says the most extreme outcome could require OpenAI and Microsoft to wipe models and start over. The NYT has also asked for permanent injunctive relief to prevent future infringement and extensive damages, arguing that “as a direct result of their conduct, Defendants have wrongfully profited from copyrighted works that they do not own.”