TechCrunch AI December 14, 2024 IDIOCRACY

Death of OpenAI whistleblower puts AI copyright fight in focus

Suchir Balaji, a former OpenAI researcher who publicly questioned the company’s copyright practices, was found dead in his San Francisco apartment. Authorities determined the manner of death to be suicide and said no evidence of foul play was found in the initial investigation.

WTF Index: IDIOCRACY (Terminator 0, Idiocracy 1)

The story centers on copyright and training-data ethics rather than autonomous AI danger, with only a mild lean toward AI degrading the internet and creative ecosystem.

The death of Suchir Balaji, a former OpenAI employee who had recently spoken publicly about copyright concerns in generative AI, has brought renewed attention to the legal and ethical fight around training data.

Balaji, 26, was found dead in his San Francisco apartment. The San Francisco Office of the Chief Medical Examiner identified him and determined the manner of death to be suicide.

What authorities said

The San Francisco Office of the Chief Medical Examiner confirmed Balaji’s identity and the manner of death in a statement provided to TechCrunch by a spokesperson.

“The Office of the Chief Medical Examiner (OCME) has identified the decedent as Suchir Balaji, 26, of San Francisco. The manner of death has been determined to be suicide,” said a spokesperson in a statement to TechCrunch. “The OCME has notified the next-of-kin and has no further comment or reports for publication at this time.”

A spokesperson for the San Francisco Police Department told TechCrunch that Balaji was found dead in his Buchanan Street apartment on November 26. Officers and medics had been called to the Lower Haight residence for a wellness check.

Police said no evidence of foul play was found during the initial investigation. Balaji’s death was first reported by the San Jose Mercury News.

Why Balaji had become a public figure in the AI debate

Balaji had worked at OpenAI for nearly four years before leaving the company. In October, he discussed his concerns with The New York Times, saying he believed the technology would bring more harm than good to society.

His central objection involved how OpenAI allegedly used copyrighted data. He believed the company’s practices were damaging to the internet, according to the source article.

In an October tweet, Balaji described how his thinking changed after lawsuits were filed against generative AI companies.

“I was at OpenAI for nearly 4 years and worked on ChatGPT for the last 1.5 of them,” said Balaji in a tweet from October. “I initially didn’t know much about copyright, fair use, etc. but became curious after seeing all the lawsuits filed against GenAI companies. When I tried to understand the issue better, I eventually came to the conclusion that fair use seems like a pretty implausible defense for a lot of generative AI products, for the basic reason that they can create substitutes that compete with the data they’re trained on.”

That argument placed Balaji in a smaller group of former OpenAI employees who had criticized the data foundations of the company’s models. The source notes that several former OpenAI employees had raised concerns about the startup’s safety culture, but Balaji was one of the few who focused on training data.

The copyright lawsuits surrounding OpenAI

OpenAI and Microsoft are involved in several ongoing lawsuits from newspapers and media publishers. Those publishers, including The New York Times, claim the generative AI startup broke copyright law.

Balaji’s public comments became relevant to that legal landscape. On November 25, one day before police found his body, a court filing named him in a copyright lawsuit brought against OpenAI.

As part of a good faith compromise, OpenAI agreed to search Balaji’s custodial file related to the copyright concerns he had recently raised.

In an October blog post, Balaji wrote that he did not believe ChatGPT was a fair use of its training data. He also said similar arguments could apply to many generative AI products.

Balaji’s work before and during OpenAI

Balaji studied computer science at the University of California, Berkeley. During college, he interned at OpenAI and Scale AI, and he later joined OpenAI as an employee.

His work at OpenAI covered several major systems and research areas. The source article states that he worked on WebGPT in his early days at the company. WebGPT was a fine-tuned version of GPT-3 that could search the web and was described as an early version of SearchGPT, which OpenAI released earlier this year.

Balaji later worked on the pretraining team for GPT-4, the reasoning team for o1, and post-training for ChatGPT, according to his LinkedIn.

OpenAI and former colleagues respond

OpenAI responded to the news in a statement to TechCrunch.

“We are devastated to learn of this incredibly sad news today and our hearts go out to Suchir’s loved ones during this difficult time,” said an OpenAI spokesperson in an email to TechCrunch.

Several people in the AI community posted public messages mourning Balaji. Ed Newton-Rex wrote that Balaji was kind and thoughtful, and said his decision to speak out about AI and copyright was appreciated by many people. Gary Marcus wrote that he had spoken with Balaji six weeks earlier and said Balaji had left OpenAI and wanted to make the world a better place. Miles Brundage also posted that he was very sad to hear about Suchir.

Balaji’s death does not resolve the copyright questions he raised. It does, however, underline how central those questions have become to the generative AI industry. The legal disputes involving OpenAI, Microsoft, newspapers, and media publishers remain active, while Balaji’s public arguments continue to be part of the broader debate over how AI systems are trained and what their outputs may replace.