MIT Tech Review, AI, January 24, 2025

Why DeepSeek R1 changed the AI sanctions debate

DeepSeek R1 has drawn attention because the Chinese AI startup says its open-source reasoning model can match or surpass ChatGPT o1 on key benchmarks while running at a fraction of the cost. The model’s rise also raises a sharper question: whether US export controls on advanced chips are pushing Chinese AI companies toward efficiency, collaboration, and open-source development instead of slowing them down.

WTF Index TERMINATOR

Terminator 2, Idiocracy 0

The story mildly leans Terminator because it highlights frontier reasoning models becoming cheaper, more capable, and harder for export controls to contain.

DeepSeek R1 has become one of the most discussed AI releases because it challenges a central assumption about advanced AI: that only the most resource-rich players can compete at the frontier. The Chinese startup DeepSeek says R1 matches or even surpasses OpenAI’s ChatGPT o1 on multiple key benchmarks, while operating at a fraction of the cost.

The bigger story is not only the model’s performance. It is how DeepSeek built a reasoning model under US export controls that limit Chinese access to cutting-edge chips. According to early evidence described in the source article, those restrictions may be encouraging Chinese AI teams to focus harder on efficiency, resource-pooling, and collaboration.

What makes DeepSeek R1 different

DeepSeek R1 is an open-source reasoning model built to handle complex tasks, especially in mathematics and coding. Like ChatGPT o1, it uses a “chain of thought” approach, allowing it to work through questions step by step rather than only producing a direct answer.

Researchers have responded strongly to the release because the model appears to deliver high-level reasoning at lower cost. Hancheng Cao, an assistant professor in information systems at Emory University, described the model as potentially important for people who lack access to massive resources. “This could be a truly equalizing breakthrough that is great for researchers and developers with limited resources, especially those from the Global South,” he said.

The model has also stood out for its engineering choices. Dimitris Papailiopoulos, principal researcher at Microsoft’s AI Frontiers research lab, said what struck him was the design’s simplicity rather than any added complexity. “DeepSeek aimed for accurate answers rather than detailing every logical step, significantly reducing computing time while maintaining a high level of effectiveness,” he said.

DeepSeek has also released six smaller versions of R1, compact enough to run locally on laptops. The company claims one of them outperforms OpenAI’s o1-mini on certain benchmarks. Perplexity CEO Aravind Srinivas wrote that DeepSeek had “largely replicated o1-mini and has open sourced it.”

How chip limits shaped the model

Training large language models requires specialized researchers and substantial computing power. In a recent interview with the Chinese media outlet LatePost, Kai-Fu Lee, founder of the startup 01.AI, said that only “front-row players” typically build foundation models such as ChatGPT because the work is so resource-intensive.

DeepSeek had to work within a more difficult hardware environment. Zihan Wang, a former DeepSeek employee and current PhD student in computer science at Northwestern University, said the company reworked its training process to reduce the strain on its GPUs. Those chips are a variant that Nvidia released for the Chinese market, with performance capped at half the speed of its top products.

That constraint appears to have affected the company’s technical priorities. DeepSeek found ways to reduce memory usage and speed up calculation without giving up much accuracy. Wang described the company’s attitude plainly: “The team loves turning a hardware challenge into an opportunity for innovation.”
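The article does not say which techniques DeepSeek used, so the following is only a generic illustration of the trade-off described above, not the company's method. It assumes one common approach, storing numbers at lower precision, and uses hypothetical helper names (`storage_bytes`, `max_relative_error`) and made-up sample values to show memory halving while the rounding error stays small.

```python
from array import array


def storage_bytes(values, typecode):
    """Bytes needed to hold `values` in a packed array of the given typecode."""
    packed = array(typecode, values)
    return packed.itemsize * len(packed)


def max_relative_error(values, typecode):
    """Largest relative rounding error introduced by storing at this precision."""
    packed = array(typecode, values)
    return max(abs(p - v) / abs(v) for p, v in zip(packed, values) if v != 0)


# Hypothetical stand-in for model weights; not real DeepSeek data.
weights = [0.1 * (i + 1) for i in range(1000)]

full = storage_bytes(weights, "d")  # 64-bit floats
half = storage_bytes(weights, "f")  # 32-bit floats
error = max_relative_error(weights, "f")

print(f"64-bit: {full} bytes, 32-bit: {half} bytes, max relative error: {error:.2e}")
```

The same idea scales to narrower formats, where the memory savings grow and the accuracy cost has to be managed more carefully.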

The source article frames this as a wider signal about US export controls. Rather than simply weakening China’s AI capabilities, the limits may be forcing companies to make better use of the hardware they can access. That does not remove the challenge created by sanctions, but it changes how the impact should be understood.

The company behind the release

DeepSeek is based in Hangzhou, China, and was founded in July 2023 by Liang Wenfeng, an alumnus of Zhejiang University with a background in information and electronic engineering. In 2015 he also founded the hedge fund High-Flyer, which incubated DeepSeek.

Like Sam Altman of OpenAI, Liang aims to build artificial general intelligence, or AGI, a form of AI that can match or even beat humans on a range of tasks. DeepSeek is unusual in the Chinese AI market because it has no plans to raise funds. The field is dominated by tech giants such as Alibaba and ByteDance, along with startups backed by deep-pocketed investors.

High-Flyer’s move into AI was connected to the chip restrictions. Before the expected sanctions, Liang acquired a large stockpile of Nvidia A100 chips, which are now banned from export to China. The Chinese media outlet 36Kr estimates that High-Flyer has over 10,000 units in stock, while Dylan Patel, founder of the AI research consultancy SemiAnalysis, estimates that it has at least 50,000.

That stockpile helped make DeepSeek possible. The company used those chips together with lower-power chips to develop its models. Wang said that while working at DeepSeek he had access to abundant computing resources and freedom to experiment, which he called “a luxury that few fresh graduates would get at any company.”

Why efficiency became the strategy

Liang has been direct about the technical gap Chinese AI companies face. In an interview with 36Kr in July 2024, he said chip sanctions are only part of the problem. He also argued that Chinese companies’ AI engineering techniques are often less efficient.

“We [most Chinese companies] have to consume twice the computing power to achieve the same results. Combined with data efficiency gaps, this could mean needing up to four times more computing power. Our goal is to continuously close these gaps,” he said.
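Liang's arithmetic can be sketched as two compounding multipliers. The factor of two for engineering efficiency comes from the quote. The factor of two for data efficiency is an assumption inferred from the "up to four times" total, since the quote does not state it separately. The function name is hypothetical.

```python
def required_compute_multiplier(engineering_gap, data_gap):
    """Combine independent efficiency gaps into one compute multiplier.

    The gaps compound multiplicatively: each one scales the compute
    needed to reach the same result.
    """
    return engineering_gap * data_gap


engineering_gap = 2.0  # from the quote: twice the computing power
data_gap = 2.0         # assumption: inferred from the "up to four times" total

total = required_compute_multiplier(engineering_gap, data_gap)
print(f"Compute needed relative to the most efficient labs: {total:.0f}x")
```

Because the gaps multiply, closing either one alone would halve the total under these assumed values.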

DeepSeek’s answer has been to make efficiency central to the research process. Liang remains closely involved in experiments alongside the team. Wang said, “The whole team shares a collaborative culture and dedication to hardcore research.”

This matters because efficiency is not only a cost issue. If compute is limited, then every improvement in memory use, training design, and calculation speed becomes strategically important. In that environment, a company that can do more with less has a real advantage.

Open-source AI as a competitive path

DeepSeek’s release also fits a broader open-source trend among Chinese AI companies. Alibaba Cloud has released over 100 new open-source AI models across 29 languages and multiple uses, including coding and mathematics. Startups including Minimax and 01.AI have also open-sourced their models.

A white paper released last year by the China Academy of Information and Communications Technology said the number of AI large language models worldwide had reached 1,328, with 36% originating in China. That makes China the second-largest contributor of AI models, behind the United States.

Thomas Qitong Cao, an assistant professor of technology policy at Tufts University, said young Chinese researchers identify strongly with open-source culture because they benefit from it. Matt Sheehan, an AI researcher at the Carnegie Endowment for International Peace, added that US export controls have pushed Chinese companies into a position where they must use limited computing resources more efficiently.

That pressure may also lead to consolidation. Two weeks ago, Alibaba Cloud announced a partnership with the Beijing-based startup 01.AI, founded by Kai-Fu Lee, to merge research teams and create an “industrial large model laboratory.” Thomas Qitong Cao, the Tufts professor, said, “It is energy-efficient and natural for some kind of division of labor to emerge in the AI industry.”

DeepSeek R1 is therefore more than a single model release. It is a case study in how constraints can redirect technical work. The model’s success suggests that the next stage of AI competition may depend not only on who has the most powerful chips, but also on who can use limited resources with the most discipline.