TechCrunch AI, December 26, 2024

Why DeepSeek V3 Matters for Open AI Models

DeepSeek V3 is a large Chinese AI model released under a permissive license that lets developers download and modify it for most uses, including commercial applications. DeepSeek says it beats several open and closed rivals on internal benchmarks, but the model also shows clear limits on politically sensitive topics.

This is mainly a routine open-model release and benchmark story, with only mild concerns about capability diffusion and routine automation of writing or coding tasks.

DeepSeek V3 has put a Chinese AI lab back in the center of the open AI model race. The model is large, downloadable, and aimed at text-based work such as coding, translation, essay writing, and email drafting from a descriptive prompt.

The important claim is not only that DeepSeek V3 is available under a permissive license. It is that DeepSeek says the model can compete with, and in some benchmark tests outperform, major open and closed AI systems.

A Large Open Challenger

DeepSeek V3 was developed by the AI firm DeepSeek and released on Wednesday. Its license allows developers to download and modify the model for most applications, including commercial ones.

That matters because many of the most capable AI models are not downloadable. Some closed AI systems can only be reached through an API, which gives the model provider more control over access, pricing, and deployment. A downloadable model gives developers more room to experiment, adapt, and build around the system directly.

DeepSeek V3 is built for text-based tasks. The source describes coding, translating, and writing essays and emails as examples of the work it can handle. In practice, that places it in the same broad category as other large language models used by developers, companies, and individual users for language and software tasks.

What DeepSeek Says the Benchmarks Show

DeepSeek's internal benchmark testing puts DeepSeek V3 ahead of both downloadable, openly available models and closed models available only through an API. The strongest claims in the source article focus on coding benchmarks.

On a subset of coding competitions hosted on Codeforces, a platform for programming contests, DeepSeek says its model outperforms Meta's Llama 3.1 405B, OpenAI's GPT-4o, and Alibaba's Qwen 2.5 72B. The model is also described as beating competitors on Aider Polyglot, a test that measures, among other things, whether a model can write new code that works with an existing codebase.

Those claims are notable because coding tests can expose weaknesses that simpler language tasks may not. A model that writes code in isolation is useful, but a model that can fit new code into existing software is closer to the kind of work developers actually need.

The benchmark picture in the source still rests on DeepSeek's own testing. That is an important limit. The article presents the model as one of the strongest open challengers yet, but it does not describe any independent evaluation that corroborates those internal claims.

The Scale Behind DeepSeek V3

DeepSeek says DeepSeek V3 was trained on 14.8 trillion tokens. In data science, tokens represent pieces of raw data, and the source notes that 1 million tokens is equal to about 750,000 words.
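To put the training set in more familiar terms, the short sketch below applies the article's ratio of 750,000 words per 1 million tokens to the 14.8 trillion token figure. The ratio is only a rule of thumb, and the function and constant names are illustrative, so the result should be read as a rough order of magnitude rather than a measured word count.

```python
# Ratio stated in the article: 1 million tokens is about 750,000 words.
# This is a rule of thumb, not a property of any specific tokenizer.
WORDS_PER_TOKEN = 750_000 / 1_000_000

# Training set size claimed by DeepSeek: 14.8 trillion tokens.
TRAINING_TOKENS = 14_800_000_000_000


def tokens_to_words(tokens: int) -> float:
    """Approximate word count for a token count, using the article's ratio."""
    return tokens * WORDS_PER_TOKEN


approx_words = tokens_to_words(TRAINING_TOKENS)
print(f"About {approx_words / 1e12:.1f} trillion words")
```

Under that ratio, the training set works out to roughly 11 trillion words.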

The model is also very large. DeepSeek V3 has 671 billion parameters, or 685 billion as listed on the AI developer platform Hugging Face. Parameters are the internal variables that models use to make predictions or decisions.

For comparison, the source says DeepSeek V3 is around 1.6 times the size of Llama 3.1 405B, which has 405 billion parameters. Parameter count often, though not always, tracks with model capability. Larger models can perform better, but they also tend to require stronger hardware to run well.
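The size comparison can be checked directly from the parameter counts quoted in the article. The sketch below is plain arithmetic on those figures, with illustrative names, and it computes the ratio for both the 671 billion count and the 685 billion count listed on Hugging Face.

```python
# Parameter counts as quoted in the article.
LLAMA_3_1_405B_PARAMS = 405_000_000_000
DEEPSEEK_V3_PARAMS = 671_000_000_000
DEEPSEEK_V3_HF_PARAMS = 685_000_000_000  # figure listed on Hugging Face


def size_ratio(model_params: int, baseline_params: int) -> float:
    """How many times larger one model is than another, by parameter count."""
    return model_params / baseline_params


ratio = size_ratio(DEEPSEEK_V3_PARAMS, LLAMA_3_1_405B_PARAMS)
ratio_hf = size_ratio(DEEPSEEK_V3_HF_PARAMS, LLAMA_3_1_405B_PARAMS)
print(f"671B vs 405B: {ratio:.2f}x")
print(f"685B vs 405B: {ratio_hf:.2f}x")
```

Both ratios land between 1.6 and 1.7, which is consistent with the article's rounded comparison.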

That tradeoff is central to understanding DeepSeek V3. The model may be powerful, but the source says an unoptimized version would need a bank of high-end GPUs to answer questions at reasonable speeds. So the fact that it is open does not automatically make it easy for every developer or organization to run.
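A back-of-the-envelope estimate shows why a model of this size needs multiple GPUs. The source does not say what numeric precision or GPU memory size it has in mind, so the sketch below assumes 2 bytes per parameter (16-bit weights) or 1 byte per parameter (8-bit weights) and a hypothetical 80 GB of memory per GPU. It counts only the weights and ignores activations, caches, and any runtime overhead, so real deployments would need more.

```python
import math

# Parameter count from the article.
DEEPSEEK_V3_PARAMS = 671_000_000_000

# Assumptions for illustration only (not from the source):
BYTES_PER_PARAM_16BIT = 2   # e.g. 16-bit weights
BYTES_PER_PARAM_8BIT = 1    # e.g. 8-bit weights
GPU_MEMORY_GB = 80          # hypothetical memory per GPU


def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus_for_weights(num_params: int, bytes_per_param: int,
                         gpu_memory_gb: int) -> int:
    """Smallest number of GPUs whose combined memory fits the weights."""
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / gpu_memory_gb)


for label, width in (("16-bit", BYTES_PER_PARAM_16BIT), ("8-bit", BYTES_PER_PARAM_8BIT)):
    memory = weight_memory_gb(DEEPSEEK_V3_PARAMS, width)
    gpus = min_gpus_for_weights(DEEPSEEK_V3_PARAMS, width, GPU_MEMORY_GB)
    print(f"{label}: {memory:.0f} GB of weights, at least {gpus} GPUs")
```

Even under these simplified assumptions, the weights alone exceed the memory of any single GPU of that size, which matches the source's point that an unoptimized version needs a bank of high-end GPUs.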

Cost, Hardware, and the China Context

DeepSeek says it trained DeepSeek V3 in about two months using a data center of Nvidia H800 GPUs. The source notes that the U.S. Department of Commerce recently barred Chinese companies from procuring these GPUs.

The company also claims it spent only $5.5 million to train DeepSeek V3. The article describes that as a fraction of the development cost of models like OpenAI's GPT-4.

DeepSeek's backing adds another layer to the story. The company is funded by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform trading decisions. High-Flyer builds its own server clusters for model training, and one of its recent clusters reportedly has 10,000 Nvidia A100 GPUs and cost 1 billion yuan (~$138 million).

High-Flyer was founded by Liang Wenfeng, a computer science graduate. The source says High-Flyer aims to achieve "superintelligent" AI through DeepSeek.

The Limits Are Part of the Story

DeepSeek V3 is not presented as an unrestricted system. The source says the model's political views are stilted. When asked about Tiananmen Square, for instance, it will not answer.

That behavior is connected to the regulatory environment around Chinese AI systems. DeepSeek, as a Chinese company, is subject to benchmarking by China's internet regulator to ensure that model responses "embody core socialist values." The source also notes that many Chinese AI systems decline to answer on topics that might raise concern from regulators, including speculation about the Xi Jinping regime.

This makes DeepSeek V3 a complicated release. On one hand, it appears to be one of the most capable open AI models described in the source, with major claims around coding performance, scale, and training cost. On the other hand, openness in model access does not mean openness in every answer the model will provide.

DeepSeek also unveiled DeepSeek-R1 in late November, described as an answer to OpenAI's o1 "reasoning" model. In an interview earlier this year, Liang characterized closed-source AI like OpenAI's as a "temporary" moat and said, "[It] hasn't stopped others from catching up."

DeepSeek V3 is the latest evidence behind that argument. If its benchmark claims hold up, it shows that powerful AI development is not limited to closed systems or the companies that control the most visible commercial APIs.