Ars Technica AI February 21, 2025 NEUTRAL

DeepSeek pushes open source AI past model weights

DeepSeek says it will release five open source repos during an “Open Source Week” starting next week. The move could make parts of its production infrastructure more visible, but it is still unclear whether the release will include the training code needed for a fuller open source AI claim.

This is mostly a routine open source infrastructure update with only mild implications for accelerating AI capability.

DeepSeek is preparing to share more of the software behind its AI work, moving beyond the open weights release that recently drew attention across the industry.

The Chinese AI firm says it will release five open source repos starting next week. The planned rollout matters because open weights alone do not show every part of how an AI system is built, trained, served, and inspected.

What DeepSeek says it will release

DeepSeek described the plan in a social media post late Thursday, tying the releases to an “Open Source Week.” The company said the daily releases would expose “these humble building blocks in our online service [that] have been documented, deployed and battle-tested in production. As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey.”

An accompanying GitHub page for “DeepSeek Open Infra” gives the effort a broad framing. It says the coming repositories will include “code that moved our tiny moonshot forward” and will share “our small-but-sincere progress with full transparency.”

DeepSeek has not been specific about the exact code it plans to publish. The same GitHub page refers back to a 2024 paper about DeepSeek’s training architecture and software stack, but the source article does not say that the upcoming repos will include the full training code.

Why open weights are only part of the story

DeepSeek’s initial model release already included open weights access. In plain terms, those weights represent the strength of connections between the model’s billions of simulated neurons.

Open weights can be valuable because end users can fine-tune model parameters with additional training data for narrower or more targeted uses. That is one reason open weights releases have become an important part of the AI ecosystem.
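To make the idea of fine-tuning concrete, here is a minimal sketch in plain Python. It is a toy, not anything DeepSeek has published: the one-neuron model, the "released" starting weights, and the small data set are all made up for illustration. It only shows the general pattern of starting from published weights and adjusting them with a little extra training data.

```python
# Toy illustration only: a one-neuron "model" y = w * x + b.
# The "released" weights and the data below are hypothetical.

def predict(weights, x):
    w, b = weights
    return w * x + b

def mean_squared_error(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(weights, data, learning_rate=0.05, steps=500):
    """Start from the given weights and adjust them with gradient descent."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return (w, b)

released_weights = (1.0, 0.0)  # stand-in for weights someone else trained
new_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # follows y = 2x + 1

tuned_weights = fine_tune(released_weights, new_data)
loss_before = mean_squared_error(released_weights, new_data)
loss_after = mean_squared_error(tuned_weights, new_data)

print("loss before fine-tuning:", loss_before)
print("loss after fine-tuning:", loss_after)
```

Real models have billions of weights rather than two, but the principle is the same: the user never needs the original training code or data to adapt the published parameters to a narrower task.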

Other major models have used this structure too. The source article names Google’s Gemma, Meta’s Llama, and older OpenAI releases like GPT-2 as examples of models released under an open weights approach. Such releases often also include open source inference code, meaning the code that runs when a model responds to a query.

But open weights are not the same as a full view into how a model was created. They can expose important model parameters while still leaving the training process, data details, and deeper system design partly hidden from outside users and researchers.

The unresolved question is training code

The biggest open question is whether DeepSeek will publish the code used to train the model. The source article says that remains unclear.

That distinction matters because the Open Source Initiative’s formal definition of “Open Source AI” requires training code. The definition was finalized last year after years of study. According to OSI, a truly open AI must also include “sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system.”

If a release includes training code, researchers can examine the model at a deeper level. The source article explains that this visibility can help reveal biases or limitations that are tied to the model’s architecture rather than only its parameter weights.

A fuller source release could also make it easier to reproduce a model from scratch. That could include rebuilding it with completely new training data if necessary. Without those pieces, the release may still be useful, but it would not answer every question about how the system was made.

How this compares with other AI companies

DeepSeek’s move also sharpens the contrast with OpenAI. The source article describes OpenAI’s market-leading ChatGPT models as completely proprietary, with inner workings that remain opaque to outside users and researchers.

The contrast is important because the AI industry uses the word “open” in several ways. A company can release weights, share inference code, publish infrastructure code, or provide training code and detailed training data information. Each choice gives users and researchers a different level of visibility.
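One way to keep these levels straight is to treat each release as a checklist. The sketch below is a simplification written for this article: the `Release` record, the field names, and the two example releases are hypothetical, and the "needed" list covers only the two items the article attributes to OSI's definition beyond weights. It is not OSI's own checklist.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Release:
    """Hypothetical record of what a model release makes public."""
    name: str
    weights: bool = False
    inference_code: bool = False
    infrastructure_code: bool = False
    training_code: bool = False
    training_data_info: bool = False

# Items the article says OSI's definition asks for beyond weights.
# A simplification for illustration, not OSI's official checklist.
NEEDED_FOR_FULLER_CLAIM = ("training_code", "training_data_info")

def visible_parts(release):
    """Return the names of the components this release exposes."""
    return [f.name for f in fields(release)
            if f.name != "name" and getattr(release, f.name)]

def missing_for_fuller_claim(release):
    """Return the listed items this release does not include."""
    return [part for part in NEEDED_FOR_FULLER_CLAIM
            if not getattr(release, part)]

# Two made-up releases, not descriptions of any real model.
weights_only = Release(name="example-weights-only",
                       weights=True, inference_code=True)
fuller = Release(name="example-fuller", weights=True, inference_code=True,
                 training_code=True, training_data_info=True)

print(visible_parts(weights_only))
print(missing_for_fuller_claim(weights_only))
print(missing_for_fuller_claim(fuller))
```

Two releases can both be called “open” while leaving very different gaps, which is why the contents of the release matter more than the label.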

Other companies and projects are also part of this wider shift. Elon Musk’s xAI released an open source version of Grok 1’s inference-time code last March and recently promised to release an open source version of Grok 2 in the coming weeks. The source article says the recent release of Grok 3 will remain proprietary and only available to X Premium subscribers for the time being.

Hugging Face is another example. Earlier this month, Hugging Face released an open source clone of OpenAI’s proprietary “Deep Research” feature mere hours after OpenAI launched the original. The clone used a closed-weights model at release “just because it worked well,” Hugging Face’s Aymeric Roucher told Ars Technica, but its source code’s “open pipeline” can be switched to any open-weights model as needed.

What to watch next

The immediate test is what appears in DeepSeek’s five open source repos. If the releases focus on infrastructure and production building blocks, they may still help developers understand and reuse parts of DeepSeek’s online service.

If the releases include training code, the implications would be larger. That would move the company closer to the kind of openness described by the Open Source Initiative and give researchers a better path to examine how the model works beneath its weights.

For now, the facts are narrower. DeepSeek says daily releases are coming, the company is presenting them as production-tested building blocks, and it has not yet clearly said whether the training code will be included. That unresolved detail will decide how far this open source AI push really goes.