The Decoder, November 23, 2024

Distributed AI training moves across three continents with Intellect-1

Prime Intellect says it has trained Intellect-1, a 10-billion-parameter language model, using computers across the US, Europe, and Asia. The company plans to release both the model and its training data as open source next week, while acknowledging that the system is still closer to a proof of concept than a rival to larger commercial AI models.

The company presents the project as an early demonstration that large AI models do not have to be trained solely inside the most centralized and heavily resourced labs.

The claim is narrow but important: Prime Intellect says Intellect-1 is the first language model of this size trained through a distributed approach. The company plans to make both Intellect-1 and its training data available as open source next week.

Why distributed AI training matters

Training large AI models typically depends on major computing resources working together reliably. Prime Intellect’s project tries to stretch that idea across geography, using computers located on three continents rather than treating training as something that must happen inside a single tightly controlled cluster.

The company’s broader aim is to show that smaller organizations can participate in building large AI systems. In its framing, distributed AI training could let more people contribute computing power to transparent, freely available AI systems.

That matters because access to compute is one of the practical barriers in AI development. If distributed training can work reliably, it could change who is able to participate in model building, even if it does not immediately produce systems that match the strongest commercial models.

Prime Intellect also connects the project to open-source AI development. The company says open-source development reduces the risks of centralized control, while also acknowledging that competing with major AI labs requires coordinated effort.

How Intellect-1 was trained

Prime Intellect used OpenDiLoCo, its open-source version of DeepMind’s Distributed Low-Communication method, known as DiLoCo. The method is designed to support training across globally distributed systems while minimizing data transfer requirements.

That reduced communication burden is central to the project. Graphics cards spread around the world are connected by links with far less bandwidth and far more latency than those inside a single data center, so the system has to keep synchronization traffic low while training continues.
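The low-communication idea can be illustrated with a toy sketch. This is not Prime Intellect's code or the OpenDiLoCo API. It assumes a tiny linear-regression problem, plain SGD on each worker, and a plain SGD outer step, whereas the DiLoCo paper describes AdamW on the workers and Nesterov momentum for the outer update. All function names and hyperparameters below are hypothetical. What it shows is the general pattern: each worker trains alone for many steps, and only the averaged difference from the shared weights is exchanged once per round.

```python
import random

def local_sgd(params, shard, steps, lr, rng):
    """Run `steps` SGD updates on one worker's shard without any communication."""
    w = list(params)
    for _ in range(steps):
        x, y = rng.choice(shard)
        err = sum(wi * xi for wi, xi in zip(w, x)) - y  # prediction error
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def outer_round(global_params, shards, inner_steps, inner_lr, outer_lr, rng):
    """One synchronization round: local training, then a single averaged update."""
    deltas = []
    for shard in shards:  # in a real system each shard would sit on a different machine
        local = local_sgd(global_params, shard, inner_steps, inner_lr, rng)
        # "Pseudo-gradient": how far this worker moved away from the shared weights.
        deltas.append([g - l for g, l in zip(global_params, local)])
    avg = [sum(col) / len(deltas) for col in zip(*deltas)]
    # Assumption: plain SGD outer step (hypothetical simplification).
    return [g - outer_lr * a for g, a in zip(global_params, avg)]

def train(rounds=20, workers=4, inner_steps=50, seed=0):
    """Train on synthetic noiseless data; returns (weights, number of syncs)."""
    rng = random.Random(seed)
    true_w = [2.0, -1.0]  # toy target weights, made up for the example
    shards = []
    for _ in range(workers):
        shard = []
        for _ in range(32):
            x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
            shard.append((x, sum(t * xi for t, xi in zip(true_w, x))))
        shards.append(shard)
    params = [0.0, 0.0]
    for _ in range(rounds):
        params = outer_round(params, shards, inner_steps, 0.1, 1.0, rng)
    return params, rounds  # one synchronization per round
```

With the defaults, each worker performs 1,000 local updates but the workers exchange information only 20 times, once every 50 steps. A setup that synchronized after every step would need 1,000 exchanges for the same amount of local work.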

Prime Intellect says it built a reliable distributed training system on top of this foundation. The system is designed to handle computing resources being added or removed on the fly, which is a key requirement if the long-term goal is to let many participants contribute compute.
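One simple way to picture tolerance for changing compute is to average only the updates that actually arrive in a given round. The sketch below is a hypothetical illustration of that idea, not a description of how Prime Intellect's system works. The function names, the worker IDs, and the choice to keep the weights unchanged when nobody reports are all assumptions.

```python
from typing import Dict, List, Optional

def average_reported_updates(reported: Dict[str, List[float]]) -> Optional[List[float]]:
    """Average the updates from whichever workers reported this round.

    Returns None when no worker reported, so the caller can skip the round.
    """
    if not reported:
        return None
    count = len(reported)
    return [sum(col) / count for col in zip(*reported.values())]

def apply_round(global_params: List[float],
                reported: Dict[str, List[float]],
                outer_lr: float = 1.0) -> List[float]:
    """Apply one outer update using only the workers that are currently present."""
    avg = average_reported_updates(reported)
    if avg is None:
        return list(global_params)  # assumption: keep weights if nobody reported
    return [g - outer_lr * a for g, a in zip(global_params, avg)]

# Round 1: two workers report. Round 2: one has dropped out and a new one joined.
params = apply_round([1.0, 1.0], {"worker-a": [0.2, 0.0], "worker-b": [0.4, 0.2]})
params = apply_round(params, {"worker-b": [0.1, 0.1], "worker-c": [0.3, 0.1]})
```

Because the average is taken over the reporting set rather than a fixed roster, the update rule itself does not need to know how many machines were expected. A production system would also have to handle stragglers, authentication, and bringing new workers up to the current weights, none of which this sketch covers.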

The model itself is based on the LLaMA-3 architecture and was trained on open datasets. Its training data includes more than 6 trillion tokens, primarily from four sources:

Fineweb-edu
DCLM
Stack v2
OpenWebMath

Those details place Intellect-1 squarely inside the open-model ecosystem. The model architecture, training method, and planned release of the model and data all point toward a project built around transparency and reproducibility.

What Prime Intellect wants to build next

Prime Intellect describes Intellect-1 as a first step rather than the final target. The company wants to scale its distributed training approach to more advanced open-source models.

The next stage is not only about bigger models. Prime Intellect is also building a system that would let anyone contribute computing power securely, with training sessions open to public participation.

That vision depends on more than software. The company says competing with major AI labs will require collaboration and computing resources, and it is seeking support on both fronts.

If the approach works at larger scales, it could give smaller organizations and independent contributors a more direct role in training open AI systems. The source article frames that as the central promise of the project: making AI development more accessible without placing all control in a small number of centralized institutions.

The limits of the breakthrough

Intellect-1 is still relatively small by today’s standards, despite its 10 billion parameters. The source article notes that, although no benchmark results are available yet, the model is unlikely to match larger commercial AI models or even smaller open-source models.

That distinction matters. The project’s significance is not that Intellect-1 appears to be the strongest model available. Its significance is that Prime Intellect says it has shown a distributed training process can produce a model of this size across computers in the US, Europe, and Asia.

In other words, the technical pathway may be more important than the immediate model output. Intellect-1 is best understood as a proof point for a training method and participation model, not as evidence that distributed open-source training has already caught up with leading AI labs.

The major question is whether Prime Intellect can move from proof of concept to practical impact. Scaling distributed training to more advanced open-source models, managing unreliable or changing compute resources, and coordinating public participation are all part of the challenge described in the source.

For now, Intellect-1 gives the open AI community a concrete experiment to watch. If Prime Intellect follows through on its planned open-source release next week, outside developers will be able to inspect the model and its training data, and the larger test will begin: whether distributed training can become a meaningful force in AI development rather than a one-time demonstration.