Moonshot AI has introduced Kimi K2.5, a multimodal language model that the company presents as the most powerful open-weight model available. The release builds on Kimi K2, which launched in July, and shifts the focus toward large-scale agent coordination, visual reasoning and coding.
The headline feature is Agent Swarm, a beta system designed to let one model divide a complex task across many AI agents at once. That makes Kimi K2.5 less a single chatbot and more an orchestrator for parallel work.
Agent Swarm changes the shape of the task
Agent Swarm is built around a simple idea: some problems can move faster when many agents work at the same time. Moonshot AI says Kimi K2.5 can independently coordinate up to 100 sub-agents on one task, with those agents executing up to 1,500 tool calls.
According to the company, this setup can cut execution time by a factor of up to 4.5 compared with a single agent. The practical claim is not just that the model can answer a prompt, but that it can organize work, assign roles and combine results.
Moonshot AI describes sub-agents taking on specialized roles such as "AI researcher," "physics researcher," or "fact-checker." In that arrangement, the main model acts as the coordinator while the sub-agents handle smaller parts of the job.
The company demonstrated this with a task involving the top three YouTube creators in 100 different niches. K2.5 created 100 sub-agents, had them research in parallel and then assembled the output into a structured table.
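The fan-out and fan-in pattern in that demonstration can be illustrated with a minimal Python sketch. This is not Moonshot AI's implementation: `run_sub_agent`, `orchestrate`, the role names and the placeholder result strings are all hypothetical, and a real sub-agent would call a model and tools where this one only sleeps.

```python
import asyncio


async def run_sub_agent(role: str, subtask: str) -> dict:
    # Hypothetical stand-in for a sub-agent; a real one would call a model and tools.
    await asyncio.sleep(0.01)
    return {"role": role, "subtask": subtask, "result": f"findings for {subtask}"}


async def orchestrate(subtasks: list[str], max_agents: int = 100) -> list[dict]:
    # Cap concurrency at max_agents (the article cites up to 100 sub-agents).
    limit = asyncio.Semaphore(max_agents)

    async def bounded(index: int, subtask: str) -> dict:
        async with limit:
            return await run_sub_agent(f"researcher-{index}", subtask)

    # gather() runs the sub-agents concurrently and keeps submission order.
    return await asyncio.gather(*(bounded(i, s) for i, s in enumerate(subtasks)))


# Fan out over 100 placeholder niches, then fan in to a simple text table.
niches = [f"niche-{i}" for i in range(100)]
rows = asyncio.run(orchestrate(niches))
table = "\n".join(f"{row['subtask']} | {row['result']}" for row in rows)
```

Because the sub-agents wait concurrently, the total wall-clock time in this sketch is close to that of one sub-agent rather than the sum of all 100, which is the effect the speedup claim depends on.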
PARL trains the orchestrator to stay parallel
To train this behavior, Moonshot AI developed a method called Parallel-Agent Reinforcement Learning, or PARL. The goal is to teach a trainable orchestrator agent how to split a larger task into parts that can be handled at the same time.
That training focus matters because parallel systems can fall back into a slower pattern. Moonshot AI calls this problem "Serial Collapse," where the orchestrator returns to step-by-step execution even when it has parallel capacity available.
PARL addresses that by using staged rewards. Early in training, the system is pushed toward parallelism. Later, the emphasis shifts toward task quality, so the model is not merely distributing work for its own sake.
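One way to picture a staged reward is as a weighted mix whose weights shift over training. The sketch below is an assumption-laden illustration, not the published PARL formula: the linear schedule, the `start_weight` value and both score inputs are hypothetical.

```python
def parallelism_weight(step: int, total_steps: int, start_weight: float = 0.5) -> float:
    # Hypothetical schedule: linearly anneal the parallelism weight to zero.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start_weight * (1.0 - progress)


def staged_reward(step: int, total_steps: int,
                  parallelism_score: float, quality_score: float) -> float:
    # Early steps reward parallel execution; late steps reward task quality only.
    w = parallelism_weight(step, total_steps)
    return w * parallelism_score + (1.0 - w) * quality_score


# A fully parallel rollout with a poor result earns less and less as training goes on.
early = staged_reward(0, 100, parallelism_score=1.0, quality_score=0.0)
late = staged_reward(100, 100, parallelism_score=1.0, quality_score=0.0)
```

Under this schedule, a rollout that spreads work widely but produces a poor result is rewarded at the start of training and earns nothing by the end, which matches the shift in emphasis described above.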
For users, the important implication is that Kimi K2.5 is being positioned around workflow design as much as raw answer generation. The model is meant to decide what can be separated, what should run at once and how to bring the pieces back together.
Open weights with a large multimodal architecture
Kimi K2.5 is a Mixture-of-Experts model with one trillion total parameters and 32 billion active per token. It has 384 experts, with eight selected per token, and uses MoonViT with 400 million parameters as its vision encoder.
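To make the expert counts concrete, here is a minimal sketch of top-k routing in plain Python. Only the figures 384 and 8 come from the article; the random router scores and the softmax over the selected experts are illustrative assumptions, and Kimi K2.5's actual gating function may differ.

```python
import heapq
import math
import random

NUM_EXPERTS = 384  # from the article
TOP_K = 8          # from the article


def route_token(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = heapq.nlargest(k, range(len(router_logits)), key=router_logits.__getitem__)
    # Assumed normalization: softmax over the selected logits, so weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Random scores stand in for a learned router's output for one token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)

expert_share = TOP_K / NUM_EXPERTS   # fraction of experts used per token
active_share = 32e9 / 1e12           # fraction of parameters active per token
print(f"{expert_share:.2%}")         # 2.08%
print(f"{active_share:.1%}")         # 3.2%
```

Each token therefore touches about 2 percent of the experts but about 3.2 percent of the parameters. A plausible reading of the gap is that components outside the routed experts, such as attention layers and embeddings, run for every token, though the article does not break the figure down.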
Building on Kimi K2, the model received further training on roughly 15 trillion tokens. Moonshot AI says this training helps make K2.5 especially strong at building visually appealing frontend designs.
That frontend emphasis is one of the clearest product angles in the release. Moonshot AI says the model can create complete user interfaces from simple text descriptions, including interactive layouts and animations.
Kimi K2.5 can also reason over images and videos, then use that understanding to generate code. The company shows examples where the model reconstructs a website from a video and calculates the shortest path through a maze image before marking it.
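The article does not say which algorithm the model used for the maze. As a point of reference, breadth-first search is a standard way to find a shortest path once a maze has been converted to a grid. The sketch below assumes that conversion has already happened; the text grid, the `shortest_path` helper and the `*` marker are all hypothetical.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search over open cells; '#' marks a wall. Returns a cell list or None."""
    rows, cols = len(grid), len(grid[0])
    previous = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the predecessor links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = previous[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in previous:
                previous[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


maze = [
    "..#.",
    ".##.",
    "....",
]
path = shortest_path(maze, (0, 0), (0, 3))

# Mark the path on a copy of the grid, mirroring the "marking" step in the demo.
marked = [list(row) for row in maze]
for r, c in path:
    marked[r][c] = "*"
marked_rows = ["".join(row) for row in marked]
```

Breadth-first search explores cells in order of distance from the start, so the first time it reaches the goal it has found a path with the fewest steps.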
Benchmarks show strengths and limits
Moonshot AI published benchmarks in which Kimi K2.5 leads in some areas and trails in others. The strongest claims appear around agentic work, where the model performs well against several named competitors.
On BrowseComp, Kimi K2.5 reaches 74.9 percent. GPT-5.2 reaches 65.8 percent, while Gemini 3 Pro reaches 59.2 percent. On DeepSearchQA, K2.5 scores 77.1 percent, ahead of Claude 4.5 Opus at 76.1 percent.
The software engineering picture is more mixed. On SWE-Bench Verified, Kimi K2.5 scores 76.8 percent. GPT-5.2 reaches 80 percent, and Claude 4.5 Opus reaches 80.9 percent.
On multilingual SWE-Bench tests, Claude 4.5 Opus leads with 77.5 percent, followed by K2.5 at 73 percent. That puts Kimi K2.5 close to leading systems in some coding evaluations, but not always ahead.
For image and video tasks, K2.5 remains competitive. On MMMU Pro, it reaches 78.5 percent, just behind Gemini 3 Pro at 81 percent. On VideoMMMU, it scores 86.6 percent, slightly ahead of GPT-5.2 and just behind Gemini 3 Pro.
Where Kimi K2.5 is available
Kimi K2.5 is available through Kimi.com, the Kimi app and an API. The weights are also available for download on Hugging Face, which is central to Moonshot AI’s open-weight positioning.
Agent Swarm is currently in beta, with free credits available to paying users. Moonshot AI lists four modes for the model:
- K2.5 Instant
- K2.5 Thinking
- K2.5 Agent
- K2.5 Agent Swarm
Moonshot AI was founded in 2023 and has become one of China’s leading language model providers through the Kimi model family. The company competes with US providers like OpenAI and Anthropic, as well as Chinese rivals like DeepSeek and its V3.2 model.
The Kimi K2.5 release shows where that competition is moving. The next front is not only larger models or better single answers. It is the ability to coordinate many AI workers, use visual input, call tools at scale and turn complex prompts into structured outcomes.