Why Alibaba’s free Qwen3.5 raises the stakes for open-weight AI

Alibaba has released Qwen3.5-397B-A17B as a free open-weight model that handles text, images, and video in one architecture. Its strongest gains are in agent tasks, instruction following, and multimodal work, while GPT-5.2, Claude 4.5 Opus, and Gemini 3 Pro still lead on several benchmarks.

Alibaba’s Qwen3.5-397B-A17B arrives as a clear signal that the open-weight AI race is still accelerating. The model is free to download, built for text, images, and video, and designed to compete not only on benchmark scores but also on practical agent workflows.

The release matters because it combines several themes now shaping advanced AI: larger total capacity, lower active compute, stronger multimodal handling, and more emphasis on autonomous tasks. It also lands in a market where Chinese AI labs are pushing open availability and low API pricing as central advantages.

A large model that only uses part of itself at once

Qwen3.5-397B-A17B has 397 billion total parameters, but only 17 billion are active for a given query. That design follows a mixture-of-experts architecture, where the model routes work through the most relevant parts of the network rather than activating everything each time.

Like Qwen3-Next, Qwen3.5 has an unusually high ratio of total to active parameters, roughly 23 to 1. That suggests a fine-grained split across many specialized experts, which is meant to make the model more efficient while keeping broad capability available.
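To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. Only the two parameter counts come from the article. The expert count, the value of k, and the function names are hypothetical and chosen for illustration; Alibaba has not described its router in this form here.

```python
import math
import random

TOTAL_PARAMS = 397e9   # total parameters, as reported in the article
ACTIVE_PARAMS = 17e9   # parameters active per query, as reported in the article


def active_fraction(total=TOTAL_PARAMS, active=ACTIVE_PARAMS):
    """Share of the model's parameters that take part in one query."""
    return active / total


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Keep the k highest-scoring experts and renormalize their weights.

    Returns a dict mapping expert index -> mixing weight (weights sum to 1).
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


if __name__ == "__main__":
    rng = random.Random(0)
    num_experts, k = 64, 4  # hypothetical sizes, not Qwen3.5's real configuration
    scores = [rng.gauss(0.0, 1.0) for _ in range(num_experts)]
    chosen = route_top_k(scores, k)
    print(f"active share of parameters: {active_fraction():.1%}")
    print(f"experts used: {sorted(chosen)} of {num_experts}")
```

With the article's figures, about 4.3 percent of the parameters are active per query. The remaining experts stay idle for that request, which is where the compute savings come from.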

Alibaba also introduced Gated Delta Networks, a new attention architecture intended to reduce compute costs further. The company’s goal is not simply to make the model bigger, but to make each query cheaper and faster to process.

According to the Qwen team, Qwen3.5 processes requests 19 times faster than Qwen3-Max and 3.5 to 7 times faster than Qwen3-235B with a 256,000-token context window. The team says performance remains comparable despite those speed gains.

Where Qwen3.5 performs best

The biggest improvements appear in agentic work. On TAU2, a benchmark for autonomous agent performance, Qwen3.5 scores 86.7, close to GPT-5.2 at 87.1 and behind Claude 4.5 Opus at 91.6.

For complex instruction following, Qwen3.5 posts field-leading scores on IFBench at 76.5 and MultiChallenge at 67.6. The practical example given is that the model can create a slide deck from a combination of an image and prompts.

Its image-related results are also notable. Alibaba says Qwen3.5 reaches top marks on math-visual benchmarks including MathVision at 88.6 and ZEROBench at 12. It also leads in most document comprehension and text recognition tests.

But the picture is not one-sided. On MMMU, a broader image understanding benchmark, Qwen3.5 scores 85, behind Gemini 3 Pro at 87.2 and GPT-5.2 at 86.7. In classic coding and reasoning, other models still lead in several places.

  • On LiveCodeBench, GPT-5.2 scores 87.7, compared with 83.6 for Qwen3.5.
  • On AIME26, Qwen3.5 reaches 91.3, behind GPT-5.2 at 96.7 and Claude 4.5 Opus at 93.3.
  • On TAU2, Qwen3.5 scores 86.7, just behind GPT-5.2 at 87.1 and further behind Claude 4.5 Opus at 91.6.

That makes Qwen3.5 a strong contender, especially in agent tasks and multimodal work, but not an outright leader across every category.

Training changes behind the gains

The Qwen team credits the improvement over the previous Qwen3 series to a much larger reinforcement learning phase. Rather than tuning only for individual benchmarks, the team says it expanded the variety and difficulty of training environments.

The strongest payoff from that approach appeared in agent skills. That fits with the model’s benchmark profile, where autonomous tasks and complex instruction following stand out.

Alibaba also says Qwen3.5 was trained on considerably more data than its predecessor, while using stricter filtering. Despite the more efficient architecture, the model matches the performance of Qwen3-Max-Base, which has over one trillion parameters.

Language coverage has expanded as well. Support grew from 119 to 201 languages and dialects, and the vocabulary increased from 150,000 to 250,000 tokens. Alibaba says the larger vocabulary should speed up processing for most languages by 10 to 60 percent.

From video understanding to desktop agents

Qwen3.5 is natively multimodal, meaning it can work across text, images, and video in one architecture. Alibaba says the model can handle up to two hours of video.

In published demos, the company shows the model writing Python code on its own to solve a maze and draw the shortest path. In another demo, it watches traffic videos and explains driving decisions based on traffic light phases.
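For readers curious what a maze-solving script of this kind might look like, the following is a minimal sketch using breadth-first search, a standard way to find a shortest path on a grid. It is not the code from Alibaba's demo. The maze layout, the `S`/`G`/`#` symbols, and the function names are all made up for illustration.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search over a grid of strings where '#' marks a wall.

    Returns the list of (row, col) cells from start to goal, or None if
    the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back through parents to the start
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


def render(grid, path):
    """Draw the path onto a copy of the grid using '*' for intermediate cells."""
    marked = [list(row) for row in grid]
    for r, c in path[1:-1]:
        marked[r][c] = "*"
    return "\n".join("".join(row) for row in marked)


# Hypothetical maze: S = start, G = goal, # = wall, . = open cell.
MAZE = [
    "S.#....",
    ".##.##.",
    "...#...",
    ".#...#G",
]

if __name__ == "__main__":
    path = shortest_path(MAZE, (0, 0), (3, 6))
    print(render(MAZE, path) if path else "no route")
```

Breadth-first search explores cells in order of distance from the start, so the first time it reaches the goal it has found a shortest route.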

The model is also positioned as a GUI agent. In that role, Qwen3.5 can operate smartphone and computer interfaces, including filling out Excel spreadsheets and running multi-step desktop workflows.

For developers, Alibaba connects the model to tools such as Qwen Code, which turns natural language instructions into working code. The hosted version, Qwen3.5-Plus, adds a one-million-token context window through Alibaba Cloud Model Studio and supports web search, code interpreter, and adaptive reasoning.

Alibaba’s stated direction is broader than single-task assistance. The Qwen team says the next step is moving from model scaling to system integration, with future agents gaining persistent memory, improving themselves over time, and accounting for cost constraints. The long-term target is autonomous systems that can handle complex jobs over several days.

Availability, pricing, and the wider race

The open-weight Qwen3.5-397B-A17B is available on Hugging Face under the Apache 2.0 license, which permits commercial use and modification. Developers can also try it in the browser through Qwen Chat in Auto, Thinking, or Fast mode.

For API access, Qwen3.5-Plus is available through Alibaba Cloud Model Studio. The API price is $0.40 per million input tokens and $2.40 per million output tokens.
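As a rough guide to what those prices mean in practice, the sketch below estimates the cost of a single request. The two rates are the ones quoted above. The function name and the example token counts are illustrative assumptions, and the calculation ignores any tiering, caching discounts, or tool fees that the provider may apply.

```python
INPUT_PRICE_PER_MILLION = 0.40   # USD per million input tokens, as quoted in the article
OUTPUT_PRICE_PER_MILLION = 2.40  # USD per million output tokens, as quoted in the article


def estimate_cost(input_tokens, output_tokens):
    """Estimated request cost in USD at the flat per-token rates above."""
    return (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a full one-million-token prompt with a 100,000-token reply.
    print(f"${estimate_cost(1_000_000, 100_000):.2f}")
```

At these rates, output tokens cost six times as much as input tokens, so long generated answers dominate the bill more quickly than long prompts do.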

The release comes as several Chinese AI labs continue to ship large models aimed at coding, agents, and broad benchmark competition. Zhipu AI recently released GLM-5, an open-source model with 744 billion parameters. Moonshot AI introduced Kimi K2.5, which coordinates up to 100 sub-agents running in parallel.

MiniMax launched M2.5, promising “intelligence too cheap to meter.” Baidu took the top position among Chinese models on the LMArena ranking with Ernie 5.0 and its 2.4 trillion parameters. DeepSeek’s next large model, with a trillion parameters, is still delayed, though it could reportedly ship this week.

The pattern is clear: Chinese AI labs are emphasizing benchmark performance comparable to Western models, open availability, and low API pricing. Qwen3.5 fits directly into that strategy, with Alibaba using efficiency, multimodal capability, and agent performance as its main points of differentiation.