Huawei is pushing China's AI strategy in a clear direction: build bigger systems when individual chips cannot yet match the most advanced Western alternatives. The company is testing a new AI processor, the Ascend 910D, while also promoting CloudMatrix 384, a rack-scale system built from earlier Ascend 910C chips.
The approach reflects the constraints Huawei faces. The most powerful Nvidia chips are no longer permitted for sale in China due to export restrictions, and Huawei remains affected by US sanctions. Instead of relying only on single-chip gains, Huawei is trying to compete through packaging, interconnects, memory, and system size.
Huawei's next AI chip is still early
Huawei Technologies is currently testing the Ascend 910D, a new AI processor that it hopes will eventually replace some of Nvidia's high-end products. According to the Wall Street Journal, Huawei expects to receive the first samples of the 910D in May 2025.
The chip is designed to outperform Nvidia's H100, which has been the industry standard for AI training since 2022. In Western markets, the H100 is now being replaced by successors from the Blackwell generation.
But the Ascend 910D is not presented as a simple one-for-one efficiency victory. Compared with Nvidia's H100, it is less energy efficient and has significantly higher power consumption. Huawei is using new packaging technologies to connect multiple silicon dies and increase performance.
Development remains at an early stage. Comprehensive testing will decide when the chip is ready for the market, so the Ascend 910D is better understood as a strategic direction than as a finished answer.
CloudMatrix 384 shows the system-first plan
In parallel with the Ascend 910D work, Huawei has introduced CloudMatrix 384. The system is still based on the earlier Ascend 910C chip, but it connects 384 of those chips into a rack-scale design.
According to SemiAnalysis, CloudMatrix 384 reaches approximately 300 PFLOPs of BF16 compute performance, roughly 1.7 times that of Nvidia's GB200 NVL72 system. Nvidia recently introduced that system's successor, the GB300 NVL72.
The system also has a major memory advantage over Nvidia's comparable offering. SemiAnalysis reports 3.6 times greater aggregate capacity and 2.1 times the memory bandwidth.
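A rough per-chip normalization shows where these rack-level figures come from. The sketch below is illustrative only: it assumes the GB200 NVL72 holds 72 GPUs (implied by the product name, not stated in this article) and uses the 1.7x compute ratio implied by the "70 percent more FLOPS" figure that SemiAnalysis cites. The variable and function names are hypothetical.

```python
# Rough per-chip normalization of the rack-level ratios quoted above.
# Assumption: the GB200 NVL72 holds 72 GPUs (taken from the product name,
# not from the article). All other inputs are figures quoted in the article.

HUAWEI_CHIPS = 384
NVIDIA_GPUS = 72  # assumption, see above

compute_ratio = 1.7           # "70 percent more FLOPS" at rack level
memory_capacity_ratio = 3.6   # aggregate memory capacity, Huawei / Nvidia
memory_bandwidth_ratio = 2.1  # aggregate memory bandwidth, Huawei / Nvidia

chip_count_ratio = HUAWEI_CHIPS / NVIDIA_GPUS


def per_chip(rack_ratio: float) -> float:
    """Convert a rack-level Huawei/Nvidia ratio into a per-chip ratio."""
    return rack_ratio / chip_count_ratio


per_chip_compute = per_chip(compute_ratio)
per_chip_capacity = per_chip(memory_capacity_ratio)
per_chip_bandwidth = per_chip(memory_bandwidth_ratio)

print(f"chips per system:   {chip_count_ratio:.2f}x")
print(f"compute per chip:   {per_chip_compute:.2f}x")
print(f"capacity per chip:  {per_chip_capacity:.2f}x")
print(f"bandwidth per chip: {per_chip_bandwidth:.2f}x")
```

Under these assumptions, each Ascend 910C delivers roughly a third of the compute of one Nvidia GPU in the comparison system. The system-level lead comes from deploying more than five times as many chips.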
The tradeoff is power. CloudMatrix 384 draws about 4.1 times as much energy as Nvidia's comparable system and uses roughly 2.5 times as much energy per FLOP. In plain terms, Huawei is using more chips, more infrastructure, and more electricity to reach competitive AI infrastructure performance.
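The two power figures can be cross-checked against the compute figure with simple division. This is a back-of-the-envelope sketch, not SemiAnalysis's own calculation; it takes the 1.7x compute ratio from the "70 percent more FLOPS" figure cited elsewhere in this article, and the names are hypothetical.

```python
# Cross-check: energy per FLOP = relative power draw / relative compute.
# Inputs are figures quoted in the article; the function name is illustrative.

def energy_per_flop_ratio(power_ratio: float, compute_ratio: float) -> float:
    """Return how much more energy one system spends per FLOP than the other."""
    return power_ratio / compute_ratio


power_ratio = 4.1    # CloudMatrix 384 energy use relative to Nvidia's system
compute_ratio = 1.7  # "70 percent more FLOPS" relative to Nvidia's system

ratio = energy_per_flop_ratio(power_ratio, compute_ratio)
print(f"energy per FLOP: {ratio:.2f}x")  # prints "energy per FLOP: 2.41x"
```

The result is close to the reported factor of 2.5. The small gap is consistent with rounding in the published ratios.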
Optical links are a defining feature
One of the most notable design choices in CloudMatrix 384 is the interconnect. Huawei has eliminated copper cables entirely and uses a fully optical interconnect.
According to SemiAnalysis, the system uses 6,912 400G transceivers in total. Each of the 384 chips is attached to seven of those transceivers for internal scaling.
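The transceiver counts above can be broken down with a few lines of arithmetic. The sketch assumes "400G" means 400 Gbit/s per transceiver. The article does not say where the transceivers not attached to a chip are located, so the code reports them only as a remainder.

```python
# Arithmetic on the transceiver figures quoted above.
# Assumption: "400G" is read as 400 Gbit/s per transceiver.

TOTAL_TRANSCEIVERS = 6_912
CHIPS = 384
TRANSCEIVERS_PER_CHIP = 7
LINK_RATE_GBPS = 400  # assumption, see above

# Transceivers attached directly to the 384 chips.
chip_side = CHIPS * TRANSCEIVERS_PER_CHIP

# Transceivers the article does not attribute to a specific location.
remainder = TOTAL_TRANSCEIVERS - chip_side

# Average number of transceivers in the system per chip.
average_per_chip = TOTAL_TRANSCEIVERS / CHIPS

# Raw optical line rate attached to each chip.
per_chip_gbps = TRANSCEIVERS_PER_CHIP * LINK_RATE_GBPS

print(f"chip-side transceivers: {chip_side}")
print(f"remaining transceivers: {remainder}")
print(f"average per chip:       {average_per_chip:.0f}")
print(f"line rate per chip:     {per_chip_gbps} Gbit/s")
```

The seven transceivers per chip account for 2,688 of the 6,912 units. The remaining 4,224 presumably sit at the switch end of those links and in other network tiers, though that reading is an assumption rather than something stated here.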
This is not only a matter of speed. It shows how Huawei is treating the system as the main unit of competition. If individual chips are constrained, the links between chips become more important.
SemiAnalysis notes that the architecture resembles concepts Nvidia previously abandoned due to cost. Analysts consider Huawei to be “one generation” ahead of Nvidia and AMD in this respect.
Sanctions have not removed foreign dependencies
Huawei's progress does not mean it has escaped the global semiconductor supply chain. Despite sweeping US sanctions, the company remains dependent on foreign suppliers for chip manufacturing.
SemiAnalysis reports that the previously shipped Ascend 910B and 910C chips were produced by TSMC in Taiwan. Huawei is said to have arranged for these chips through intermediary firms such as Sophgo. TSMC could face a penalty of up to $1 billion as a result.
High bandwidth memory is another dependency. According to SemiAnalysis, Huawei acquired large volumes of HBM stacks from Samsung, accumulating around 13 million units in storage. Despite export controls, HBM reached China via intermediaries such as Faraday and CoAsia.
China's largest chip manufacturer, SMIC, has expanded its 7-nanometer production capacity, according to SemiAnalysis. Even so, it continues to lag behind leading manufacturers in both yield and technology. That capacity could grow further in the medium term, provided export controls do not become stricter.
The strategic bet is scale over efficiency
The pattern is consistent across Huawei's AI hardware work. The company is prioritizing system-level optimization rather than maximizing individual chip performance.
That means building large, interconnected systems to achieve scale. SemiAnalysis observes that CloudMatrix 384 leverages China's nearly unlimited power supply to provide competitive AI infrastructure, despite high energy consumption.
According to SemiAnalysis, the system delivers 70 percent more FLOPS than Nvidia's current rack, even though its energy efficiency is substantially lower. In China, the additional energy demand is considered acceptable in light of the political priority attached to technological independence.
This makes Huawei's strategy less about matching Nvidia chip for chip and more about changing the level at which the comparison happens. Nvidia remains the benchmark for advanced AI infrastructure, but Huawei is trying to compete by making the full rack, memory pool, and interconnect architecture carry more of the burden.
There is a precedent in consumer hardware. The Mate 60 smartphone showed Huawei's continued ability to deploy high-performance chips despite US sanctions, using a processor manufactured in China. In AI infrastructure, the same pressure is producing a different response: not just a domestic chip, but a much larger system built to compensate for the limits around each chip.