Ars Technica AI November 21, 2025

Why Google says AI demand requires capacity to double fast

Google’s AI infrastructure head Amin Vahdat told employees the company must double serving capacity every six months. The target sits alongside a larger goal to scale “the next 1000x in 4-5 years” while keeping cost and power under control.

Google is telling employees that the next stage of AI competition will be defined not only by models and products, but by the infrastructure needed to run them. According to CNBC, AI infrastructure head Amin Vahdat told staff during an all-hands meeting earlier this month that Google must double its serving capacity every six months to keep pace with demand for artificial intelligence services.

The message points to a central tension in the AI industry. Public discussion includes concern about a potential AI bubble and the risk of overinvestment. Inside companies building major AI services, however, leaders are describing a different pressure: they say they cannot expand compute, networking, storage, and data center capacity fast enough.

The scale Google says it needs

Vahdat, a vice president at Google Cloud, presented slides to employees showing that Google needs to scale “the next 1000x in 4-5 years.” That target involves more than buying additional hardware. In the meeting, he said Google must deliver those increases in capability, compute, storage, and networking “for essentially the same cost and increasingly, the same power, the same energy level.”

That requirement changes the nature of the challenge. If Google has to increase capacity dramatically but cannot let cost and energy use grow at the same pace, then efficiency becomes as important as raw expansion. Vahdat framed the path forward as a matter of technical coordination across systems.
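The two figures Vahdat cited, doubling every six months and 1000x in 4-5 years, can be checked against each other with compounding arithmetic. The sketch below assumes clean, uninterrupted doubling and reads “the same power” literally as flat power. Both are idealizations, and the function names are illustrative rather than anything from Google’s presentation.

```python
import math


def growth_factor(doubling_period_months: float, horizon_years: float) -> float:
    """Total capacity multiplier after doubling once per period over the horizon."""
    doublings = horizon_years * 12 / doubling_period_months
    return 2.0 ** doublings


def years_to_reach(target_multiplier: float, doubling_period_months: float) -> float:
    """Years of steady doubling needed to reach the target multiplier."""
    return math.log2(target_multiplier) * doubling_period_months / 12


def required_efficiency_gain(capacity_multiplier: float, power_multiplier: float = 1.0) -> float:
    """Work-per-watt improvement needed when capacity outgrows power.

    power_multiplier=1.0 is the assumed "flat power" case.
    """
    return capacity_multiplier / power_multiplier


if __name__ == "__main__":
    for years in (4, 4.5, 5):
        print(f"{years} years of six-month doubling: {growth_factor(6, years):,.0f}x")
    print(f"Years of doubling to reach 1000x: {years_to_reach(1000, 6):.2f}")
    print(f"Efficiency gain needed at flat power: {required_efficiency_gain(1000):,.0f}x")
```

Under these assumptions, ten doublings over five years give about 1,024x, which matches the upper end of the “4-5 years” window, while four years of doubling gives 256x. If power stayed flat, the work done per unit of energy would have to improve by the same factor as capacity.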

“It won’t be easy but through collaboration and co-design, we’re going to get there.”

The reporting does not fully explain what is behind this demand. It is unclear how much comes from users actively seeking AI features and how much comes from Google adding AI capabilities to existing services such as Search, Gmail, and Workspace. Either way, the company is treating the resulting load as a capacity problem it must solve quickly.

Why AI infrastructure has become the race

Vahdat described AI infrastructure as central to competition in the field. According to CNBC’s viewing of the presentation, he told employees:

“The competition in AI infrastructure is the most critical and also the most expensive part of the AI race.”

He also made clear that spending alone is not the full strategy. Google expects to spend heavily, but the stated goal is to build infrastructure that is “more reliable, more performant and more scalable than what’s available anywhere else.” That means the company is measuring itself not just by how much capacity it can buy, but by how well its systems work under rising demand.

This helps explain why data centers, chips, model efficiency, and power use are now closely linked in AI strategy. A popular AI service can be limited by the availability of compute even when users are ready to use it. Capacity, in that sense, becomes a product constraint: if the infrastructure is not there, features cannot reach as many people as the company wants.

Compute constraints are already shaping products

Google CEO Sundar Pichai gave employees a concrete example during the all-hands meeting on November 6. He pointed to Veo, Google’s video generation tool that received an upgrade last month, and said capacity limits affected how widely Google could make it available in the Gemini app.

“When Veo launched, how exciting it was,” Pichai said. “If we could’ve given it to more people in the Gemini app, I think we would have gotten more users but we just couldn’t because we are at a compute constraint.”

That example shows how infrastructure can limit AI adoption even when a company has a feature it wants to promote. In AI, serving users is not only a question of software release timing. It also depends on whether enough compute is available to run demanding features at the required scale.

Google is not alone in facing these constraints. OpenAI is planning to build six massive data centers across the US through its Stargate partnership project with SoftBank and Oracle, committing over $400 billion in the next three years to reach nearly 7 gigawatts of capacity. The company is also serving 800 million weekly ChatGPT users, with paid subscribers regularly hitting usage limits for features like video synthesis and simulated reasoning models.

The chip bottleneck and Google’s response

One bottleneck in meeting AI demand is the supply of GPUs used to accelerate AI computations. Nvidia said its AI chips are “sold out” as it works to meet demand that grew its data center revenue by $10 billion in a single quarter.

Google’s plan, as outlined in Vahdat’s presentation, includes three main strategies: building physical infrastructure, developing more efficient AI models, and designing custom silicon chips.

The custom chip strategy matters because it reduces Google’s reliance on Nvidia hardware. Earlier this month, Google announced the general availability of its seventh-generation Tensor Processing Unit (TPU), called Ironwood. Google claims Ironwood is “nearly 30x more power efficient” than its first Cloud TPU from 2018.

That efficiency claim connects directly to Vahdat’s broader point about cost and energy. If Google needs much more AI serving capacity while trying to maintain the same power or energy level, it needs improvements across both hardware and software. More efficient models can reduce the compute required per task, while custom silicon can improve how efficiently the company runs workloads at scale.
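A rough calculation shows why hardware gains alone may not close the gap. The sketch below converts the “nearly 30x” Ironwood claim and the 1000x target into constant yearly multipliers. It assumes about seven years between the 2018 Cloud TPU and Ironwood, treats “the same power” as literally flat, and compares a per-chip efficiency claim with a fleet-wide capacity goal, so it is an order-of-magnitude illustration and not a like-for-like comparison.

```python
def annualized_gain(total_gain: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_gain over the given years."""
    return total_gain ** (1.0 / years)


# Assumption: roughly 7 years separate the 2018 Cloud TPU and Ironwood (2025).
hardware_per_year = annualized_gain(30, 7)

# Assumption: power stays flat, so efficiency must rise by the full 1000x.
required_over_5_years = annualized_gain(1000, 5)
required_over_4_years = annualized_gain(1000, 4)

# Yearly factor left over for models, software, and system design to cover.
gap_over_5_years = required_over_5_years / hardware_per_year
gap_over_4_years = required_over_4_years / hardware_per_year

if __name__ == "__main__":
    print(f"Chip efficiency trend: {hardware_per_year:.2f}x per year")
    print(f"Needed for 1000x in 5 years: {required_over_5_years:.2f}x per year")
    print(f"Needed for 1000x in 4 years: {required_over_4_years:.2f}x per year")
    print(f"Remaining yearly gap: {gap_over_5_years:.2f}x to {gap_over_4_years:.2f}x")
```

On these assumptions the chip-level trend works out to roughly 1.6x per year, against roughly 4x to 5.6x per year for the overall target. That difference is consistent with a plan that pairs custom silicon with more efficient models and new physical infrastructure.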

A large bet under bubble concerns

The expansion plans arrive while concern about a potential AI industry bubble remains part of the conversation. That risk is widely acknowledged, including by Pichai, who discussed it at length in a recent BBC interview. Google’s posture suggests the company sees underinvestment as a major threat, even though overcapacity could become costly if demand does not keep increasing as expected.

Pichai told employees that 2026 will be “intense,” citing AI competition and pressure to meet cloud and compute demand. He also addressed employee concerns about a possible AI bubble, saying the topic has been “definitely in the zeitgeist.”

The result is a high-stakes infrastructure buildout. Google is trying to expand quickly enough to support AI services, avoid product limits caused by compute shortages, and do so without letting cost and power rise unchecked. The company’s internal message is clear: in the AI race, capacity is no longer a background issue. It is one of the main contests.