Google’s Tensor Processing Units are moving from a mostly internal advantage to a direct challenge in the AI chip market. According to SemiAnalysis, that shift is already changing negotiations for high-end AI computing, including a reported OpenAI discount on Nvidia systems.
Google is taking TPUs beyond its own walls
For years, Google largely kept its Tensor Processing Units, or TPUs, for its own AI work. The source article says that approach is changing with the TPUv7 "Ironwood," which SemiAnalysis describes as part of a more aggressive push to sell Google silicon to third parties.
That matters because Nvidia has been the dominant supplier for the most visible AI computing buildouts. If Google can make TPUs available at large scale, buyers gain another serious option when planning training and inference infrastructure.
Anthropic is presented as the headline customer. SemiAnalysis indicates that Anthropic’s deal involves around one million TPUs, split between direct hardware purchases and cloud rentals through the Google Cloud Platform (GCP). The infrastructure needed for that hardware reportedly consumes more than one gigawatt of power.
Why OpenAI’s reported discount matters
The strongest signal in the report is not just that TPUs are being sold, but that their availability may already be affecting Nvidia pricing.
SemiAnalysis reports that OpenAI negotiated a roughly 30 percent discount on its Nvidia fleet because it could credibly threaten to switch to TPUs or other alternatives. In other words, Google’s chips do not need to replace every Nvidia system to influence the market. They only need to be believable enough to change the buyer’s leverage.
The analysts Dylan Patel, Myron Xie, and Daniel Nishball summarized the effect with a line aimed at Nvidia’s own familiar sales language: "The more (TPU) you buy, the more (NVIDIA GPU capex) you save."
That point is important for AI labs and cloud customers. In a market where large deployments require enormous spending, a second viable supplier can affect cost even before a full migration happens.
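The leverage argument comes down to simple arithmetic. The following is a minimal sketch, not a model from the report: the planned-spend figure is a hypothetical placeholder, and only the roughly 30 percent discount comes from SemiAnalysis.

```python
def negotiated_savings(planned_spend: float, discount: float) -> float:
    """Return the amount saved when a fractional discount is applied to planned spend."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be a fraction between 0 and 1")
    return planned_spend * discount


# Hypothetical planned Nvidia spend, in billions of dollars (illustrative only).
planned_spend_bn = 10.0
# Roughly 30 percent, the discount SemiAnalysis reports for OpenAI.
reported_discount = 0.30

saved_bn = negotiated_savings(planned_spend_bn, reported_discount)
print(f"Saved on the Nvidia fleet without migrating: ${saved_bn:.1f}bn")
```

The point of the sketch is that the saving scales with the size of the Nvidia order, not with the number of TPUs actually bought.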
TPUs are being used for top AI models
The report also argues that TPUs should no longer be treated as a lower-tier substitute. According to usage data cited in the source, two recently released frontier models, Google’s Gemini 3 Pro and Anthropic’s Claude Opus 4.5, rely predominantly on Google TPUs and Amazon’s Trainium chips. Gemini 3 was trained entirely on TPUs.
That evidence supports the central market claim: alternatives to Nvidia are not just theoretical. They are already tied to major AI systems.
On paper, TPUv7 "Ironwood" nearly matches Nvidia’s Blackwell generation in both theoretical compute throughput (FLOPS) and memory bandwidth, according to SemiAnalysis. But the more decisive comparison in the report is cost.
For Google itself, the analysts estimate that total cost of ownership (TCO) per chip is roughly 44 percent lower than for a comparable Nvidia GB200 system. For outside customers such as Anthropic, who pay a markup, the analysts’ model says cost per effective compute unit could still be 30 to 50 percent lower than on Nvidia systems.
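To make "cost per effective compute unit" concrete, here is a minimal sketch. It assumes the metric is TCO divided by delivered compute, where delivered compute is theoretical peak times achieved utilization. That definition and every input value below are illustrative assumptions, not SemiAnalysis’s actual model or data.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    tco_per_hour: float   # total cost of ownership per chip-hour (normalized, hypothetical)
    peak_compute: float   # theoretical peak compute (normalized, hypothetical)
    utilization: float    # fraction of peak actually achieved, between 0 and 1

    def cost_per_effective_compute(self) -> float:
        """TCO divided by the compute the chip actually delivers."""
        effective = self.peak_compute * self.utilization
        if effective <= 0:
            raise ValueError("effective compute must be positive")
        return self.tco_per_hour / effective


def relative_saving(candidate: Accelerator, baseline: Accelerator) -> float:
    """Fractional cost reduction of candidate versus baseline per effective compute unit."""
    return 1.0 - candidate.cost_per_effective_compute() / baseline.cost_per_effective_compute()


# Hypothetical inputs: a candidate chip with 40% lower TCO, slightly lower peak
# compute, and the same utilization as the baseline.
baseline = Accelerator("baseline GPU (hypothetical)", tco_per_hour=1.00, peak_compute=1.00, utilization=0.40)
candidate = Accelerator("alternative chip (hypothetical)", tco_per_hour=0.60, peak_compute=0.95, utilization=0.40)

print(f"Saving per effective compute unit: {relative_saving(candidate, baseline):.0%}")
```

Under this definition, utilization matters as much as price: a cheaper chip loses its advantage if software immaturity keeps achieved utilization well below the baseline’s.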
Scale is another part of the TPU pitch. Google’s system can connect up to 9,216 chips into one densely networked domain. The source contrasts that with conventional Nvidia systems, which typically cluster just 64 to 72 chips closely together. For massive AI training runs, a larger tightly coupled domain can make it easier to distribute the workload, at least for teams that tune their software to the hardware.
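The difference in domain size can be illustrated with a short calculation. This is a rough sketch that only counts how many tightly coupled domains a job must span. The job size is a hypothetical example, and the sketch assumes that traffic crossing domain boundaries is what complicates distribution.

```python
def domains_needed(total_chips: int, domain_size: int) -> int:
    """Number of tightly coupled domains required to host a job (ceiling division)."""
    if total_chips <= 0 or domain_size <= 0:
        raise ValueError("chip counts must be positive")
    return -(-total_chips // domain_size)


TPU_DOMAIN_SIZE = 9216            # maximum domain size cited in the report
NVIDIA_DOMAIN_SIZES = (64, 72)    # typical range cited in the report

# Hypothetical training job sized to fill exactly one maximum TPU domain.
job_chips = 9216

for size in (TPU_DOMAIN_SIZE, *NVIDIA_DOMAIN_SIZES):
    count = domains_needed(job_chips, size)
    print(f"domain size {size:>5}: job spans {count} domain(s)")
```

A job that fits inside one 9,216-chip domain would span 128 to 144 domains of 64 to 72 chips, so more of its communication has to cross domain boundaries.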
Software remains the hard part
Hardware cost and scale are not enough on their own. The source identifies software as the long-running obstacle for broader TPU adoption, especially because Nvidia’s CUDA platform has become the industry standard.
Google is trying to reduce that barrier. SemiAnalysis says the company is working on native support for PyTorch and integration with inference libraries like vLLM. The aim is to make TPUs usable without asking developers to rebuild their entire software stack.
Still, one important piece remains closed. The source says the core of the TPU software stack, Google’s XLA compiler for TPUs, is proprietary. SemiAnalysis views that as a missed opportunity because open-sourcing it could have encouraged wider adoption.
Google is also using financing partnerships to expand deployment. The company is working with "neoclouds" like Fluidstack and crypto miners like TeraWulf. In those arrangements, Google often serves as a financial backstop, guaranteeing the rental payments in case an operator fails. The source says this helps convert existing crypto mining data centers into AI facilities more quickly.
Nvidia’s response could change the economics again
The report does not present Google’s advantage as guaranteed. Nvidia is preparing its next-generation "Vera Rubin" chips, expected in 2026 or 2027. The source says those chips will use aggressive design choices, including HBM4 memory and extremely high bandwidth.
Google’s planned TPUv8 response is described as a two-track strategy. One version is being developed with Broadcom under the codename "Sunfish," and another with MediaTek under the codename "Zebrafish." But SemiAnalysis characterizes the designs as conservative and says the project is facing delays. The report also says the architecture avoids the more aggressive use of TSMC’s 2nm process or HBM4 seen in the competition.
The risk for Google is straightforward. If Nvidia executes well on Rubin, the current TPU price advantage could disappear. SemiAnalysis warns that Nvidia’s Rubin systems, specifically the "Kyber Rack," could become more economical than Google’s own TPUv8, even for internal workloads.
That leaves the AI chip market in a newly competitive position. Google has shown that TPUs can be sold, scaled, and used for major AI systems. Nvidia now has to prove that its next generation can preserve its lead not only in performance, but in the cost equation that customers are increasingly using to negotiate.