
TPUv7: Google takes a swing at the king

Dylan Patel delivers a startling assessment of the artificial intelligence hardware landscape: the era of Nvidia's unchallenged dominance may be ending, not through a superior chip, but through a radical shift in how infrastructure is sold and financed. While the industry fixates on microarchitecture wars, Patel argues that the real story is Google's decision to finally commercialize its Tensor Processing Units (TPUs) to external rivals, creating a "circular economy" in which competitors can extract savings simply by threatening to switch. This is not just a product launch; it is a structural disruption of the entire AI supply chain that redefines who holds the leverage in the race for intelligence.

The End of the Nvidia Moat

Patel's central thesis challenges the conventional wisdom that Nvidia's CUDA software ecosystem is an impenetrable fortress. He posits that the hardware itself, when paired with Google's system-level engineering, offers such a compelling cost advantage that the software barrier is becoming porous. "The more (TPU) you buy, the more (NVIDIA GPU capex) you save!" Patel writes, highlighting a counterintuitive reality where the mere threat of adopting Google's chips has already driven significant savings for competitors. He points to OpenAI as a prime example, noting that the company "hasn't even deployed TPU yet and already increased perf per TCO by getting ~30% off their compute fleet due to competitive threats."
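The arithmetic behind that ~30% figure is worth making explicit. A quick back-of-the-envelope sketch (our illustration, not Patel's model) shows how a fleet-wide discount translates into performance per total cost of ownership: holding performance fixed, perf per TCO scales with the inverse of price.

```python
# Back-of-the-envelope sketch: how a fleet-wide price discount maps to
# perf per TCO. Simplifying assumption: the discount applies to the
# entire TCO (in practice, power and facility costs would dilute it).
def perf_per_tco_gain(discount: float) -> float:
    """Relative perf-per-TCO improvement if price falls by `discount`."""
    return 1.0 / (1.0 - discount)

print(f"{perf_per_tco_gain(0.30):.2f}x better perf per TCO")  # 1.43x
```

Under this simplification, a 30% price concession is equivalent to a ~43% jump in perf per dollar, without deploying a single TPU.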


This argument reframes the competition from a technical race to a financial one. Patel suggests that Nvidia's strategy of propping up cash-burning startups with equity investments is a defensive maneuver to avoid price wars that would erode their massive gross margins. The author writes, "We think a more realistic explanation is that Nvidia aims to protect its dominant position at the foundation labs by offering equity investment rather than cutting prices, which would lower Gross margins and cause widespread investor panic." This is a sharp, cynical read of the market dynamics that goes beyond the typical "innovation vs. competition" narrative.

Critics might note that underestimating the stickiness of CUDA is a dangerous gamble; developers are deeply entrenched in Nvidia's tooling, and switching costs are notoriously high. However, Patel's evidence suggests that when the cost savings are as drastic as 30%, even the most entrenched users will reconsider their architecture.

The hardware infrastructure on which AI software runs has a notably larger impact on Capex and Opex, and subsequently the gross margins, in contrast to earlier generations of software, where developer costs were relatively larger.

The Anthropic Deal and the Neocloud Revolution

The article pivots from theory to a concrete, massive transaction: the deal between Google and Anthropic. This is not a standard cloud rental agreement; it is a multi-billion dollar restructuring of how AI labs acquire power and compute. Patel details a split where Anthropic will deploy 400,000 TPUs in its own facilities, with the remaining 600,000 rented through Google's cloud. The financial scale is staggering, with the rental portion alone estimated at a $42 billion Remaining Performance Obligation (RPO).
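The deal's rough unit economics can be sanity-checked from the figures above. This is a sketch using only the numbers in this section; the contract term is not stated, so the result is total contract value per rented chip, not an annual rate.

```python
# Sanity-check the Google/Anthropic deal figures quoted above.
rented_tpus = 600_000        # rented through Google Cloud
self_hosted_tpus = 400_000   # deployed in Anthropic's own facilities
rpo_usd = 42e9               # Remaining Performance Obligation (rental side)

total_tpus = rented_tpus + self_hosted_tpus
value_per_rented_chip = rpo_usd / rented_tpus

print(f"total chips in the deal: {total_tpus:,}")                     # 1,000,000
print(f"implied RPO per rented chip: ${value_per_rented_chip:,.0f}")  # $70,000
```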

Patel identifies a critical bottleneck in Google's expansion: power. While Google controls its silicon, it struggles with the administrative speed of securing datacenter capacity. To bypass this, the company has pioneered a new financing model involving "Neocloud" providers like Fluidstack. "Instead of leasing directly, Google offers a credit backstop, an off-balance-sheet 'IOU' to step in if Fluidstack cannot pay its datacenter rent," Patel explains. This mechanism allows nimble, crypto-mining-converted datacenters to secure capacity without the decade-long leases that typically stifle growth.
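The backstop described above is, in essence, a contingent payment: Google owes the landlord only the shortfall if the tenant cannot pay. A minimal sketch, with illustrative numbers rather than actual deal terms:

```python
# Minimal sketch of the credit-backstop mechanism: Google pays the
# datacenter landlord only the unpaid portion of the Neocloud tenant's
# rent. Numbers below are illustrative, not from the deal terms.
def backstop_payment(rent_due: float, tenant_available_cash: float) -> float:
    """Google's contingent obligation: the shortfall on the rent, if any."""
    return max(0.0, rent_due - tenant_available_cash)

print(backstop_payment(100.0, 100.0))  # 0.0  -> tenant pays in full
print(backstop_payment(100.0, 40.0))   # 60.0 -> Google covers the shortfall
```

Because the obligation is contingent, it sits off Google's balance sheet while still giving the landlord's lenders a credit-worthy counterparty, which is what unlocks the financing.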

This approach effectively turns the datacenter industry on its head. By leveraging the existing power infrastructure of reformed cryptominers, Google is accelerating its deployment timeline. Patel notes that "the datacenter industry faces acute power constraints, and cryptominers already control capacity through their PPAs and existing electrical infrastructure." This creates a symbiotic relationship where Google gets speed, and the miners get a pivot to the AI boom.

A large datacenter lease is typically 15+ years, with a typical payback period of ~8 years. This duration mismatch has made it very complicated for both Neoclouds and Datacenter vendors to secure financing for projects.
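The duration mismatch in that pull-quote can be made concrete with hypothetical figures (ours, not the article's): a developer recovers its build cost after roughly eight years of rent, yet lenders demand a 15+ year tenant commitment before financing the project at all.

```python
# Illustration of the lease-duration mismatch (hypothetical figures):
# the developer's capex is recovered after ~8 years of rent, but the
# tenant must commit to 15+ years for lenders to finance the build.
def payback_years(build_capex: float, annual_rent: float) -> float:
    """Simple (undiscounted) payback period for a datacenter build."""
    return build_capex / annual_rent

lease_years = 15
pb = payback_years(build_capex=1e9, annual_rent=125e6)
print(f"payback: {pb:.0f} years, margin tail: {lease_years - pb:.0f} years")
```

The gap between the eight-year payback and the 15-year commitment is exactly the risk a Neocloud cannot absorb on its own, and the gap Google's backstop is designed to bridge.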

Systems Matter More Than Silicon

Patel returns to the technical core, arguing that the industry has been fixated on the wrong metric. For years, analysts compared chip specifications in isolation, ignoring how they function within a full system. "We argued then that 'systems matter more than microarchitecture,' and the past two years have reinforced this view," Patel writes. He draws a parallel to the history of vertical integration, noting that while Nvidia is now pushing toward becoming a true systems company with its GB200, Google has been scaling up TPU interconnects within and across racks since 2017.

The proof of this systems advantage is the success of Google's own Gemini 3 model. "Gemini 3 is one of the best models in the world and was trained entirely on TPUs," Patel states, using this as the ultimate validation of the platform. He contrasts this with the struggles of rivals, pointing out that "OpenAI's leading researchers have not completed a successful full-scale pre-training run that was broadly deployed for a new frontier model since GPT-4o in May 2024." This comparison underscores the difficulty of training frontier models at scale and positions Google's infrastructure as the only proven alternative to Nvidia.

However, a counterargument worth considering is that training a model is only half the battle; the inference ecosystem, which powers real-world applications, remains heavily dependent on Nvidia's mature software stack. Patel acknowledges this, identifying the "critical missing ingredient" for Google as the need to open-source its XLA compiler and runtime code to truly break the CUDA moat. Without this software openness, the hardware advantage may not be enough to win the long-term war for developer mindshare.

The TPU platform has passed that test decisively. This stands in sharp contrast to rivals: OpenAI's leading researchers have not completed a successful full-scale pre-training run that was broadly deployed for a new frontier model since GPT-4o in May 2024.

Bottom Line

Patel's analysis is a powerful reminder that in the AI era, infrastructure is not just a utility but a strategic weapon that dictates the economics of the entire industry. The strongest part of his argument is the demonstration of how Google is using financial engineering and system-level optimization to bypass Nvidia's hardware dominance, forcing the market to reprice its expectations. The biggest vulnerability remains the software ecosystem; until Google opens its compiler stack, the CUDA moat may still be too deep to cross for many developers. The reader should watch closely for the next wave of deals between Google and other foundation labs, as these will determine whether the TPU becomes a true merchant standard or a niche alternative.


Deep Dives

Explore these related deep dives:

  • Tensor Processing Unit

    The article centers on Google's TPU chips competing with Nvidia, but readers may not understand the technical architecture that makes TPUs specialized for AI workloads versus general-purpose GPUs.

  • CUDA

    The article mentions Nvidia's 'CUDA moat' as a key competitive advantage that Google needs to overcome; understanding CUDA's parallel computing platform explains why software lock-in is so powerful in the GPU market.

  • Vertical integration

    The article describes Google's strategy of controlling the full stack from silicon to software, and the 'circular economy' criticism of Nvidia funding startups; this business strategy concept provides crucial context for understanding the competitive dynamics.

Sources

TPUv7: Google takes a swing at the king

by Dylan Patel · SemiAnalysis

The two best models in the world, Anthropic’s Claude 4.5 Opus and Google’s Gemini 3 have the majority of their training and inference infrastructure on Google’s TPUs and Amazon’s Trainium. Now Google is selling TPUs physically to multiple firms. Is this the end of Nvidia’s dominance?

The dawn of the AI era is here, and it is crucial to understand that the cost structure of AI-driven software deviates considerably from traditional software. Chip microarchitecture and system architecture play a vital role in the development and scalability of these innovative new forms of software. The hardware infrastructure on which AI software runs has a notably larger impact on Capex and Opex, and subsequently the gross margins, in contrast to earlier generations of software, where developer costs were relatively larger. Consequently, it is even more crucial to devote considerable attention to optimizing your AI infrastructure to be able to deploy AI software. Firms that have an advantage in infrastructure will also have an advantage in the ability to deploy and scale applications with AI.

Google had peddled the idea of building AI-specific infrastructure as far back as 2006, but the problem came to a boiling point in 2013. They realized they needed to double the number of datacenters they had if they wanted to deploy AI at any scale. As such, they started laying the groundwork for their TPU chips, which were put into production in 2016. It’s interesting to compare this to Amazon, which realized in that same year, 2013, that it needed to build custom silicon too, and started the Nitro Program, focused on developing silicon to optimize general-purpose CPU computing and storage. Two very different companies optimized their efforts for infrastructure for different eras of computing and software paradigms.

We’ve long believed that the TPU is among the world’s best systems for AI training and inference, neck and neck with king of the jungle Nvidia. 2.5 years ago we wrote about TPU supremacy, and this thesis has proven to be very correct.

TPU’s results speak for themselves: Gemini 3 is one of the best models in the world and was trained entirely on TPUs. In this report, we will talk about the huge changes in Google’s strategy to properly commercialize the TPU for external customers, becoming the newest and most threatening merchant silicon challenger to Nvidia.

We plan to:

(Re-)Educate our clients and new readers about the ...