Dylan Patel delivers a startling assessment of the artificial intelligence hardware landscape: the era of Nvidia's unchallenged dominance may be ending, not through a superior chip, but through a radical shift in how infrastructure is sold and financed. While the industry fixates on microarchitecture wars, Patel argues that the real story is Google's decision to finally commercialize its Tensor Processing Units (TPUs) to external rivals, creating a dynamic in which Nvidia's biggest customers can save money simply by threatening to switch. This is not just a product launch; it is a structural disruption of the entire AI supply chain that redefines who holds the leverage in the race for intelligence.
The End of the Nvidia Moat
Patel's central thesis challenges the conventional wisdom that Nvidia's CUDA software ecosystem is an impenetrable fortress. He posits that the hardware itself, when paired with Google's system-level engineering, offers such a compelling cost advantage that the software barrier is becoming porous. "The more (TPU) you buy, the more (NVIDIA GPU capex) you save!" Patel writes, highlighting a counterintuitive reality where the mere threat of adopting Google's chips has already driven significant savings for competitors. He points to OpenAI as a prime example, noting that the company "hasn't even deployed TPU yet and already increased perf per TCO by getting ~30% off their compute fleet due to competitive threats."
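The ~30% discount translates into a larger perf-per-TCO gain than it might first appear, since performance per dollar rises by the reciprocal of the price cut. A minimal arithmetic sketch (only the ~30% figure comes from the article; the unit values are illustrative):

```python
# Illustrative sketch: how a price discount maps to perf per TCO.
# Only the ~30% discount comes from the article; unit values are made up.

def perf_per_tco(performance: float, total_cost: float) -> float:
    """Performance delivered per dollar of total cost of ownership."""
    return performance / total_cost

baseline = perf_per_tco(performance=1.0, total_cost=1.00)
discounted = perf_per_tco(performance=1.0, total_cost=0.70)  # ~30% off

improvement = discounted / baseline - 1
print(f"perf per TCO improvement: {improvement:.0%}")  # ~43%, not 30%
```

The asymmetry is why the threat alone is so potent: a 30% price concession delivers roughly a 43% improvement in performance per dollar spent.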
This argument reframes the competition from a technical race to a financial one. Patel suggests that Nvidia's strategy of propping up cash-burning startups with equity investments is a defensive maneuver to avoid price wars that would erode their massive gross margins. The author writes, "We think a more realistic explanation is that Nvidia aims to protect its dominant position at the foundation labs by offering equity investment rather than cutting prices, which would lower Gross margins and cause widespread investor panic." This is a sharp, cynical read of the market dynamics that goes beyond the typical "innovation vs. competition" narrative.
Critics might note that underestimating the stickiness of CUDA is a dangerous gamble; developers are deeply entrenched in Nvidia's tooling, and switching costs are notoriously high. However, Patel's evidence suggests that when the cost savings are as drastic as 30%, even the most entrenched users will reconsider their architecture.
The hardware infrastructure on which AI software runs has a far larger impact on capex and opex, and consequently on gross margins, than in earlier generations of software, where developer costs were the dominant expense.
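The point above can be made concrete with a simple gross-margin comparison, using entirely hypothetical revenue and cost figures (none are from the article):

```python
# Illustrative only: how compute-heavy COGS compresses gross margin.
# All revenue and cost figures are hypothetical, not from the article.

def gross_margin(revenue: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cogs) / revenue

# Classic SaaS: hosting is a small slice of revenue.
saas = gross_margin(revenue=100.0, cogs=20.0)

# AI service: inference hardware dominates cost of goods sold.
ai = gross_margin(revenue=100.0, cogs=60.0)

print(f"SaaS-style gross margin: {saas:.0%}")    # 80%
print(f"compute-heavy gross margin: {ai:.0%}")   # 40%
```

Under these assumed numbers, tripling the infrastructure share of cost halves the margin, which is why hardware choice has become a board-level question rather than a procurement detail.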
The Anthropic Deal and the Neocloud Revolution
The article pivots from theory to a concrete, massive transaction: the deal between Google and Anthropic. This is not a standard cloud rental agreement; it is a multi-billion dollar restructuring of how AI labs acquire power and compute. Patel details a split where Anthropic will deploy 400,000 TPUs in its own facilities, with the remaining 600,000 rented through Google's cloud. The financial scale is staggering, with the rental portion alone estimated at a $42 billion Remaining Performance Obligation (RPO).
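To put the rental portion in perspective, a back-of-envelope division gives the implied obligation per rented chip (the $42 billion RPO and 600,000-chip figures come from the deal as described; the per-chip number is just derived arithmetic, before any assumptions about contract length):

```python
# Back-of-envelope: implied RPO per rented TPU in the Anthropic deal.
# Inputs are from the deal as described; the per-chip figure is derived.

rpo_usd = 42e9          # remaining performance obligation for the rental
rented_tpus = 600_000   # TPUs rented through Google Cloud

rpo_per_tpu = rpo_usd / rented_tpus
print(f"implied RPO per rented TPU: ${rpo_per_tpu:,.0f}")  # $70,000
```

That is $70,000 of contracted revenue per rented chip over the life of the agreement, a figure that dwarfs the marginal cost of the silicon itself and illustrates why cloud rental, not chip sales, is where the economics concentrate.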
Patel identifies a critical bottleneck in Google's expansion: power. While Google controls its silicon, it is slowed by the administrative process of securing datacenter capacity. To bypass this, the company has pioneered a new financing model involving "Neocloud" providers like Fluidstack. "Instead of leasing directly, Google offers a credit backstop, an off-balance-sheet 'IOU' to step in if Fluidstack cannot pay its datacenter rent," Patel explains. This mechanism allows nimble, crypto-mining-converted datacenters to secure capacity without the decade-long leases that typically stifle growth.
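The backstop mechanism described above is essentially a conditional guarantee: the tenant pays what it can, and the guarantor covers any shortfall to the landlord. A toy model, with all names and amounts illustrative rather than from the article:

```python
# Toy model of the credit-backstop mechanism: a guarantor (Google, per
# the article) covers any shortfall in the tenant's (Fluidstack's) rent.
# The amounts here are illustrative, not from the article.

def settle_rent(tenant_cash: float, rent_due: float) -> dict:
    """Split a rent obligation between the tenant and the backstop guarantor."""
    tenant_pays = min(tenant_cash, rent_due)
    backstop_pays = rent_due - tenant_pays  # the off-balance-sheet "IOU"
    return {"tenant": tenant_pays, "backstop": backstop_pays}

print(settle_rent(tenant_cash=80.0, rent_due=100.0))
# {'tenant': 80.0, 'backstop': 20.0}
```

The landlord is made whole either way, which is what lets it sign with a thinly capitalized Neocloud: the credit risk it is actually underwriting is Google's, not Fluidstack's.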
This approach effectively turns the datacenter industry on its head. By leveraging the existing power infrastructure of reformed cryptominers, Google is accelerating its deployment timeline. Patel notes that "the datacenter industry faces acute power constraints, and cryptominers already control capacity through their PPAs and existing electrical infrastructure." This creates a symbiotic relationship where Google gets speed, and the miners get a pivot to the AI boom.
A large datacenter lease typically runs 15+ years, against a typical payback period of ~8 years. This duration mismatch has made it difficult for both Neoclouds and datacenter vendors to secure financing for projects.
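The mismatch is easy to see with a simple payback-period calculation. Only the ~8-year payback and 15+ year lease term come from the text; the capex and cash-flow figures below are hypothetical:

```python
# Sketch of the duration mismatch between datacenter payback and lease
# terms. Only the ~8-year payback and 15+ year lease come from the text;
# the dollar figures are hypothetical.

def payback_years(capex: float, annual_net_cash_flow: float) -> float:
    """Years of constant net cash flow needed to recover upfront capex."""
    return capex / annual_net_cash_flow

# Hypothetical: a $1B build recovered at $125M/year of net cash flow.
years = payback_years(capex=1_000_000_000, annual_net_cash_flow=125_000_000)
lease_term = 15  # years, per the text

print(f"payback: {years:.0f} years; lease commitment: {lease_term}+ years")
```

Under these assumed numbers, the tenant must commit for nearly twice as long as the project takes to pay back, which is precisely the gap the credit backstop is designed to bridge for shorter-lived Neocloud counterparties.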
Systems Matter More Than Silicon
Patel returns to the technical core, arguing that the industry has been fixated on the wrong metric. For years, analysts compared chip specifications in isolation, ignoring how they function within a full system. "We argued then that 'systems matter more than microarchitecture,' and the past two years have reinforced this view," Patel writes. He draws a parallel to the history of vertical integration, noting that while Nvidia is now pushing toward becoming a true systems company with its GB200, Google has been scaling up TPU interconnects within and across racks since 2017.
The proof of this systems advantage is the success of Google's own Gemini 3 model. "Gemini 3 is one of the best models in the world and was trained entirely on TPUs," Patel states, using this as the ultimate validation of the platform. He contrasts this with the struggles of rivals, pointing out that "OpenAI's leading researchers have not completed a successful full-scale pre-training run that was broadly deployed for a new frontier model since GPT-4o in May 2024." This comparison underscores the difficulty of training frontier models at scale and positions Google's infrastructure as the only proven alternative to Nvidia.
However, a counterargument worth considering is that training a model is only half the battle; the inference ecosystem, which powers real-world applications, remains heavily dependent on Nvidia's mature software stack. Patel acknowledges this, identifying the "critical missing ingredient" for Google as the need to open-source its XLA compiler and runtime code to truly break the CUDA moat. Without this software openness, the hardware advantage may not be enough to win the long-term war for developer mindshare.
Bottom Line
Patel's analysis is a powerful reminder that in the AI era, infrastructure is not just a utility but a strategic weapon that dictates the economics of the entire industry. The strongest part of his argument is the demonstration of how Google is using financial engineering and system-level optimization to bypass Nvidia's hardware dominance, forcing the market to reprice its expectations. The biggest vulnerability remains the software ecosystem; until Google opens its compiler stack, the CUDA moat may still be too deep to cross for many developers. The reader should watch closely for the next wave of deals between Google and other foundation labs, as these will determine whether the TPU becomes a true merchant standard or a niche alternative.