
How much compute does China have?

"Jordan Schneider's latest analysis cuts through the fog of speculation to offer a startlingly concrete answer to a question that has plagued policymakers: exactly how much computing power does China actually possess? By shifting the lens from counting smuggled chips to measuring the workloads they must support, Schneider arrives at a figure that not only aligns with supply-side estimates but reveals the staggering scale of China's hidden AI infrastructure. This is not just a math exercise; it is a strategic reality check for anyone trying to understand the true trajectory of the global AI arms race.

The Demand-Side Pivot

Most discussions on this topic focus on what the executive branch can block at the border. Schneider, however, argues that export controls are only as effective as our understanding of what those chips actually enable. He writes, "If policymakers are debating whether to tighten restrictions on H200s or close cloud compute loopholes, they should probably have a concrete sense of what China's AI ecosystem actually demands." This reframing is crucial because it moves the debate from theoretical restrictions to empirical necessities. By tallying the compute required for everything from chatbots to surveillance, Schneider bypasses the opacity of Chinese state reporting.

How much compute does China have?

The core of his argument rests on a "back-of-the-envelope calculation" (BOTEC) that triangulates demand across different sectors. He estimates that China's AI infrastructure requires roughly 237,000 high-end graphics processing units running continuously just to serve inference workloads. When adjusted for the reality that chips are rarely at 100% capacity, this implies a minimum installed base of over 430,000 units. Schneider notes, "The number I landed on is ~2.8 million H100e, which is nearly identical to Aqib's estimate — though it's entirely possible we're both wrong in ways that happen to equalize." The convergence of two independent methodologies—one counting hardware, the other counting workloads—lends significant weight to the conclusion that China's capabilities are far more robust than official narratives suggest.
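To make the arithmetic concrete, here is a minimal sketch of the headline calculation in Python, using only the figures quoted in the piece; the pipeline structure is inferred from the prose, not taken from Schneider's actual model.

```python
# Headline BOTEC using only figures quoted in the piece. The pipeline
# structure is inferred from the prose, not Schneider's own model.

CONTINUOUS_INFERENCE_H100E = 237_000  # chips running 24/7 to serve all inference
INFERENCE_UTILIZATION = 0.55          # central estimate for deployed inference chips
TRAINING_CLUSTER_H100E = 128_000      # dedicated training cluster, derived separately
WHOLE_FLEET_UTILIZATION = 0.20        # share of the total fleet active at any moment

inference_installed = CONTINUOUS_INFERENCE_H100E / INFERENCE_UTILIZATION
minimum_installed = inference_installed + TRAINING_CLUSTER_H100E
total_fleet = minimum_installed / WHOLE_FLEET_UTILIZATION

print(f"Inference installed base: {inference_installed:,.0f} H100e")  # ~431,000
print(f"Minimum installed base:   {minimum_installed:,.0f} H100e")    # ~559,000
print(f"Implied total fleet:      {total_fleet:,.0f} H100e")          # ~2.8 million
```

The minimum installed base comes out to roughly 559,000 here; the piece rounds it to ~558,000 before dividing by whole-fleet utilization, and both paths land at ~2.8 million.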

Critics might argue that relying on "vibes-based guessing" for utilization rates introduces too much uncertainty. Schneider admits this vulnerability, noting that a single percentage-point shift in his utilization assumptions can swing the final number by nearly 200,000 chips. Yet the sheer magnitude of the estimate suggests that even significant margins of error would not change the strategic picture.

At the pessimistic end of the utilization range (10%), the implied fleet rises to 5.6 million; at the optimistic end (30%), it falls to 1.9 million. The sensitivity is stark, which is why utilization assumptions deserve more dedicated research.
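A quick sweep over the whole-fleet utilization assumption shows how hard the bottom line moves; this sketch reuses the ~558,000-chip minimum base from above, and its endpoints reproduce the 5.6 million and 1.9 million figures.

```python
MINIMUM_INSTALLED_H100E = 558_000  # inference installed base plus training cluster

# Sweep whole-fleet utilization from the pessimistic 10% to the optimistic 30%.
for utilization_pct in range(10, 31, 5):
    fleet = MINIMUM_INSTALLED_H100E / (utilization_pct / 100)
    print(f"{utilization_pct:>2}% utilization -> {fleet / 1e6:.1f}M H100e")
```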

The Hidden Engine: Enterprise and Surveillance

Where Schneider's analysis becomes particularly insightful is in its breakdown of who is using these chips. The prevailing assumption is that consumer apps drive the demand, but Schneider finds that "casual users barely register, even at an enormous scale." He points out that while hundreds of millions of people may have touched an AI feature, the real computational heavy lifting comes from a much smaller, high-volume cohort. "A single enterprise customer burning through a trillion tokens a year contributes more compute than tens of millions of casual daily users combined," he writes. This distinction is vital for understanding the resilience of China's AI sector; it is not dependent on viral consumer trends but is deeply embedded in the B2B ecosystem.
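The article supplies the trillion-token enterprise figure but no per-user numbers, so the casual-user side of this comparison is an illustrative assumption of mine: even twenty million users averaging a hundred tokens a day fall short of a single trillion-token enterprise customer.

```python
# Illustrative scale comparison. Only the one-trillion-token enterprise figure
# comes from the article; the casual-user numbers are assumptions, not data.

ENTERPRISE_TOKENS_PER_YEAR = 1e12    # "a trillion tokens a year"

CASUAL_USERS = 20_000_000            # assumed: tens of millions of light users
TOKENS_PER_USER_PER_DAY = 100        # assumed: occasional AI-feature usage

casual_tokens_per_year = CASUAL_USERS * TOKENS_PER_USER_PER_DAY * 365
print(f"Casual cohort:           {casual_tokens_per_year:.2e} tokens/year")  # 7.30e+11
print(f"One enterprise customer: {ENTERPRISE_TOKENS_PER_YEAR:.2e} tokens/year")
```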

He estimates that domestic enterprise API usage accounts for about 21% of total inference demand, though he acknowledges this sector is still maturing compared to the United States. More concerning for Western observers is the "miscellaneous category," which includes government surveillance and military-adjacent applications. Schneider estimates this sector requires around 45,000 H100-equivalent units, a figure he admits he is "least confident in" but which may be the most critical. "Real-time video reidentification across 600-700 million cameras is computationally intensive," he explains, noting that while edge devices handle initial detection, the backend aggregation demands datacenter-class hardware. The real-time constraint echoes an older systems problem: just as rate-monotonic scheduling was developed decades ago to guarantee responsiveness in hard real-time systems, today's surveillance networks rely on massive, continuous compute to function without lag, creating a non-negotiable demand for high-end silicon.
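Dividing the two figures the piece does give makes the surveillance claim easier to sanity-check; the 650 million camera count is simply the midpoint of Schneider's 600-700 million range, and nothing here models the actual workload.

```python
# Per-camera datacenter budget implied by the article's own numbers.
SURVEILLANCE_H100E = 45_000     # Schneider's estimate for the category
CAMERAS = 650_000_000           # midpoint of the 600-700 million range

print(f"~{CAMERAS / SURVEILLANCE_H100E:,.0f} cameras per H100e")  # ~14,444
# Plausible only because, as the piece notes, edge devices handle initial
# detection while the datacenter does aggregation and reidentification.
```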

The Training Bottleneck and Future Trajectories

The analysis takes a sharp turn when addressing model training, which Schneider treats as an episodic rather than continuous workload. He anchors his calculation on the disclosed training run of DeepSeek V3, adjusting upward for likely underreporting. "I believe Chinese labs have strong incentives to minimize disclosed compute, since it reinforces the efficiency narrative and understates chip needs under export control scrutiny," he writes. This skepticism is well-founded, as state-backed entities often have strategic reasons to obscure their true capabilities.

Schneider calculates that China's training needs imply a dedicated cluster of roughly 128,000 high-end chips. However, he warns that not all compute is created equal. "China's training clusters require their best chips, like the H800s and smuggled Blackwells," he notes. This creates a paradox: the nation may have ample capacity for running existing models at scale while facing a severe constraint on training the next generation of frontier models. The dynamic echoes earlier generational transitions in Nvidia's lineup, when training efficiency hinged on access to the newest silicon and set off a scramble for it.
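The source excerpt below spells out the sizing method: total annual training compute divided by available chip-hours per year at 80% utilization. The annual demand figure in this sketch is back-solved by me from the 128,000-chip result, not a number Schneider states.

```python
HOURS_PER_YEAR = 24 * 365       # 8,760 chip-hours per chip per year
TRAINING_UTILIZATION = 0.80     # the article's training utilization rate

def training_cluster_size(annual_h100e_hours: float) -> float:
    """Chips needed to deliver a year of training compute at 80% utilization."""
    return annual_h100e_hours / (HOURS_PER_YEAR * TRAINING_UTILIZATION)

# Back-solved assumption: ~897 million H100e-hours per year reproduces the
# ~128,000-chip cluster. The implied demand is my inversion, not source data.
print(f"{training_cluster_size(897e6):,.0f} chips")  # ~128,000
```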

Looking ahead, Schneider models three growth scenarios for 2026. Even the most conservative projection suggests China's fleet could double, while the aggressive scenario implies a scale approaching the entire current global stock of high-end AI compute. "That's probably not happening by the end of 2026, but it illustrates why demand-side analysis matters: the numbers we're dealing with today are large, but the trajectory is what's really consequential," he argues. This forward-looking perspective shifts the conversation from a static snapshot of current capabilities to a dynamic forecast of future power.

Bottom Line

Schneider's most compelling contribution is the demonstration that China's AI capacity is not a fragile house of cards built on smuggling, but a massive, demand-driven ecosystem that is rapidly scaling. The argument's greatest vulnerability remains the uncertainty of utilization rates, yet the data suggests that even conservative estimates point to a formidable competitor. Policymakers must now watch not just the flow of chips across borders, but the velocity at which China's domestic demand is consuming the hardware it already possesses.

Deep Dives

Explore these related deep dives:

  • Nvidia

    Nvidia's export-compliant China chips, such as the H800, were engineered solely to satisfy US export controls, and they are the primary vehicle for the 'smuggling reports' and 'cloud access' loopholes the article analyzes to estimate China's actual capacity.

  • Model collapse

    Understanding this phenomenon of AI degradation from training on synthetic data is essential to grasp why the article's demand-side calculation of 'training clusters' matters for the long-term viability of China's domestic AI ecosystem.

  • Rate-monotonic scheduling

    The article's mathematical model hinges on the contested 55% inference-utilization and 80% training-utilization figures. Rate-monotonic scheduling, with its classic utilization-bound results, is useful background on why real systems cannot be driven at 100% utilization, and those utilization assumptions are the critical variable determining whether China's compute gap is a bottleneck or an illusion.

Sources

How much compute does China have?

by Jordan Schneider · ChinaTalk

Yesterday, my colleague Aqib Zakaria published an estimate of China’s supply-side compute capacity. By tallying chip shipments, smuggling reports, domestic production, and estimated Western cloud access, he arrived at ~2.7 million H100-equivalent GPUs.

Today, I’ll try to approach the same question from the demand side. Rather than counting chips, I’m counting workloads to estimate how much compute China’s AI ecosystem needs. The demand-side approach is less precise than the supply-side (which means much more vibes-based guessing from me), but it offers a cross-check on whether the supply-side figure holds up.

The number I landed on is ~2.8 million H100e, which is nearly identical to Aqib’s estimate — though it’s entirely possible we’re both wrong in ways that happen to equalize. (We importantly did not share our numbers until after they were calculated!)

Why does this matter? For one, export controls on advanced chips are only as good as our understanding of what those chips actually enable. If policymakers are debating whether to tighten restrictions on H200s or close cloud compute loopholes, they should probably have a concrete sense of what China’s AI ecosystem actually demands. Looking at demand could further allow us to infer how much compute Chinese companies are renting from Western cloud service providers.

The rest of this piece walks through how I got my number. In a nutshell…

I estimate China’s AI infrastructure requires roughly 237,000 H100e running continuously to serve all inference workloads, such as chatbots, enterprise APIs, recommendation algorithms, video generation, surveillance, and more. Dividing by 55% — my central estimate for how intensively deployed inference chips are actually being used, based on various conversations and the common wisdom of the utilization discourse — gives a minimum inference installed base of ~431,000 H100e. Add a dedicated training cluster of ~128,000 H100e, used episodically for model training and research, and you get a minimum total installed base of around 558,000 H100e. Scale up to account for chips in reserve, in transit, between runs, or not yet fully online, and you reach 2.8 million at 20% whole-fleet utilization (once again based on conversations and the common range proposed by others). The training cluster is handled differently from inference: rather than dividing by the 55% inference utilization rate, it’s derived from total annual training compute divided by available chip-hours per year at a higher utilization rate (80%) due to the increased efficiency of usage during training.

I’ll walk ...