
How China hopes to build AGI through self-improvement

Most observers assume China is playing catch-up in the race for Artificial General Intelligence, focusing only on economic deployment while the U.S. chases a software-based intelligence explosion. Jordan Schneider dismantles this comforting narrative, revealing a distinct, state-aligned strategy that rejects the Silicon Valley obsession with recursive code in favor of a physical, embodied path to machine cognition. For busy strategists, the implication is stark: the next great leap in AI may not come from a server farm in California, but from a robot learning to navigate a factory floor in Shenzhen.

The Misread of Chinese Intent

Schneider begins by challenging the prevailing "thank god" instinct among Western analysts who believed Beijing viewed AI merely as an economic engine. He notes that while this interpretation held water in 2017, the landscape had shifted dramatically by 2026. "While in 2017 the term 'general-purpose artificial intelligence' used by Beijing could safely be interpreted as general-purpose AI rather than AGI, the same cannot be asserted now that the term has resurfaced in 2026," Schneider writes. The author points to the explicit language in China's 15th Five-Year Plan, which distinguishes between general-purpose large models and the pursuit of AGI as separate tracks. This is a crucial distinction often missed in Western media, which tends to lump all Chinese AI progress into a single bucket of "catch-up" innovation.

The author argues that this reframing changes the entire geopolitical calculus. If China is indeed pursuing AGI, it is not doing so by copying the American model of Recursive Self-Improvement (RSI), where software agents write better software in a digital loop. Instead, "Chinese thinking converges on something more embodied: human-level intelligence that requires physical-world interactions." This approach is not a top-down Manhattan Project but a bottom-up movement, driven by compute constraints, that is gradually gaining influence in Beijing's top policy circles. Critics might note that state rhetoric often outpaces actual capability, but the convergence of voices from startup CEOs to the Chinese Academy of Sciences suggests a genuine strategic pivot rather than mere propaganda.

The American approach to AGI is a race to build a software machine god; the Chinese approach is a race to build a brain that can walk, see, and touch.

The Embodied Alternative

The core of Schneider's analysis lies in detailing the three-step architecture of this "AGI with Chinese Characteristics." The first step is multimodality and world models, moving beyond the "predict the next word" paradigm to "predict the next state of the world." Schneider highlights the Beijing Academy of Artificial Intelligence, which predicts that world models will be the primary pathway to AGI in 2026. This shift is significant because it prioritizes spatial-temporal continuity and causality over linguistic probability.

The second, and perhaps most defining, step is embodied AI. Here, Schneider draws a sharp contrast with the U.S. focus on agentic coding. He cites Zhang Peng, CEO of Z.ai, who describes the process: "First, you build a brain... Then you equip it with hands and feet so it can call upon the world model to solve problems... The results of that interaction are fed back as a reinforcement signal." This echoes historical debates in cognitive science regarding embodied cognition, where the physical body is seen not as a vessel for the mind, but as a necessary component of intelligence itself. As Andrew Yao, a Turing Award winner, states, "the development of embodied AI is crucial for AI to acquire the capacity to comprehend the physical world."

This focus on the physical world addresses a critical bottleneck: data. While the U.S. relies on vast static datasets scraped from the internet, Schneider argues that Chinese researchers see the physical world as an "irreducibly more complex" source of training signals. "Unlike the RSI discourse at the U.S. frontier lab, which increasingly coalesced around agentic coding as the primary lever, the Chinese ecosystem has no single consensus path," Schneider observes. Instead, they are building the "brain" for robots, with companies like Alibaba and Ant Group open-sourcing universal brains for physical AI. This strategy leverages the sheer scale of China's manufacturing sector to generate the kind of real-world interaction data that no static corpus can provide.

Closing the Loop

The final piece of the puzzle is the closed loop. Schneider explains that for Chinese scientists, true self-improvement cannot happen in a vacuum. "Alibaba CEO Wu Yongming argues that AI's self-improvement loop cannot close on static data alone, which, however vast, is ultimately bounded by what humans have already expressed," Schneider writes. The vision is a system that, through physical interaction, builds its own training infrastructure and optimizes its own model architectures. This is a direct challenge to the Silicon Valley timeline, which assumes a rapid software-driven intelligence explosion. In the Chinese model, the loop is closed through the body, not the code.

Schneider notes that while DeepSeek focuses on multimodality and Z.ai on coding agents, the broader consensus among scientists like Zhang Bo and Zhou Bowen places embodied interaction as the final stage of AGI development. This is a departure from the "superbrain" narrative. "Rather than a superbrain built from code as perceived by many in Silicon Valley, Chinese AI actors increasingly narrate a different endpoint," Schneider concludes. The implication is that the race is not over, and the finish line may look nothing like what Washington or San Francisco expects.

Bottom Line

Schneider's strongest contribution is exposing the dangerous asymmetry in how the U.S. and China define the path to superintelligence: the U.S. is betting on a digital explosion, while China is betting on physical integration. The argument's vulnerability lies in the immense difficulty of scaling embodied AI compared to software, but the sheer volume of state and private investment suggests this is a bet Beijing is willing to make. Readers should watch not just for breakthroughs in code, but for the rapid deployment of autonomous robots in Chinese industrial zones, as that is where the next phase of the AI race will be decided.

Deep Dives

Explore these related deep dives:

  • Embodied cognition

    The article argues that China's path to AGI diverges from Silicon Valley's software-centric view by prioritizing physical-world interaction, making this technical distinction central to understanding the geopolitical divergence.

Sources

How China hopes to build AGI through self-improvement

by Jordan Schneider · ChinaTalk