DeepSeek v4

Jordan Schneider delivers a sobering autopsy of China's most celebrated AI experiment, revealing that DeepSeek's "national champion" status was built on a foundation of deferred commercial reality and a desperate, high-stakes gamble on domestic hardware. The piece's most striking revelation isn't the model's performance, but the admission that the lab's ideological purity—its refusal to monetize or partner early—cost it the very talent and compute needed to stay competitive.

The Cost of Idealism

Schneider argues that DeepSeek's trajectory was defined by a fatal misalignment between its mission and market realities. While Western labs like OpenAI pivoted to revenue-generating products to fund their research, DeepSeek CEO Liang Wenfeng remained fixated on a pure research ethos. Schneider quotes Liang: "For many years, Chinese companies are used to others doing technological innovation, while we focused on application monetization — but this isn't inevitable." This framing positions DeepSeek not just as a tech company, but as a vehicle for national pride, attempting to break the cycle of "freeriding" on Western hardware advances.

However, the author suggests this idealism became a liability. By refusing to build a scaled consumer product or partner with a Chinese hyperscaler, Liang "bled talent and lost the lead he had over his domestic competitors." The evidence is stark: core contributors fled to rivals like Tencent and ByteDance, leaving the lab struggling to staff even a new marketing unit. Schneider notes that while Liang focused on "hardcore research," competitors were capturing the market, with ByteDance's Doubao becoming China's most-downloaded chatbot.

"The golden age of nonprofit AI development is over."

This quote, attributed to a Qwen employee, serves as the piece's emotional anchor. It underscores a shift in the Chinese tech landscape where capital constraints and geopolitical pressure have made pure research unsustainable. Critics might argue that without such "pure" labs, China would lack the foundational innovations that later fuel commercial applications. Yet, Schneider's reporting suggests that DeepSeek's delay in commercialization left it vulnerable to a "post-DeepSeek era" where the window for independent, non-profit AI leadership has closed.

The Hardware Trap

The article's most technical insight concerns the paradox of DeepSeek V4: a model designed for domestic chips that, by Schneider's account, probably still relied on Nvidia GPUs for its creation. Schneider relays 36Kr reporting that ties V4's delay to the migration of the lab's training framework from Nvidia to Huawei's Ascend chips and to internal decision-making changes, with the lab hitting a "relatively serious case of training failure" in mid-2025. The internal friction was palpable; one insider reported that "opinions on the direction of training were not entirely unified."

Despite these hurdles, V4 represents a significant architectural pivot. Schneider highlights that the model uses a domain-specific language called TileLang rather than Nvidia's CUDA, allowing it to run on various domestic hardware like Cambricon and Biren. "V4 is, from top to bottom, a model designed for domestic chips," Schneider observes, calling it a "reality forced into being by this computing power struggle." This is a crucial distinction: the model isn't just a technical achievement; it is a geopolitical necessity born of supply chain constraints.

"The computing power game is, in many ways, a top-level geopolitical game."

The author contextualizes this struggle by referencing the sheer scale of the deficit. While Huawei plans to ship 750,000 Ascend 950 chips this year, Schneider points out that this volume equals "just one week of quality-adjusted American chip production." This comparison drives home the severity of the bottleneck. The piece also draws a parallel to the history of the MIT License and the open-source movement, noting that while DeepSeek released V4 under the permissive MIT license, the underlying hardware constraints may limit its global impact compared to US models running on next-generation Blackwell chips.

The Price of Access

Beyond the technical and political, Schneider explores the human element of AI access. The "DeepSeek moment" initially offered Chinese users affordable access to frontier models, a stark contrast to the restricted access of American labs. However, as the industry matures, the cost of tokens is rising, creating a new form of exclusion. Schneider cites a recent cultural shift, referencing a 2017 article "The People Long for Zhou Hongyi" and its 2026 sequel, "The People Long for DeepSeek."

The new article critiques the industry's push for "token anxiety," where companies aggressively encourage usage to drive revenue. Schneider writes, "When token usage costs can't be brought down... aggressively pushing token consumption — even tying it to performance reviews — amounts to manufacturing token anxiety." This critique resonates with the broader theme of the piece: the tension between technological ambition and the economic reality of sustaining it.

"Calling it manufacturing AI anxiety wouldn't be an overstatement either."

This observation challenges the narrative of AI as a democratizing force. If the cost of inference remains high, the benefits of these models will be concentrated among those who can afford them, mirroring the monopolistic trends of the past. Schneider suggests that while DeepSeek's symbolism persists, its ability to provide affordable, high-quality access is diminishing as the "golden age" of open, low-cost AI gives way to a more expensive, commercialized future.

Bottom Line

Schneider's most compelling argument is that DeepSeek's failure to commercialize early was not just a business mistake, but a structural vulnerability that left China's AI ambitions exposed to hardware sanctions. The piece's greatest strength lies in its refusal to romanticize the "national champion" narrative, instead exposing the internal fractures and talent drain that accompanied it. The biggest vulnerability in the analysis, however, is the assumption that domestic hardware will inevitably catch up; the sheer disparity in production capacity between the US and China remains a formidable, perhaps insurmountable, hurdle. Readers should watch not just for the next model release, but for whether DeepSeek can pivot from a research lab to a sustainable business before its talent pool evaporates completely.

Deep Dives

Explore these related deep dives:

  • List of Huawei products

    The article details DeepSeek's difficult migration from Nvidia to Huawei's Ascend chips, a technical pivot linked to a serious training failure and the delayed V4 release.

  • MIT License

    While most Chinese AI labs are retreating from open source, DeepSeek's choice of this specific permissive license highlights its unique 'idealism' and strategic divergence from the industry trend.

  • Goodhart's law

    The article's observation that V4 'feels further' behind US models than its benchmark scores suggest illustrates this principle, where optimizing for specific metrics fails to capture actual frontier capability.

Sources

DeepSeek v4

by Jordan Schneider · ChinaTalk

Finally, DeepSeek V4 is here. The Pro and Flash models are available through DeepSeek’s website, mobile apps, and API access as of April 23, and the lab has also released its technical report.

Bucking a recent trend of Chinese AI labs moving away from open source, V4 was released under the highly permissive MIT license. It performs admirably on various benchmarks and leads the pack of Chinese open models, but did not close the gap with closed models from the US, with the authors themselves admitting in the paper that V4 is “3 to 6 months behind” state-of-the-art frontier models (though we think it feels further). And as we will discuss later, while its architecture shows progress towards indigenizing the Chinese stack, the model probably still relied on Nvidia GPUs.

Is V4 a letdown? Today on ChinaTalk, we bring you our takes alongside those from Chinese observers on:

  • Troubles at the lab prior to V4’s arrival;

  • Why DeepSeek’s idealism may not hold;

  • What V4 did — and did not — achieve with domestic hardware;

  • And why DeepSeek’s symbolism persists inside China, even after it lost the frontier.

Translations were drafted with the assistance of Claude Opus 4.7, and then edited for accuracy and fluency. Bold markings added by the editor.

How V4 Got Here.

Chinese tech journalists have doggedly followed the DeepSeek story. Zhou Xinyu 周鑫雨 of 36Kr, a prominent Beijing-based tech news outlet, has some behind-the-scenes scoops.

The reasons behind [V4’s] belated arrival are related to migrating its training framework from NVIDIA to Huawei Ascend, as well as to internal decision-making changes at DeepSeek. We learned that in mid-2025, DeepSeek ran into a relatively serious case of training failure.

“At the time, DeepSeek was facing the problem of re-adapting to chips,” one insider mentioned. “Internally, opinions on the direction of training were not entirely unified. Liang Wenfeng put forward some of his own demands, but it was difficult to find compromises at the execution level.”

However, contrary to outside speculation that the new model might support multimodal generation and understanding, V4 remains a language model. The decision to postpone multimodal generation training stems mainly from constraints on computing power and cash.

Multiple insiders told AI Emergence [a 36Kr sub-brand focusing on AI] that DeepSeek’s external financing window opened in mid-April 2026. Internally, the trigger was that DeepSeek needed more funding to train models with larger parameter scales, while also ...