
DeepSeek Launches New AI Model as Funding Rumors Spread

DeepSeek launches its V4 open-source AI model with improved efficiency and coding capabilities amid funding talks with Tencent and Alibaba, intensifying global competition despite trailing top rivals.
Author: Gu Zhaowei
Publisher: Caixin Global
Photo: IC Photo

Chinese artificial intelligence (AI) startup DeepSeek launched its highly anticipated V4 open-source foundational model on Friday, upping the ante in a race with global peers to develop a technology that underpins data centers and AI applications.

The DeepSeek-V4 model comes in Pro and Flash versions. The Pro version features 1.6 trillion total parameters with 49 billion active parameters, making it the open-weights model with the largest total parameter count to date. The Flash version contains 284 billion total parameters with 13 billion active parameters, offering more cost-effective API services. Both versions support a context window of up to one million tokens.
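As a back-of-the-envelope illustration of what those figures imply: in a mixture-of-experts design, only the active parameters are used for any given token, which is what keeps inference cost well below what the total parameter count would suggest. The sketch below simply computes that active fraction from the article's numbers (the mixture-of-experts framing is an assumption; the article does not detail the architecture):

```python
# Rough arithmetic on the parameter counts reported in the article.
# In a mixture-of-experts model, per-token compute scales with the
# *active* parameters, not the total. (The MoE framing is an
# assumption, not something the article confirms.)
def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters used per token, in percent."""
    return 100 * active_b / total_b

pro = active_fraction(1600, 49)    # Pro: 1.6T total, 49B active
flash = active_fraction(284, 13)   # Flash: 284B total, 13B active
print(f"Pro activates {pro:.1f}% of its weights per token")
print(f"Flash activates {flash:.1f}% of its weights per token")
```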

Some developers expressed mild disappointment with V4, noting that it did not clearly pull ahead of similar models from rivals such as Moonshot AI and Zhipu AI. Data from VALS AI, a third-party firm that evaluates AI model performance, showed V4 achieved an average accuracy of 63.87% across financial, legal and coding tests, lagging behind Claude Opus 4.6, Gemini 3.1 Pro Preview, GPT-5.4 and Kimi K2.6. V4 also trails Moonshot AI's Kimi K2.6 in output speed.

DeepSeek acknowledged a gap with the world’s top models, saying the delivery quality of V4’s agentic coding approaches that of Claude Opus 4.6’s non-thinking mode, though a gap remains with the latter’s thinking mode. V4 slightly trails Gemini-Pro-3.1 in world knowledge but matches top global closed-source models in math, STEM and competitive coding.

According to DeepSeek, V4 focuses on optimizing computing-resource consumption and agent capabilities. The model features the DeepSeek Sparse Attention mechanism, which compresses the token dimension to cut the computational cost of processing long-context sequences. Compared with DeepSeek-V3.2, the Flash version reduces per-token compute at million-token context lengths by 90%, while the Pro version reduces it by nearly 70%. The company also said V4 has been optimized for mainstream agent products like Claude Code, OpenClaw, OpenCode and CodeBuddy, with improved performance in coding and document-generation tasks.
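To see the rough arithmetic behind savings of that kind (illustrative only; the function and the compression ratios below are hypothetical, not a description of DeepSeek's actual mechanism): if attention is restricted to a compressed subset of the context rather than every prior token, per-token attention cost falls roughly in proportion to the number of positions kept.

```python
# Illustrative sketch: per-token attention cost scales with the number
# of key/value positions attended. Attending to a compressed subset of
# the context saves 1 - (kept / total) of that cost. The compression
# levels below are hypothetical values chosen to reproduce the 90% and
# ~70% reductions the article reports, not disclosed figures.
def attention_saving(context_len: int, attended: int) -> float:
    """Fractional reduction in per-token attention cost, in percent."""
    return 100 * (1 - attended / context_len)

# At a one-million-token context:
print(f"{attention_saving(1_000_000, 100_000):.0f}% saving")  # keep 10%
print(f"{attention_saving(1_000_000, 300_000):.0f}% saving")  # keep 30%
```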

V4’s launch coincides with recent rumors that Tencent Holdings Ltd. and Alibaba Group Holding Ltd. are in talks to invest in DeepSeek at a valuation that could exceed $20 billion, though specific financing amounts and valuations might adjust as negotiations continue.

DeepSeek has yet to generate significant revenue, and its founder Liang Wenfeng has long resisted bringing in external capital. Industry analysts suggest the potential shift toward fundraising could be related to the delayed release of the V4 model and the departure of several core research talents.

Contact editor Ding Yi (yiding@caixin.com)


caixinglobal.com is the English-language online news portal of Chinese financial and business news media group Caixin. Global Neighbours is authorized to reprint this article.