
DeepSeek's new AI model paves way for its next-gen architecture
What's the story
Chinese artificial intelligence (AI) developer DeepSeek has launched its latest model, DeepSeek-V3.2-Exp. The company describes this "experimental release" as more efficient to train and better at processing long sequences of text than its predecessors. The Hangzhou-based firm announced the new model on the Hugging Face developer forum, calling it an "intermediate step toward our next-generation architecture."
Future plans
Next-gen architecture to be major milestone
DeepSeek's next-generation architecture is expected to be a major milestone for the company, following the success of its previous models, V3 and R1. The firm regards it as its most significant launch since those earlier models made waves in Silicon Valley and shocked tech investors outside China.
Innovation
New model comes with 'DeepSeek Sparse Attention'
The V3.2-Exp model introduces a mechanism called DeepSeek Sparse Attention, which the company says can reduce computing costs and improve certain aspects of model performance. Alongside the release, DeepSeek announced in a post on X that it is slashing API prices by "50%+".
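The article does not describe how DeepSeek Sparse Attention works internally. As a rough intuition for why sparse attention can lower compute costs on long sequences, the toy sketch below has each query attend only to its top-k most similar keys instead of all of them; this is an assumed, generic illustration, not DeepSeek's actual method.

```python
# Toy sketch of sparse attention (top-k per query), for intuition only.
# Assumption: this is a generic illustration, NOT DeepSeek Sparse Attention.
import numpy as np

def topk_sparse_attention(q, k, v, top_k=4):
    """q, k, v: arrays of shape (seq_len, dim). Each query keeps only its
    top_k strongest keys, so far fewer key/value pairs contribute per token."""
    scores = q @ k.T / np.sqrt(q.shape[-1])               # (seq_len, seq_len)
    # Threshold each row at its top_k-th largest score; mask the rest.
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k][:, None]
    scores = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving scores only.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                     # (seq_len, dim)

# Usage: 8 tokens, 16-dim embeddings; only 4 of 8 keys contribute per query.
rng = np.random.default_rng(0)
q = k = v = rng.standard_normal((8, 16))
print(topk_sparse_attention(q, k, v, top_k=4).shape)       # (8, 16)
```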
Market impact
Next-gen model could threaten rivals
While DeepSeek's next-generation architecture may not shake up the market the way its predecessors did in January, it could still pose a major threat to domestic competitors such as Alibaba's Qwen and US-based companies such as OpenAI, provided the new model delivers high capability at a fraction of what these rivals charge and spend on model training.