Recently, a Morgan Stanley research report revealed key details of DeepSeek's upcoming R2 model ahead of release, drawing widespread attention across the industry. This article walks through the core highlights of DeepSeek R2 for your reference.
There is finally news about the DeepSeek R2 model!
Recently, top investment bank Morgan Stanley released a research report previewing DeepSeek's next-generation model, R2.
The report is short, but packed with information. Without further ado, let's dive in with Mr. Crow.
01 Two core highlights: parameters doubled, costs down 88%
R2 brings two core changes: doubled parameters and reduced cost.
On the former: R2's size reportedly jumps to 1.2 trillion parameters, roughly double that of R1, and its active parameters rise from 37 billion to 78 billion.
The idea is similar to Google's Gemini and Anthropic's Claude: improve reasoning quality by increasing the number of parameters engaged in each call. For example, Gemini 2.5 Pro uses 30% more tokens per task than R1-0528.
This raises compute costs, but for end users, a smarter and more "thoughtful" model is worth the trade.
The other big change is that the model is cheaper to run.
According to Morgan Stanley, R2's input cost is only $0.07 per million tokens, less than half of R1's $0.15-0.16; the output cost drops even more steeply, from $2.19 per million tokens on the original R1 to $0.27.
This is far below state-of-the-art models like OpenAI's GPT-4o, whose API is priced at $2.50 per million input tokens and $10.00 per million output tokens.
In other words, R2 is 97% cheaper than state-of-the-art models such as GPT-4o.
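The arithmetic behind these percentages is easy to verify. A quick sanity check in Python, using the figures quoted above (the $0.155 input figure for R1 is my assumed midpoint of the quoted $0.15-0.16 range):

```python
# Per-million-token prices in USD, as quoted in the report.
r2 = {"input": 0.07, "output": 0.27}
r1 = {"input": 0.155, "output": 2.19}   # input: assumed midpoint of $0.15-0.16
gpt4o = {"input": 2.50, "output": 10.00}

for kind in ("input", "output"):
    vs_r1 = 1 - r2[kind] / r1[kind]          # fractional saving vs. R1
    vs_gpt4o = 1 - r2[kind] / gpt4o[kind]    # fractional saving vs. GPT-4o
    print(f"{kind}: {vs_r1:.0%} cheaper than R1, {vs_gpt4o:.0%} cheaper than GPT-4o")

# → input: 55% cheaper than R1, 97% cheaper than GPT-4o
# → output: 88% cheaper than R1, 97% cheaper than GPT-4o
```

The output confirms both headline numbers: the 88% drop in output cost versus R1, and the roughly 97% saving versus GPT-4o.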
For startups, developers, and enterprises, this is a huge draw: strong reasoning performance at rock-bottom prices, greatly expanding the room for commercialization.
In addition to parameters and price, R2 also has three major upgrade highlights:
1) Multilingual reasoning and code generation are greatly improved. DeepSeek is said to be prioritizing R2's coding capability and its command of languages other than English, to expand the model's potential impact and applicability to global audiences;
2) A more efficient Mixture-of-Experts (MoE) architecture is introduced, with higher inference efficiency and smarter selection of active parameters. This architecture divides the model into independent expert subnetworks that are selectively activated based on the input, which can significantly reduce the computational cost of pre-training and deliver faster inference.
3) Multimodal support is stronger, with visual capability a level above before.
Put simply, R2 not only "thinks deeper", it also "sees more clearly".
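To illustrate how an MoE layer activates only a subset of its parameters per call, here is a minimal toy sketch in NumPy. All sizes, weights, and the top-2 routing rule are illustrative assumptions, not DeepSeek's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: 8 experts, each token routed to its top-2.
# Only the selected experts run, so most parameters stay inactive per call,
# which is how a 1.2T-parameter model can activate only ~78B per token.
NUM_EXPERTS, TOP_K, D = 8, 2, 16

gate_w = rng.normal(size=(D, NUM_EXPERTS))                        # router weights
experts = [rng.normal(size=(D, D)) for _ in range(NUM_EXPERTS)]   # toy expert FFNs

def moe_forward(x):
    """Route a single token vector x through its top-k experts."""
    logits = x @ gate_w                          # router score for each expert
    top = np.argsort(logits)[-TOP_K:]            # indices of the k highest scores
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen
    # Weighted sum of the chosen experts' outputs; the other 6 experts do no work.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top)), top

y, chosen = moe_forward(rng.normal(size=D))
print(f"activated experts: {sorted(chosen.tolist())}, "
      f"active fraction: {TOP_K / NUM_EXPERTS:.0%}")
```

Per token, only 2 of 8 experts compute anything, so the active-parameter fraction is 25% of the total, mirroring the gap between R2's rumored 1.2T total and 78B active parameters.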
While improving performance, DeepSeek is also reducing its dependence on the H100 and making breakthroughs with domestic compute.
According to Morgan Stanley, DeepSeek this time did not rely on NVIDIA's H100 chips, instead using clusters of Huawei's Ascend 910B. Although Huawei's ecosystem has not yet caught up with NVIDIA's, this is an important breakthrough for domestic chips in real-world large-model training.
DeepSeek is working to build a local hardware supply chain to reduce its reliance on US-made chips, and a strong domestic supply chain now stands behind the R2 model.
02 A version optimization just landed, so does R2 still have to wait?
Currently, DeepSeek’s model family consists of three main products:
- V series (V1~V4): General large model, comprehensive coverage of reasoning, efficiency, and agent capabilities;
- R series (R1 → R2): focus on deep reasoning, mathematics, tool chain integration and other capabilities;
- Prover-V2: a model specifically optimized for mathematical proof generation.
Among these, DeepSeek-Prover-V2 was released in April this year. It is not a general-purpose model but a highly specialized mathematical proof model built on DeepSeek V3, with an improved MoE architecture and a compressed KV cache to reduce memory consumption.
Setting aside the newly released proof model, DeepSeek's large-model upgrade roadmap is methodical:
- V4: the comprehensive iteration of the general-model main line, emphasizing improvements in inference performance, efficiency, and agentic capabilities.
- R2: the enhanced upgrade of the dedicated Reasoner line, benchmarking against OpenAI's o3 successors and Gemini 2.5 Pro (the 0605 and official releases, and the future Gemini 3), to further improve mathematics, tool-chain, and multi-step reasoning performance.
DeepSeek's model iteration rhythm has historically been fairly fixed, following a "two small, one large" pattern: a small version update roughly every two months, followed by a major version update.
Take the general model line as an example: V1 was released in November 2023, V2 in May 2024, and V3 in December 2024. At this pace, DeepSeek is due for a relatively major version change around June or July 2025.
Shortly before the rumored R2 release, on May 29, DeepSeek also shipped an enhanced version without a version-number change: R1-0528.
Although the architecture is untouched, this version adds further reinforcement learning (RL) training, which significantly deepens its reasoning. Official evaluations show it consumed 99 million tokens to complete the benchmark tasks, 40% more than the original R1, with deeper thinking, more complex reasoning chains, and better results.
The report card is also beautiful:
- AIME 2024 (Mathematics Competition): +21 points
- LiveCodeBench (code generation): +15 points
- GPQA Diamond (Scientific Reasoning): +10 points
- Humanity’s Last Exam (Knowledge Reasoning): +6 points
User feedback is also positive, especially in terms of logic, programming, and interaction.
The R1-0528 upgrade is so aggressive that many people have started to wonder: is this the legendary R2? There has been no official response so far, and the claim remains unconfirmed.
Although Morgan Stanley says R2 is coming, by DeepSeek's usual rhythm the real R2 will likely take a bit longer. This wave of upgrades looks more like a major version optimization quietly slipped out ahead of schedule than a routine update.