MarkTechPost@AI · May 30, 10:40
DeepSeek Releases R1-0528: An Open-Source Reasoning AI Model Delivering Enhanced Math and Code Performance with Single-GPU Efficiency

DeepSeek has released an updated version of its R1 reasoning model, DeepSeek-R1-0528. The model improves on mathematics, programming, and general logical reasoning, making it a strong open-source alternative to leading models such as OpenAI's o3 and Google's Gemini 2.5 Pro. DeepSeek-R1-0528's score on the AIME 2025 math benchmark rose from 70% to 87.5%, and it performs strongly on code generation, trailing only OpenAI's o4-mini and o3 models. DeepSeek maintains its open-source strategy, releasing R1-0528 under the MIT license along with a distilled version to broaden access to AI technology.

🚀 DeepSeek-R1-0528 makes notable gains in mathematical reasoning: its score on the AIME 2025 math benchmark rose from 70% to 87.5%. The gain comes from a deeper reasoning process averaging 23,000 tokens per question, up from 12,000 in the previous version.

💻 R1-0528 also performs impressively on code generation. On the LiveCodeBench benchmark, the model ranks just below OpenAI's o4-mini and o3 models, ahead of xAI's Grok 3 mini and Alibaba's Qwen 3.

💡 DeepSeek remains committed to open source: R1-0528 is released under the MIT license, so developers are free to modify and deploy it. The model weights are available on Hugging Face, with detailed documentation for local deployment and API integration.

✨ To make its AI more accessible, DeepSeek has also released DeepSeek-R1-0528-Qwen3-8B, a distilled version of R1-0528 fine-tuned from Alibaba's Qwen3-8B. The model posts strong results among open-source models on the AIME 2024 benchmark and is designed to run efficiently on a single GPU.

⚠️ Despite DeepSeek's progress in AI, R1-0528 shows stricter content moderation than its predecessors. Independent testing indicates the model avoids or limits responses on politically sensitive topics, such as the Tiananmen Square protests and the status of Taiwan.

DeepSeek, the Chinese AI unicorn, has released an updated version of its R1 reasoning model, named DeepSeek-R1-0528. This release enhances the model's capabilities in mathematics, programming, and general logical reasoning, positioning it as a formidable open-source alternative to leading models like OpenAI's o3 and Google's Gemini 2.5 Pro.

Technical Enhancements

The R1-0528 update introduces significant improvements in reasoning depth and inference accuracy. Notably, the model’s performance on the AIME 2025 math benchmark has increased from 70% to 87.5%, reflecting a more profound reasoning process that averages 23,000 tokens per question, up from 12,000 in the previous version. This enhancement is attributed to increased computational resources and algorithmic optimizations applied during post-training.
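A quick sanity check on the reported figures — a minimal sketch using only the numbers quoted above — shows the accuracy gain in absolute points and how much longer the reasoning traces became:

```python
# Back-of-envelope check of the reported gains: AIME 2025 accuracy rose from
# 70% to 87.5%, while average reasoning length grew from 12,000 to 23,000
# tokens per question (figures as reported above).
old_acc, new_acc = 0.70, 0.875
old_tokens, new_tokens = 12_000, 23_000

accuracy_gain_pts = (new_acc - old_acc) * 100   # absolute points gained
token_growth = new_tokens / old_tokens          # relative token budget

print(f"accuracy gain: {accuracy_gain_pts:.1f} points")    # 17.5 points
print(f"token budget: {token_growth:.2f}x longer traces")  # 1.92x
```

In other words, the model spends roughly twice the token budget per problem to earn a 17.5-point jump, consistent with the article's framing of deeper reasoning as the driver.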

In addition to mathematical reasoning, the model has shown improved performance in code generation. According to LiveCodeBench benchmarks, R1-0528 ranks just below OpenAI's o4-mini and o3 models, outperforming xAI's Grok 3 mini and Alibaba's Qwen 3.

Open-Source Model Weights

DeepSeek continues its commitment to open-source and open-weights AI by releasing R1-0528 under the MIT license, allowing developers to modify and deploy the model freely. The model's weights are available on Hugging Face, and detailed documentation is provided for local deployment and API integration. This approach contrasts with the proprietary nature of many leading AI models, promoting transparency and accessibility in AI development.
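Local deployment from the open weights could look roughly like the following sketch, assuming the Hugging Face repo id `deepseek-ai/DeepSeek-R1-0528` and the standard `transformers` loading path. The full model is far too large for a single consumer machine, so treat this as an outline of the workflow rather than a turnkey script:

```python
# Sketch: loading the open weights with Hugging Face transformers.
# MODEL_ID is assumed from the Hugging Face release named in the article.
MODEL_ID = "deepseek-ai/DeepSeek-R1-0528"

def generate(question: str, max_new_tokens: int = 1024) -> str:
    """Run one chat turn through the model; imports deferred so this
    module can be read without transformers/torch installed."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # shard weights across available GPUs
    )
    # Render a single-turn prompt with the model's own chat template.
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": question}],
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Prove that the square root of 2 is irrational."))
```

Because the weights are MIT-licensed, the same loading path works for fine-tuned or quantized derivatives without additional permission.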

Distilled Model for Lightweight Deployment

Recognizing the need for more accessible AI solutions, DeepSeek has also released a distilled version of R1-0528, named DeepSeek-R1-0528-Qwen3-8B. This model, fine-tuned from Alibaba’s Qwen3-8B using text generated by R1-0528, achieves state-of-the-art performance among open-source models on the AIME 2024 benchmark. It is designed to run efficiently on a single GPU, making advanced AI capabilities more accessible to developers with limited computational resources.
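Why an 8B-parameter distillation fits on one GPU comes down to simple arithmetic: weight memory is parameter count times bytes per parameter. A rough estimate (ignoring KV cache and activations, which add overhead at inference time):

```python
# Rough weight-memory estimate for an 8B-parameter model, explaining the
# single-GPU claim. Ignores KV cache and activation memory.
PARAMS = 8e9  # DeepSeek-R1-0528-Qwen3-8B parameter count (8B, per the article)

def weight_gib(params: float, bytes_per_param: int) -> float:
    """Memory footprint of the weights alone, in GiB."""
    return params * bytes_per_param / 2**30

bf16 = weight_gib(PARAMS, 2)  # ~14.9 GiB: fits a 24 GB card
int8 = weight_gib(PARAMS, 1)  # ~7.5 GiB: fits even smaller cards, quantized
print(f"bf16 weights: {bf16:.1f} GiB, int8 weights: {int8:.1f} GiB")
```

At bf16 precision the weights alone come in under 16 GiB, which is why a single 24 GB GPU can host the distilled model comfortably, with headroom left for the KV cache.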

Censorship Considerations

While DeepSeek’s advancements in AI are noteworthy, the R1-0528 model has been observed to exhibit stricter content moderation compared to its predecessors. Independent testing revealed that the model avoids or provides limited responses to politically sensitive topics, such as the Tiananmen Square protests and the status of Taiwan, aligning with Chinese regulations that mandate AI models to adhere to content restrictions.

Global Implications

The release of R1-0528 underscores China’s growing influence in the AI sector, challenging the dominance of U.S.-based companies. DeepSeek’s ability to develop competitive AI models at a fraction of the cost of their Western counterparts has prompted responses from companies like OpenAI, which have expressed concerns about the potential for these models to be manipulated by the Chinese government. This development highlights the shifting dynamics in global AI development and the increasing importance of open-source models in fostering innovation and competition.

Conclusion

DeepSeek’s R1-0528 model represents a significant advancement in open-source AI, offering enhanced reasoning capabilities and accessibility for developers. By providing both a full-scale model and a distilled version suitable for single-GPU deployment, DeepSeek is making strides in democratizing AI technology. However, the model’s adherence to content moderation policies reflects the complex interplay between technological advancement and regulatory compliance. As the AI landscape continues to evolve, DeepSeek’s developments will likely play a pivotal role in shaping the future of open-source AI.


Check out the Open-Source Weights and try it now. All credit for this research goes to the researchers of this project.

