MarkTechPost@AI 2024年11月21日
Chinese AGI Startup ‘StepFun’ Developed ‘Step-2’: A New Trillion-Parameter MoE Architecture Model Ranking 5th on Livebench

StepFun, a Shanghai-based AI startup, has developed Step-2, a trillion-parameter Mixture of Experts (MoE) language model that ranks fifth on Livebench, a global AI benchmarking platform, making it China's best-performing large language model. Step-2's MoE architecture allocates computational resources more efficiently and supports a context length of up to 16,000 tokens, with strong results in instruction following, reasoning, and data analysis. While coding and mathematics still leave room for improvement, Step-2 demonstrates the capability of Chinese AI companies to build large language models and, through its API and application integrations, brings advanced technology to a broader user base, laying a foundation for AGI research and industry applications.

🚀 **Step-2 is the first trillion-parameter MoE model developed by a Chinese company; it ranks fifth on Livebench, making it China's best-performing large language model.** Built by Shanghai-based AI startup StepFun, the model marks notable progress for China in the field of large language models.

💡 **Step-2 uses a Mixture of Experts (MoE) architecture to allocate computational resources more efficiently.** A routing mechanism activates only the model parameters needed for a given task, so the parameter count can scale without a proportional increase in compute, improving model efficiency.

📊 **Step-2 excels at instruction following, reasoning, and data analysis, but coding and mathematics still have room for improvement.** It scores 86.57 on instruction following, with reasoning and data analysis at 58.67 and 54.86, showing strong language understanding and information processing; coding and mathematics scores of 46.87 and 48.88 indicate where further optimization is needed.

🌐 **Step-2 reaches a broader user base through an API platform and application integration.** StepFun has integrated Step-2 into the "Yuewen" app, letting everyday users experience a state-of-the-art language model, while an open API platform gives developers and researchers access, promoting the adoption of AI technology.

🇨🇳 **Step-2's success shows that Chinese AI companies have the capability to build large language models, contributing to global AI progress.** Its ranking reflects China's growing technical strength in AI and suggests that more Chinese AI companies will play significant roles on the global stage, diversifying the development of AI technology.

In the evolving landscape of artificial intelligence, building language models capable of replicating human understanding and reasoning remains a significant challenge. One major hurdle in the development of large language models (LLMs) is balancing computational efficiency with expansive capabilities. As models grow larger to capture more complex relationships and generate better predictions, the computational costs increase significantly. Meanwhile, general-purpose LLMs must handle a range of tasks—such as instruction following, coding, and reasoning—often struggling to maintain consistent performance across all dimensions. This inconsistency poses a notable bottleneck, particularly for those aiming to advance toward artificial general intelligence (AGI).

Introducing Step-2: A Trillion-Parameter MoE Model

StepFun, a Shanghai-based AI startup focused on advancing AGI, has recently developed Step-2, a trillion-parameter Mixture of Experts (MoE) language model. This model has gained attention by ranking 5th on Livebench, a prominent global benchmarking platform that evaluates AI models based on their overall performance across diverse tasks. Step-2 is the first trillion-parameter MoE model developed by a Chinese company and ranks as China’s top-performing LLM. It holds its position behind some of the most advanced models from industry leaders like OpenAI and Google. This achievement reflects the advanced technology StepFun is building and its effort to contribute to the global AI community from within China.

Architecture and Technical Insights

The Step-2-16k model is built using MoE architecture, a design approach that allocates computational resources more efficiently compared to traditional fully-dense models. Mixture of Experts uses a routing mechanism that activates only a subset of the model’s parameters—the experts—for any given task, enabling the scaling of parameters without proportionally increasing computation. The trillion-parameter scale allows Step-2 to capture a nuanced understanding of language, offering substantial improvements in instruction-following capabilities and reasoning tasks. It also supports a context length of up to 16,000 tokens, which is particularly useful for applications requiring long-term dependencies, such as document analysis or complex conversations.
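The routing idea described above can be illustrated with a toy sketch. This is a minimal, self-contained illustration of top-k expert routing in general, not StepFun's actual implementation; the layer sizes, gating scheme, and class names are all assumptions chosen for readability.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class MoELayer:
    """Toy Mixture-of-Experts layer: a router scores all experts per
    token, but only the top-k experts actually run for that token."""
    def __init__(self, d_model, n_experts, top_k, seed=0):
        rng = np.random.default_rng(seed)
        self.router = rng.normal(size=(d_model, n_experts))  # gating weights
        self.experts = rng.normal(size=(n_experts, d_model, d_model)) / np.sqrt(d_model)
        self.top_k = top_k

    def __call__(self, x):
        # x: (tokens, d_model). Score every expert for every token.
        logits = x @ self.router                            # (tokens, n_experts)
        topk = np.argsort(logits, axis=-1)[:, -self.top_k:] # chosen expert ids
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            # Renormalize gate weights over the chosen experts only,
            # then mix just those experts' outputs. The other experts'
            # parameters are never touched for this token.
            gates = softmax(logits[t, topk[t]])
            for g, e in zip(gates, topk[t]):
                out[t] += g * (x[t] @ self.experts[e])
        return out, topk

layer = MoELayer(d_model=8, n_experts=16, top_k=2)
x = np.random.default_rng(1).normal(size=(4, 8))
y, chosen = layer(x)
print(y.shape, chosen.shape)  # 2 of 16 experts run per token
```

The key property is visible in the loop: compute per token scales with `top_k`, not with `n_experts`, which is how a trillion-parameter MoE model can keep inference cost far below that of a dense model of the same size.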

Performance Metrics and Areas for Improvement

Technically, the Step-2 model has demonstrated a range of strengths, with high scores in several areas. The model achieved an Instruction Following (IF) score of 86.57, indicating its ability to comprehend and act upon complex instructions. Additionally, Step-2 secured a reasoning score of 58.67 and a data analysis score of 54.86, highlighting its proficiency in processing and understanding information. However, the model showed room for improvement in coding and mathematics, scoring 46.87 and 48.88, respectively. Despite these areas needing further optimization, Step-2 effectively leverages MoE to balance parameter scale with task-specific efficiency. The model’s development focused heavily on research and development (R&D) rather than marketing, ensuring robust performance and reliability even at this large scale.

Significance and Accessibility

The significance of Step-2 lies in both its scale and its competitive edge as the first trillion-parameter model from a Chinese startup to achieve such a high ranking. As the AI community grows increasingly concerned with accessibility and inclusiveness, StepFun has made Step-2 accessible through its API platform, making it available for developers and researchers. Additionally, Step-2 has been integrated into the consumer application “Yuewen,” broadening its reach and offering the general public an opportunity to interact with a state-of-the-art language model. The model’s ranking—5th globally—demonstrates that Chinese startups are capable of producing high-quality AI systems, and it suggests a future where diverse players contribute significantly to the AI field, thereby reducing the concentration of AI expertise among only a few Western companies.
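For developers, access through an API platform typically looks like a chat-completion request. The sketch below is hypothetical: the endpoint URL, model name, and auth header are assumptions modeled on common OpenAI-compatible APIs, not confirmed details of StepFun's platform.

```python
import json

# Assumed endpoint, for illustration only.
API_URL = "https://api.stepfun.com/v1/chat/completions"

def build_request(prompt, model="step-2-16k", max_tokens=256):
    """Assemble the JSON body for a single-turn chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = build_request("Summarize the advantages of MoE architectures.")
print(json.dumps(body, indent=2))

# Sending it would look like (requires an API key):
#   requests.post(API_URL,
#                 headers={"Authorization": f"Bearer {KEY}"},
#                 json=body)
```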

Conclusion

StepFun’s Step-2 represents progress not only for the company but also for the Chinese AI community. By ranking 5th on Livebench, Step-2 showcases its capability in areas like instruction following and reasoning, while also highlighting areas where further refinement is needed, such as coding and mathematics. Built with an MoE architecture and equipped with a trillion parameters, Step-2’s strengths are a testament to the thoughtful application of advanced architectures for creating expansive and efficient models. With its accessible implementation via APIs and consumer integration, Step-2 also demonstrates StepFun’s commitment to bringing advanced technology to users worldwide. While there is work to be done, particularly in enhancing coding and mathematical capabilities, Step-2’s performance and architecture signify the increasing maturity of AI research and development from regions beyond the traditional powerhouses. This accomplishment positions StepFun as a key player in the AI landscape, setting the stage for further developments in AGI research and industry applications.

