MarkTechPost@AI | August 16, 2024
InfinityMath: A Scalable Instruction Tuning Dataset for Programmatic Mathematical Reasoning

 

InfinityMath is a scalable instruction tuning dataset for programmatic mathematical reasoning. It is designed to address the scalability and logical-consistency problems of models on mathematical reasoning, and it improves AI models' performance on math problems.

🎯 InfinityMath, proposed by a research team from the Beijing Academy of Artificial Intelligence and the China University of Mining & Technology, decouples the numerical values in math problems from the problems themselves, reducing the computational resources required to build large, diverse datasets.

📈 The dataset was built from seven high-quality math sources and contains over 101,380 data points. A multi-step pipeline provides maximum scalability and logical consistency, making it a comprehensive tool for strengthening the reasoning ability of AI models.

💪 Models fine-tuned on InfinityMath perform strongly across multiple benchmarks: Llama2 shows large accuracy gains on the GSM8K and MATH datasets, and CodeLlama improves markedly on SVAMP and SimulEq.

🌟 InfinityMath not only improves numerical accuracy but also tackles the two major challenges of scalability and logical consistency in mathematical reasoning, advancing the field and offering AI models a more effective solution.

A primary driver of artificial intelligence research in mathematical reasoning is the prospect of improving models' understanding of, and ability to solve, complex mathematical problems. Such capabilities matter in education, finance, and technology, fields that depend on accurate solutions delivered quickly. Gains in mathematical reasoning can also transfer to other specialized tasks and to logical processing more generally.

One of the most important challenges in this area is that building large-scale, high-quality datasets for mathematical reasoning is slow and costly. Traditional construction methods require substantial computational resources and large amounts of seed data, which makes them hard to scale. This limits models' exposure to a wide variety of math problems and leads to errors, especially when a problem's numerical values change. It also raises the issue of logical consistency: models adjust their reasoning incorrectly in response to such variations, reducing their reliability.

State-of-the-art techniques for improving mathematical reasoning in AI, such as Chain-of-Thought (CoT) and Program-of-Thought (PoT), either have models reason through a problem step by step or embed computation directly in the reasoning. Many of these methods, however, depend on large datasets and heavy computation, which limits their scalability. They also fail to fully address a key challenge: the inconsistencies that arise when a change in a problem's numerical values leads to incorrect deductions. A minimal illustration of the CoT/PoT distinction appears below.
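To make the distinction concrete, here is a minimal, hypothetical sketch in Python: a Chain-of-Thought answer carries the arithmetic in prose, while a Program-of-Thought answer is an executable program whose output is the final answer. The problem text and the `solve` function are illustrative assumptions, not taken from the paper.

```python
# Problem: "A store sells pens at 3 dollars each. Sam buys 7 pens
# and pays with a 50-dollar bill. How much change does he get?"

# Chain-of-Thought: the model writes out its reasoning as text and
# performs the arithmetic itself.
cot_answer = (
    "7 pens cost 7 * 3 = 21 dollars. "
    "Change from 50 dollars is 50 - 21 = 29 dollars. Answer: 29."
)

# Program-of-Thought: the model emits code, and the arithmetic is
# delegated to the interpreter instead of being done in prose.
def solve(price_per_pen: float, num_pens: int, paid: float) -> float:
    cost = price_per_pen * num_pens  # total cost of the pens
    return paid - cost               # change returned to the buyer

print(solve(3, 7, 50))  # -> 29
```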

A research team from the Beijing Academy of Artificial Intelligence and the China University of Mining & Technology has proposed InfinityMath, a scalable dataset for programmatic mathematical reasoning. According to the authors, InfinityMath decouples numerical values from the statements of math problems, so a large, diverse dataset can be created with a manageable amount of computational resources. The dataset was built from seven high-quality math sources and contains over 101,380 data points, making it a comprehensive tool for enhancing the reasoning ability of artificial intelligence models.

The InfinityMath methodology is a multi-step pipeline designed for maximum scalability and logical consistency. First, the numerical values in a math problem are masked, producing a generic template. These templates then serve as the basis for problem-solving programs that do not refer to specific numbers, so the same reasoning procedure holds for every possible numerical variation. This allows the data to scale efficiently and makes AI models more resilient across different mathematical challenges. The programs can be generated with a sophisticated language model such as GPT-4 to reduce potential errors and improve overall quality. A sketch of the masking step follows.
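As a rough illustration of the masking step, the sketch below replaces the numbers in a problem statement with named placeholders and pairs the resulting template with a number-agnostic solver. The regular expression, placeholder names, and solver are assumptions made for illustration; the paper's actual pipeline is more involved.

```python
import re

def mask_numbers(problem: str):
    """Replace each number in the problem with a placeholder like {var0}."""
    values = {}

    def repl(match: re.Match) -> str:
        key = f"var{len(values)}"
        values[key] = float(match.group())
        return "{" + key + "}"

    template = re.sub(r"\d+(?:\.\d+)?", repl, problem)
    return template, values

problem = "Sam buys 7 pens at 3 dollars each and pays with a 50 dollar bill."
template, values = mask_numbers(problem)
print(template)
# Sam buys {var0} pens at {var1} dollars each and pays with a {var2} dollar bill.
print(values)  # {'var0': 7.0, 'var1': 3.0, 'var2': 50.0}

# A number-agnostic solver written against the placeholders: the same
# program is valid for any instantiation of var0..var2.
def solve(var0: float, var1: float, var2: float) -> float:
    return var2 - var0 * var1  # change = amount paid - quantity * unit price

print(solve(**values))  # -> 29.0
```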

Models fine-tuned on the InfinityMath dataset performed well across several benchmarks. For example, a Llama2 model fine-tuned on the dataset showed relative accuracy improvements of 316.44% on GSM8K and 1067.6% on MATH. CodeLlama, fine-tuned on the same data, also improved substantially: 120.58% on SVAMP and 1118.09% on SimulEq. These results indicate that InfinityMath increases models' accuracy and robustness and improves their reliability across a range of mathematical problems. The fine-tuned models were also more logically consistent under numerical variations, an area where models trained on traditional datasets often degrade.
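The post does not define these percentages explicitly; the most natural reading (an interpretation on our part) is relative improvement over the un-fine-tuned baseline:

\[
\text{relative improvement} = \frac{\mathrm{acc}_{\text{fine-tuned}} - \mathrm{acc}_{\text{baseline}}}{\mathrm{acc}_{\text{baseline}}} \times 100\%
\]

Under that reading, a 316.44% improvement means the fine-tuned accuracy is roughly 4.16 times the baseline accuracy.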

The impact of InfinityMath extends beyond numerical accuracy to perhaps the most fundamental feature of mathematical reasoning: logical consistency. The authors ran stricter evaluations on augmented versions of existing test sets, GSM8K+ and MATH+, which differ from the originals only in their numerical values. Models trained on InfinityMath showed higher logical consistency on these variants than models trained on other datasets, in both accuracy and overall efficacy. This underlines InfinityMath's role in pushing the frontiers of scalable mathematical reasoning and in making an effective solution available to a very large class of AI models. A sketch of how such numeric-variant test sets can be built follows.
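The post does not describe how GSM8K+ and MATH+ were constructed; the sketch below shows one plausible way to generate such numeric variants, reusing the template-plus-values representation from the earlier snippet. The function name and the perturbation scheme are assumptions.

```python
import random

def numeric_variant(template: str, values: dict, seed: int) -> str:
    """Re-instantiate a masked problem with freshly sampled numbers."""
    rng = random.Random(seed)
    # Sample new positive integers near the original magnitudes; a real
    # pipeline would also verify that the problem stays well-posed.
    new_values = {k: rng.randint(2, max(3, int(v) * 2)) for k, v in values.items()}
    return template.format(**new_values)

template = ("Sam buys {var0} pens at {var1} dollars each "
            "and pays with a {var2} dollar bill.")
values = {"var0": 7, "var1": 3, "var2": 50}

# Three numeric variants of the same underlying problem: a logically
# consistent model should solve all of them with one and the same program.
for seed in range(3):
    print(numeric_variant(template, values, seed))
```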

In summary, InfinityMath is a major advance in mathematical reasoning, addressing two major challenges at once: scalability and logical consistency. Curated by a research team from the Beijing Academy of Artificial Intelligence and the China University of Mining & Technology, the dataset offers a robust, highly extensible foundation that can ultimately allow AI models to solve extremely complex mathematical problems. By separating numerical values from the solving process, the InfinityMath pipeline makes constructing a large, highly diversified dataset more efficient while enhancing the accuracy and reliability of the resulting models, gains that are visible across multiple benchmarks. The dataset could therefore further improve AI and its applications in various fields.


Check out the Paper and Dataset. All credit for this research goes to the researchers of this project.



