MarkTechPost@AI 2024年07月15日
Can We Teach Transformers Causal Reasoning? This AI Paper Introduces Axiomatic Training: A Principle-Based Approach for Enhanced Causal Reasoning in AI Models

A major challenge in artificial intelligence is training models to perform causal reasoning. Traditional training methods rely on large datasets containing explicitly labeled causal relationships, which are difficult to obtain and construct. Researchers from Microsoft Research, IIT Hyderabad, and MIT have proposed a new method called "axiomatic training," which trains models to understand and apply causal axioms rather than rely on data-driven statistical patterns. By focusing on a model's grasp of the underlying principles of causality instead of sheer data volume, axiomatic training improves generalization to unknown or complex causal structures.

🤔 **Axiomatic training:** To overcome the limitations of traditional causal-reasoning training, the researchers propose axiomatic training, which improves causal reasoning by teaching models to understand and apply causal axioms. Rather than depending on large datasets of labeled causal relationships, the method supplies demonstrations of causal axioms so the model learns the underlying principles of causality.

💡 **Axiom example:** As an example of axiomatic training, the researchers use the transitivity axiom: if A causes B and B causes C, then A should cause C. They generate linear causal chains with varying noise and edge orderings to give the model diverse training data, so that it can apply the learned axiom to larger and more complex causal graphs, including ones never seen during training.

💪 **Strong results:** In experiments with a 67-million-parameter Transformer, the axiomatically trained model generalized well to longer causal chains, reversed sequences, and complex tree structures. It reached accuracies of 0.85 on standard chains and 0.78 on randomly flipped chains of lengths 14-15, surpassing larger models such as GPT-4 and Gemini Pro.

🚀 **Outlook:** Axiomatic training offers a new route to stronger causal reasoning in AI models. By teaching models causal axioms, AI systems can handle complex causal structures more effectively and perform more capably across causal-reasoning tasks. The work points a direction for future AI research and applications, highlighting the advantages of principle-based training over traditional data-intensive methods.

👏 **Contribution:** The paper demonstrates the effectiveness of axiomatic training with notable results, opening a new direction for causal reasoning in AI models and providing a valuable reference for future research and applications.

Artificial intelligence (AI) has transformed traditional research, propelling it to unprecedented heights, yet significant gaps remain in other spheres of its application. A critical issue in AI is training models to perform causal reasoning. Traditional methods depend heavily on large datasets with explicitly marked causal relationships, which are often expensive and challenging to obtain. Researchers therefore seek innovative approaches that train AI models to comprehend and apply causal reasoning using more accessible data sources. This problem is pivotal because it directly affects the efficiency and accuracy of AI systems in understanding and reasoning about cause-and-effect relationships across applications.

Existing AI models typically use vast datasets where causal relationships are explicitly indicated or inferred through statistical patterns. For instance, large language models (LLMs) like GPT-4 have demonstrated some capability in causal reasoning. However, these models often struggle with unseen or complex causal structures. Current approaches include training on direct interventional data or pre-training models on datasets rich in causal information. Despite these efforts, significant limitations remain, especially in the models' ability to generalize across different causal scenarios.

Researchers from Microsoft Research, IIT Hyderabad, and MIT have introduced a novel method called axiomatic training to tackle these challenges. This approach trains models on multiple demonstrations of causal axioms or rules rather than relying solely on inductive biases or inferred data values. By exposing AI models to varied examples of these axioms, the researchers aim to enhance the models' ability to generalize causal reasoning to new and more complex scenarios. The method is particularly innovative in shifting the focus from data-intensive training to a principle-based approach.

The axiomatic training approach devised by the research team involves generating diverse training data that includes multiple demonstrations of a causal axiom. For example, the transitivity axiom is utilized, where if A causes B and B causes C, then A should cause C. To improve their generalization capabilities, the models were trained on linear causal chains with variations, including noise and reversed orders. This comprehensive training aims to enable models to apply learned axioms to larger and more intricate causal graphs, even those not encountered during training. The researchers designed different evaluation sets to test the models’ abilities, encompassing causal sequences with lengths beyond the training data and sequences with shuffled orders to assess structural understanding and application of the transitivity axiom to more complex networks.
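The data-generation recipe described above can be sketched in a few lines. This is a hedged illustration only: the paper's exact prompt format, noise scheme, and naming are not specified here, so every function and variable name below is hypothetical.

```python
import random

def make_chain(length):
    """Create node names for a linear causal chain, e.g. ['X0', 'X1', ...]."""
    return [f"X{i}" for i in range(length)]

def transitivity_example(length=5, shuffle_edges=False, n_distractors=0, rng=random):
    """Build one axiomatic demonstration: a premise listing the chain's
    edges, a 'does A cause B?' question, and the Yes/No label implied
    by the transitivity axiom."""
    nodes = make_chain(length)
    edges = [(nodes[i], nodes[i + 1]) for i in range(length - 1)]
    # Optional noise: distractor edges disconnected from the main chain.
    for d in range(n_distractors):
        edges.append((f"D{d}", f"D{d}b"))
    if shuffle_edges:
        rng.shuffle(edges)  # order variation: same graph, permuted statements
    premise = " ".join(f"{a} causes {b}." for a, b in edges)
    # Positive query: chain endpoints, linked by repeated transitivity.
    # Negative query: the reverse direction, which the axiom does not license.
    if rng.random() < 0.5:
        src, dst, label = nodes[0], nodes[-1], "Yes"
    else:
        src, dst, label = nodes[-1], nodes[0], "No"
    question = f"Does {src} cause {dst}?"
    return premise, question, label

premise, question, label = transitivity_example(length=4, shuffle_edges=True, n_distractors=1)
```

Varying `length`, `shuffle_edges`, and `n_distractors` yields the kind of diverse chains with noise and order perturbations the paragraph describes; longer or branching graphs would then be held out for evaluation.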

The performance and results of this research are remarkable. A 67 million parameter transformer model, trained using axiomatic demonstrations, showed exceptional generalization capabilities. It could extend its understanding to longer causal chains, reversed sequences, and complex branching structures, even outperforming larger models like GPT-4 and Gemini Pro in specific tests. For instance, the model achieved an accuracy rate of 0.85 for standard chains and 0.78 for randomly flipped chains of lengths 14-15. These results highlight the model’s ability to handle unseen scenarios effectively. Furthermore, the model demonstrated competitive performance compared to GPT-4, with substantial accuracy in causal chains of sizes 7-13, surpassing other LLMs like Gemini Pro and Phi-3 in various tasks.
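Accuracies like those above are scored against graph reachability: a causal query "does A cause C?" is true exactly when C lies in the transitive closure of A. A minimal sketch of such ground-truth scoring, assuming a Yes/No answer format (the function names are illustrative, not from the paper):

```python
from collections import defaultdict, deque

def reachable(edges, src, dst):
    """Ground truth for a causal query: dst is an effect of src iff dst is
    reachable from src along directed edges (the transitive closure)."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def accuracy(model_answers, edges, queries):
    """Fraction of Yes/No answers that match graph reachability."""
    correct = sum(
        (ans == "Yes") == reachable(edges, a, b)
        for ans, (a, b) in zip(model_answers, queries)
    )
    return correct / len(queries)
```

Because `reachable` follows only edge direction, it also yields the correct "No" labels for reversed and flipped chains, which is what makes those evaluation splits a test of structural understanding rather than surface pattern matching.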

To conclude, the research emphasizes the potential of axiomatic training in enhancing AI models’ causal reasoning abilities. By training models on fundamental causal axioms, researchers demonstrated that AI could effectively navigate complex causal structures. This method offers a more efficient and scalable approach to teaching causal reasoning, potentially transforming how AI systems are trained for causal inference tasks. The success of this method indicates a promising direction for future research and applications in AI, highlighting the importance of principle-based training over traditional data-intensive methods.


Check out the Paper. All credit for this research goes to the researchers of this project.

