MarkTechPost@AI September 24, 2024
Trust-Align: An AI Framework for Improving the Trustworthiness of Retrieval-Augmented Generation in Large Language Models

TRUST-ALIGN is a new framework designed to improve the reliability of large language models (LLMs) on retrieval-augmented generation (RAG) tasks by fine-tuning models so that their outputs are more accurate and grounded in the provided documents. The framework also introduces a new evaluation metric, TRUST-SCORE, which assesses models along multiple dimensions, including whether a question can be answered from the provided documents and the accuracy of citations to relevant sources.

😊 **Advantages of TRUST-ALIGN:** The framework fine-tunes LLMs on a training dataset of 19,000 question-document pairs, which combine natural responses from LLMs such as GPT-4 with negative responses derived from common hallucinations. The strength of this approach is that it directly optimizes LLM behavior, teaching the model to issue grounded refusals when necessary so that it only answers when sufficient information is available. It also improves citation accuracy by guiding the model to cite the most relevant portions of the documents, preventing over-citation and improper attribution.

🤔 **Performance gains:** TRUST-ALIGN shows substantial improvements across several benchmark datasets. For example, on the ASQA dataset, a LLaMA-3-8b model aligned with TRUST-ALIGN improved its TRUST-SCORE by 10.73%, surpassing models such as GPT-4 and Claude-3.5 Sonnet. On the QAMPARI dataset the method outperformed baseline models by 29.24%, and on ELI5 it delivered a 14.88% gain. These figures indicate that TRUST-ALIGN is more effective than other methods at producing accurate and reliable responses.

🚀 **Key improvements:** One of the main improvements TRUST-ALIGN brings is the model's ability to correctly refuse to answer when the available documents are insufficient. The refusal metric improved by 9.87% on ASQA and by 22.53% on QAMPARI, and ELI5 further highlighted this capability with a 5.32% gain. These results show that the framework improves accuracy and significantly reduces the model's tendency to over-answer questions without proper grounding in the provided documents.

🌟 **Better citation quality:** TRUST-ALIGN also achieves notable gains in citation quality. Citation precision rose by 26.67% on ASQA, citation recall increased by 31.96% on QAMPARI, and the ELI5 dataset showed a 29.30% improvement. More reliable citations ensure the model provides well-supported answers, making it more trustworthy for users who depend on fact-grounded systems.

💡 **Conclusion:** This research addresses a critical issue in deploying large language models in real-world applications. By developing TRUST-SCORE and the TRUST-ALIGN framework, the researchers have created a reliable way to align LLMs toward document-grounded responses, minimizing hallucinations and improving overall trustworthiness. This advance is especially significant in fields where accuracy and well-cited information are paramount, paving the way for more reliable AI systems.

Large language models (LLMs) have gained significant attention due to their potential to enhance various artificial intelligence applications, particularly in natural language processing. When integrated into frameworks like Retrieval-Augmented Generation (RAG), these models aim to refine AI systems’ output by drawing information from external documents rather than relying solely on their internal knowledge base. This grounding is crucial for keeping AI-generated content factually accurate, a persistent weakness of models that are not tied to external sources.
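
To make the RAG setup concrete, here is a minimal Python sketch of document-grounded prompting: retrieved passages are numbered and the model is instructed to answer only from them and to cite them by index. The `build_grounded_prompt` helper and the instruction wording are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of document-grounded prompting in a RAG setup.
# The helper name and instruction wording are illustrative, not from the paper.

def build_grounded_prompt(question: str, documents: list[str]) -> str:
    """Assemble a prompt that asks the model to answer only from the
    retrieved documents and to cite them by index, e.g. [1], [2]."""
    doc_block = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(documents))
    return (
        "Answer the question using only the documents below. "
        "Cite supporting documents like [1]. If the documents do not "
        "contain the answer, say you cannot answer.\n\n"
        f"Documents:\n{doc_block}\n\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    docs = [
        "The Eiffel Tower was completed in 1889 for the World's Fair.",
        "The tower is located on the Champ de Mars in Paris.",
    ]
    print(build_grounded_prompt("When was the Eiffel Tower completed?", docs))
```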

A key problem in this area is the occurrence of hallucinations in LLMs, where models generate seemingly plausible but factually incorrect information. This becomes especially problematic in tasks requiring high accuracy, such as answering factual questions or assisting in legal and educational fields. Many state-of-the-art LLMs rely heavily on parametric knowledge (information learned during training), making them unsuitable for tasks where responses must come strictly from specific documents. Tackling this issue requires new methods to evaluate and improve the trustworthiness of these models.

Traditional methods focus on evaluating the end results of LLMs within the RAG framework, but few examine the intrinsic trustworthiness of the models themselves. Current approaches, such as prompting techniques, try to align the models’ responses with document-grounded information, but they often fall short: they either fail to adapt the models’ underlying behavior or make them overly sensitive, so that they respond inappropriately. The researchers therefore identified the need for a new metric that measures LLM performance and ensures the models provide grounded, trustworthy responses based solely on the retrieved documents.

Researchers from the Singapore University of Technology and Design, in collaboration with DSO National Laboratories, introduced a novel framework called “TRUST-ALIGN.” This method focuses on enhancing the trustworthiness of LLMs in RAG tasks by aligning their outputs to provide more accurate, document-supported answers. The researchers also developed a new evaluation metric, TRUST-SCORE, which assesses models based on multiple dimensions, such as their ability to determine whether a question can be answered using the provided documents and their precision in citing relevant sources.
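
The article names the dimensions TRUST-SCORE covers but not its exact formula, so the following Python sketch is a hypothetical composite: it averages how often the model's refusal decision matches answerability with the citation F1 on answered examples. The `Example` fields, the equal weighting, and the averaging scheme are all assumptions for illustration.

```python
# Hypothetical sketch of a TRUST-SCORE-like composite. The article lists the
# dimensions (answerability judgment and citation accuracy) but not the exact
# formula, so the equal-weight average below is an illustrative assumption.

from dataclasses import dataclass


@dataclass
class Example:
    answerable: bool           # can the question be answered from the documents?
    model_refused: bool        # did the model decline to answer?
    citation_precision: float  # fraction of the model's citations that support its claims
    citation_recall: float     # fraction of the needed evidence that the model cited


def trust_score_sketch(examples: list[Example]) -> float:
    """Average two components: (1) how often the refusal decision matches
    answerability, and (2) mean citation F1 over the answered examples."""
    decision_acc = sum(
        (not ex.answerable) == ex.model_refused for ex in examples
    ) / len(examples)

    def f1(p: float, r: float) -> float:
        return 0.0 if p + r == 0 else 2 * p * r / (p + r)

    answered = [ex for ex in examples if not ex.model_refused]
    citation_f1 = (
        sum(f1(ex.citation_precision, ex.citation_recall) for ex in answered) / len(answered)
        if answered
        else 0.0
    )
    return (decision_acc + citation_f1) / 2
```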

TRUST-ALIGN works by fine-tuning LLMs using a dataset containing 19,000 question-document pairs, each labeled with preferred and unpreferred responses. This dataset was created by synthesizing natural responses from LLMs like GPT-4 and negative responses derived from common hallucinations. The key advantage of this method lies in its ability to directly optimize LLM behavior toward providing grounded refusals when necessary, ensuring that models only answer questions when sufficient information is available. It improves the models’ citation accuracy by guiding them to reference the most relevant portions of the documents, thus preventing over-citation or improper attribution.
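
As a rough illustration of what one such preference pair might look like, the sketch below contrasts a grounded, cited response with a hallucinated one for the same question and documents. The schema, field names, and example text are invented for illustration, and the article does not specify which preference-optimization objective the authors actually use.

```python
# Illustrative shape of one TRUST-ALIGN-style training record: a question and
# its retrieved documents paired with a preferred (grounded and cited, or
# correctly refusing) response and an unpreferred (hallucinated) one.
# The schema is an assumption; the released dataset may be structured differently.

preference_record = {
    "question": "Who designed the Eiffel Tower?",
    "documents": [
        "[1] The Eiffel Tower was designed by the engineering firm of Gustave Eiffel.",
        "[2] It was built between 1887 and 1889 as the entrance to the 1889 World's Fair.",
    ],
    # Preferred: answers only what the documents support and cites them.
    "preferred": "The tower was designed by Gustave Eiffel's engineering firm [1].",
    # Unpreferred: a plausible-sounding but unsupported (hallucinated) claim.
    "unpreferred": "It was designed by Le Corbusier in 1910.",
}

# A preference-optimization objective (e.g., DPO-style; the article does not
# confirm the exact method) would then push the model toward "preferred" and
# away from "unpreferred" for each record.
```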

Regarding performance, the introduction of TRUST-ALIGN showed substantial improvements across several benchmark datasets. For example, when evaluated on the ASQA dataset, LLaMA-3-8b, aligned with TRUST-ALIGN, achieved a 10.73% increase in the TRUST-SCORE, surpassing models like GPT-4 and Claude-3.5 Sonnet. On the QAMPARI dataset, the method outperformed the baseline models by 29.24%, while the ELI5 dataset showed a performance boost of 14.88%. These figures demonstrate the effectiveness of the TRUST-ALIGN framework in generating more accurate and reliable responses compared to other methods.

One of the most significant improvements brought by TRUST-ALIGN was in the models’ ability to correctly refuse to answer when the available documents were insufficient. On ASQA, the refusal metric improved by 9.87%, while on QAMPARI it showed an even larger increase of 22.53%. The ability to refuse was further highlighted in ELI5, where the improvement reached 5.32%. These results indicate that the framework enhanced the models’ accuracy and significantly reduced their tendency to over-answer questions without proper justification from the provided documents.

Another noteworthy achievement of TRUST-ALIGN was in improving citation quality. On ASQA, the citation precision scores rose by 26.67%, while on QAMPARI, citation recall increased by 31.96%. The ELI5 dataset also showed an improvement of 29.30%. This improvement in citation groundedness ensures that the models provide well-supported answers, making them more trustworthy for users who rely on fact-based systems.
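
For readers unfamiliar with these metrics, the sketch below computes set-based citation precision and recall for a single answer. The paper may score citations at a finer granularity (for example, per-statement entailment checks), so this set-level version and its handling of empty citation sets are simplifying assumptions.

```python
# Sketch of set-based citation precision and recall for one answer.
# Scoring at statement level with entailment checks is common in practice;
# this coarser, set-level version is a simplification.

def citation_precision_recall(cited: set[int], gold: set[int]) -> tuple[float, float]:
    """cited: document indices the model cited; gold: indices that actually
    support the answer. Precision penalizes over-citation, recall penalizes
    missing evidence. Conventions for empty sets vary; zeros are used here."""
    if not cited:
        return 0.0, 1.0 if not gold else 0.0
    precision = len(cited & gold) / len(cited)
    recall = len(cited & gold) / len(gold) if gold else 1.0
    return precision, recall


if __name__ == "__main__":
    print(citation_precision_recall(cited={1, 3}, gold={1, 2}))  # (0.5, 0.5)
```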

In conclusion, this research addresses a critical issue in deploying large language models in real-world applications. By developing TRUST-SCORE and the TRUST-ALIGN framework, researchers have created a reliable method to align LLMs toward generating document-grounded responses, minimizing hallucinations, and improving overall trustworthiness. This advancement is particularly significant in fields where accuracy and the ability to provide well-cited information are paramount, paving the way for more reliable AI systems in the future.


Check out the Paper and GitHub page. All credit for this research goes to the researchers of this project.

