EnterpriseAI, October 24, 2024
Spotting AI Hallucinations: MIT’s SymGen Speeds Up LLM Output Validation

SymGen, a new validation tool created at MIT, lets an LLM's answers carry citations back to their sources, making it easy for human validators to trace information to its origin and improving the transparency of, and trust in, AI responses. The tool helps validators quickly verify an LLM's responses and gives people greater confidence, speeding up verification by about 20 percent compared with manual procedures. Its current version requires structured data in tabular form, however, and the researchers are exploring ways to extend its capabilities to unstructured data.

💡SymGen is a new tool developed at MIT that enables an LLM to generate responses with citations pointing directly to specific source documents, down to individual cells in the data, making it easy for human validators to trace information back to its source.

👀The system lets a validator hover over highlighted portions of a text response to see the data the AI model used to generate a specific word or phrase; unhighlighted portions call for closer scrutiny.

🚀SymGen sped up verification time by about 20 percent compared with manual procedures, but its current version requires structured data in tabular form; the researchers are exploring ways to extend it to unstructured data.

🌐Tools like SymGen are a promising means of combating AI hallucinations; Google's DataGemma is another system designed to improve LLM accuracy.

Imagine if your LLM could not only provide answers but also show you exactly where those answers came from—like a scholar meticulously citing sources. A new validation tool created at MIT aims to do just that, giving human validators the ability to trace every piece of information back to its origin in a dataset, which could lead to greater transparency and trust in the AI's responses.

The new tool, called SymGen, was developed by MIT researchers to help human validators quickly verify an LLM's responses, according to reporting from MIT News. SymGen enables an LLM to generate responses with citations that point directly to a specific source document, down to the individual cell in the data.
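
MIT News does not detail the mechanics, but the cell-level citation behavior can be pictured with a short sketch: the model emits text containing symbolic placeholders that name a row and column in the source table, and a deterministic resolver copies in the cited values, so every resolved span carries an exact provenance pointer. The placeholder syntax, the `resolve` helper, and the sample table below are all illustrative assumptions, not SymGen's actual implementation:

```python
# Illustrative sketch of symbolic, cell-level citation (not the authors' code).
import re

# Hypothetical source table the model is asked to summarize.
table = [
    {"team": "Boston", "wins": 12, "losses": 4},
    {"team": "Chicago", "wins": 9, "losses": 7},
]

# Text as a model might emit it: symbolic references instead of copied values.
symbolic_output = "{{0.team}} leads with {{0.wins}} wins and {{0.losses}} losses."

def resolve(text, data):
    """Replace each {{row.column}} placeholder with the cited cell value and
    record a provenance list mapping resolved spans to their source cells."""
    provenance = []

    def substitute(match):
        row, col = int(match.group(1)), match.group(2)
        value = str(data[row][col])
        provenance.append((value, row, col))  # which cell this span came from
        return value

    resolved = re.sub(r"\{\{(\d+)\.(\w+)\}\}", substitute, text)
    return resolved, provenance

text, sources = resolve(symbolic_output, table)
print(text)     # Boston leads with 12 wins and 4 losses.
print(sources)  # [('Boston', 0, 'team'), ('12', 0, 'wins'), ('4', 0, 'losses')]
```

Because the cited values are copied from the table rather than generated token by token, a validator can jump from any resolved span straight to the exact cell it came from.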

The system allows a validator to hover over highlighted portions of a text response to see the data the AI model used to generate a specific word or phrase, MIT News said; unhighlighted portions are not linked to specific data and need to be scrutinized more closely.
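
Continuing the sketch above (again an assumption about how the interface might work, not SymGen's actual UI code), a provenance map like the one produced by `resolve` is what would drive that highlighting: spans with a recorded source cell render as linked, while free-form connective text is flagged for review:

```python
# Hypothetical validator view: linked spans trace to a cell, the rest is flagged.
segments = [
    ("Boston", (0, "team")),    # linked to a cell
    (" leads with ", None),     # model's own wording: unlinked
    ("12", (0, "wins")),
    (" wins and ", None),
    ("4", (0, "losses")),
    (" losses.", None),
]

for span, cell in segments:
    if cell is not None:
        print(f"LINKED  {span!r} -> row {cell[0]}, column {cell[1]!r}")
    else:
        print(f"REVIEW  {span!r}")
```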

“We give people the ability to selectively focus on parts of the text they need to be more worried about. In the end, SymGen can give people higher confidence in a model’s responses because they can easily take a closer look to ensure that the information is verified,” says Shannon Shen, an electrical engineering and computer science graduate student and co-lead author of a paper on SymGen, as quoted in MIT News.

Using generative AI models to interpret complex data can be a high-consequence endeavor, especially in fields like healthcare and finance, or in scientific applications where accuracy is essential. While LLMs can process vast amounts of data and generate responses quickly, they also frequently hallucinate, giving information that can sound plausible but is erroneous, biased, or imprecise.

Human validation is a key factor in improving LLM accuracy because it provides a critical layer of oversight that AI models often lack. Human validators help ensure the quality of the output by cross-referencing facts, identifying inconsistencies, and correcting errors that the model may overlook. This iterative process not only refines the LLM's performance but also helps address issues like hallucinations and misinformation, making the model more reliable and trustworthy over time.

Generating citations is nothing new for LLMs, but they often point to external documents, and sorting through them can be time-consuming. The researchers said they approached the time problem from the perspective of the humans doing this tedious validation work: “Generative AI is intended to reduce the user’s time to complete a task. If you need to spend hours reading through all these documents to verify the model is saying something reasonable, then it’s less helpful to have the generations in practice,” Shen said.

It appears SymGen could help validators work more quickly: in a user study, Shen and his team found that it sped up verification time by about 20 percent compared with manual procedures.

Data quality continues to be a vital factor in validating LLM output, even with tools like SymGen. As always, an AI model's reliability hinges on the quality and credibility of the data it’s trained on. One caveat is that SymGen’s current iteration requires structured data in a tabular format. The researchers are exploring ways to augment SymGen’s capabilities to include unstructured data and other formats. MIT News also noted that researchers are planning to test SymGen with physicians to study how it could identify errors in AI-generated clinical summaries.

SymGen is another promising tool in the fight against hallucinations. Another example is Google’s recently launched DataGemma, a system designed to connect LLMs with extensive real-world data drawn from Google's Data Commons, a large repository of public data. DataGemma integrates Data Commons within Google’s Gemma family of lightweight open models and uses two techniques, retrieval-interleaved generation and retrieval-augmented generation, to enhance LLM accuracy and reasoning.
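
The two techniques differ mainly in when the trusted data enters the loop. The contrast can be sketched roughly as below; `query_data_commons` and `generate` are stand-ins invented for illustration, and the control flow is a simplification of what DataGemma actually does:

```python
# Rough contrast of the two retrieval techniques (an illustrative sketch,
# not Google's DataGemma implementation).

def query_data_commons(question: str) -> str | None:
    """Stand-in for a lookup against a trusted statistical data store."""
    facts = {"US population 2023": "about 334.9 million"}
    return facts.get(question)

def generate(prompt: str) -> str:
    """Stand-in for an LLM call."""
    return f"<model output for: {prompt}>"

def rag(user_question: str, lookup_key: str) -> str:
    # Retrieval-augmented generation: fetch trusted data first, then
    # generate with the evidence placed in the prompt.
    evidence = query_data_commons(lookup_key) or "no data found"
    return generate(f"Evidence: {evidence}\nQuestion: {user_question}")

def rig(user_question: str, lookup_key: str) -> str:
    # Retrieval-interleaved generation: generate first, then check stated
    # statistics against the data store and annotate them.
    draft = generate(user_question)
    checked = query_data_commons(lookup_key)
    return draft if checked is None else f"{draft} [Data Commons: {checked}]"

print(rag("What was the US population in 2023?", "US population 2023"))
print(rig("What was the US population in 2023?", "US population 2023"))
```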

With exciting new tools like SymGen and DataGemma leading the charge, we may be looking at a future where AI hallucinations are nothing but a distant memory.

Read more about the technical features of SymGen in the original MIT News report by Adam Zewe; an accompanying academic paper is also available.
