Communications of the ACM - Artificial Intelligence
Nonsense and Malicious Packages: LLM Hallucinations in Code Generation

Artificial intelligence (AI) is seeing ever-wider use in software development, but the "hallucinations" produced by large language models (LLMs) during code generation, in which a model fabricates information in order to complete its task, are creating challenges. These hallucinations can yield code that does not work, waste developers' time, and even raise cybersecurity threats. Researchers are actively exploring remedies, including autonomous agents that automatically repair code (such as RepairAgent) and techniques for reducing hallucinations (such as De-Hallucinator), the latter of which injects project-specific knowledge to improve model output. The community is also working to classify and detect code hallucinations and to study their cybersecurity implications, for example the potential for "package hallucinations" to be exploited maliciously. Countermeasures include improving model training and combining retrieval-augmented generation (RAG) with fine-tuning to make AI-generated code more accurate and secure.

💻 **Hallucinations in AI code generation:** To complete an instruction, large language models (LLMs) sometimes "make things up" when generating code, producing hallucinations. In code generation this shows up as code that does not run or does not meet the actual requirements, which wastes developers' time and can introduce cybersecurity risks.

🔧 **Techniques and methods for mitigating code hallucinations:** Researchers are actively developing countermeasures. Autonomous agents such as RepairAgent can correct faulty code by calling external resources such as code-search tools, which also helps handle hallucinations. The De-Hallucinator technique combines project-specific API references with iterative injection of that information into the prompt, exploiting the "credible-sounding" nature of LLM hallucinations to identify and correct errors, turning this flaw of LLMs into an advantage.

🛡️ **The cybersecurity threat posed by code hallucinations:** A "package hallucination" occurs when a model's generated code recommends or references a software package that does not exist, giving malicious attackers an opening. An attacker can register a package of the same name laced with malicious code and lure developers into downloading it unknowingly, mounting a "package confusion attack." Research has found that package hallucinations are especially prevalent in open-source models and hard to eliminate.

📈 **How the community and researchers are responding:** Researchers are classifying code hallucinations through execution verification and developing tools to detect and quantify them (such as CodeHalu), along with benchmarks for evaluating LLM performance (such as HALLUCODE). Countermeasures also include combining retrieval-augmented generation (RAG) with curated package lists and model fine-tuning, to ensure the accuracy and security of AI-generated code and make LLMs a more reliable aid to software development.

The goal of generative AI tools, powered by large language models (LLMs), is to finish the task assigned to them; to provide a complete response to a prompt. As is now well-established, models sometimes make things up, or hallucinate, to achieve this. In natural language outputs, hallucinations have degrees of seriousness—minimal in shopping lists, possibly consequential in scientific texts. In code generation, hallucinations are easier to spot and the consequences are clear: the code doesn’t work, either properly or at all.

AI is already upending how developers work across multiple programming languages. According to Google’s 2024 State of DevOps Report, 74.9% of survey respondents said they are already using AI in software development. Yet the thorny issue of hallucinations remains.

Researchers cognizant of the impact of hallucinations—from time wasted fixing errors to cybersecurity threats—are now devising mitigation methods.

Not Knowing Is Not an Option

“Models are trained to always say something. They are rarely trained to say, ‘Oh, I don’t know,’ which is what a human would sometimes do,” said Michael Pradel, a computer science professor and software engineering expert at the University of Stuttgart in Germany. “If the AI is suggesting code that is nonsense and doesn’t refer to the actual code base or libraries they are working with, it’s not super-helpful.”

Pradel said that when a model starts hallucinating, it can take a developer more time to fix the generated code than it would to write it manually. Companies that work on private code bases are especially vulnerable, he said, as popular LLMs trained on public data mined from online sources have not seen their code. He explained, “The chance to get hallucinations is much, much bigger because the models don’t know the facts about their code base.”

While companies with sufficient resources may develop their own models trained on their own code, for others “off the shelf” is the only option, said Pradel. “These off-the-shelf models just don’t know the code base of a small or medium-sized company.”

Agentic approaches, where LLM-based agents seek out and call APIs, code bases, and documentation, can help mitigate hallucinations. With co-authors Islem Bouzenia, also of the University of Stuttgart, and Premkumar Devanbu of the University of California, Davis, Pradel developed RepairAgent, an autonomous agent that automatically fixes bugs by modifying code. “We do this by letting the LLM actively call different tools, like code search tools, which has the nice benefit of also handling hallucinations,” Pradel said.
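
How such an agent loop fits together can be sketched roughly as follows. This is not the authors' implementation; `query_llm` and `search_codebase` are hypothetical stand-ins for a real model API and a project code-search tool, and the repair task is invented for illustration.

```python
# Minimal sketch of an agentic repair loop in the spirit of RepairAgent.
# NOT the authors' implementation: query_llm() and search_codebase() are
# hypothetical stand-ins for a real LLM API and a project code-search tool.

def search_codebase(query: str) -> str:
    """Stand-in for a code-search tool over the project repository."""
    return f"(snippets from the repository matching {query!r})"

def query_llm(messages: list[dict]) -> dict:
    """Stand-in for an LLM call; a real agent would parse the model's JSON reply."""
    if any(m["content"].startswith("Search result") for m in messages):
        return {"action": "patch", "argument": "return parse_date(raw) if raw else None"}
    return {"action": "search", "argument": "parse_date"}

def repair(buggy_code: str, failing_test: str, max_steps: int = 5) -> str:
    """Let the model alternate between consulting tools and proposing a patch."""
    messages = [{"role": "user",
                 "content": f"Fix this bug.\nCode:\n{buggy_code}\nFailing test:\n{failing_test}"}]
    for _ in range(max_steps):
        reply = query_llm(messages)
        if reply["action"] == "search":
            # Grounding step: real code from the repository flows back into the
            # conversation, shrinking the room for hallucinated APIs.
            messages.append({"role": "user",
                             "content": "Search result:\n" + search_codebase(reply["argument"])})
        else:  # "patch"
            return reply["argument"]  # candidate fix, to be validated by re-running the tests
    raise RuntimeError("no fix produced within the step budget")

print(repair("return parse_date(raw)", "test_handles_empty_input"))
```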

In another approach, Pradel and Ph.D. researcher Aryaz Eghbali have presented De-Hallucinator, a technique for mitigating LLM hallucinations within project-specific APIs by combining relevant API references with iterative grounding. “The idea of De-Hallucinator is to use this bad feature of LLMs to our advantage,” said Pradel.

The method exploits the fact that hallucinations often sound credible. When a model hallucinates an API, De-Hallucinator automatically identifies existing, project-specific APIs with similar names; it then iteratively augments the prompt with those names. Said Pradel, “We give this as additional context, we inject a little bit of project-specific knowledge into the prompt based on what the LLM has suggested in the first round.”
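
In rough outline, the re-prompting step might look like the sketch below. It is not the published tool: the standard-library `difflib` matcher stands in for whatever similarity measure the real system uses, and the prompt wording and example APIs are invented.

```python
# Rough illustration of the De-Hallucinator idea (not the published tool):
# match a hallucinated API name against real project APIs and re-prompt.
import difflib

def similar_project_apis(hallucinated: str, project_apis: list[str], n: int = 3) -> list[str]:
    """Return project APIs whose names resemble the hallucinated one."""
    return difflib.get_close_matches(hallucinated, project_apis, n=n, cutoff=0.5)

def second_round_prompt(original_prompt: str, hallucinated: str, project_apis: list[str]) -> str:
    """Build a second-round prompt augmented with project-specific API references."""
    candidates = similar_project_apis(hallucinated, project_apis)
    context = "\n".join(f"- {api}" for api in candidates)
    return (f"{original_prompt}\n\n"
            f"Only use APIs that exist in this project. Relevant APIs:\n{context}")

# Example: the model invented `load_config_file`, but the project has `load_config`.
apis = ["load_config", "save_config", "parse_cli_args"]
print(second_round_prompt("Complete: cfg = ", "load_config_file", apis))
```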

Pradel and Eghbali evaluated De-Hallucinator on code completion and test generation tasks using Python and JavaScript, and concluded that the technique “significantly improves the quality of generations over the state-of-the-art baselines.”

Coming to Grips with the Issue

Setting out to build community understanding of the challenges code hallucinations pose, a collaboration of U.S., Chinese, and Japanese researchers has presented a method of classifying them using execution verification. The researchers defined four main categories of hallucination: mapping, naming, resource, and logic, each of which has sub-categories. They have also developed a publicly available algorithm, CodeHalu, for detecting and quantifying code hallucinations, and CodeHaluEval, a benchmark for evaluating them.
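
Execution verification itself is a simple idea: run the generated snippet and see whether, and how, it fails. The sketch below is illustrative only and does not reproduce the CodeHalu algorithm or its category definitions; the mapping from Python exceptions to hallucination types is an assumption made for demonstration.

```python
# Illustrative execution-verification check, loosely in the spirit of CodeHalu
# (not the paper's algorithm): run a generated snippet and inspect how it fails.
import subprocess
import sys

def execute_and_classify(snippet: str, timeout_s: int = 5) -> str:
    """Run `snippet` in a fresh interpreter and return a coarse verdict."""
    try:
        result = subprocess.run([sys.executable, "-c", snippet],
                                capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "did not terminate (possible logic hallucination)"
    if result.returncode == 0:
        return "executed cleanly"
    stderr = result.stderr
    if "ModuleNotFoundError" in stderr:
        return "imports a package that is not available (possible resource hallucination)"
    if "NameError" in stderr or "AttributeError" in stderr:
        return "uses an identifier that does not exist (possible naming hallucination)"
    return "failed with another runtime error"

print(execute_and_classify("import totally_made_up_pkg"))
```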

Meanwhile, a team of researchers from Beihang University in Beijing, Shandong University in Qingdao, and Huawei Cloud Computing Technologies has used open coding and iterative refinement to develop a taxonomy of hallucinations that includes categories and sub-categories such as knowledge conflicts, inconsistency, repetition, and dead code. The work also features a benchmark, named HALLUCODE, for evaluating LLMs’ performance in recognizing code hallucinations.

An Unwanted Package

Hallucinations may also pose a viable threat to cybersecurity. Researchers from the University of Texas at San Antonio, the University of Oklahoma, and Virginia Tech recently posited that package hallucinations—which occur when a model generates code that recommends or references a package that doesn’t exist—are ripe for exploitation by malicious attackers.

Code generation has structures that models must follow to generate functioning code, but that makes it easier for models to “pin themselves in a corner,” said lead author and Ph.D. cybersecurity researcher Joseph Spracklen. He explained that as there is a finite list of packages that exist, “It’s possible for the model to choose tokens in such a way that suddenly it says, ‘Oh wait, stop, there’s nothing I can pick from at this point that would lead to a valid package.’” Unable to express uncertainty, models are then forced to hallucinate package names to complete their task.

Spracklen and the other researchers analyzed 576,000 Python and JavaScript code samples generated by 16 commercial and open-source LLMs, finding an average of 5.2% hallucinated packages from the commercial models and 21.7% from the open-source ones. Said Spracklen, “Not only are these hallucinations prevalent, but they are persistent, and they are prone to being regenerated multiple times.”

The evidence of repetition in hallucinated names is particularly ominous because it raises the chances of so-called “package confusion attacks.” According to Spracklen, once malicious actors have identified package hallucinations—perhaps by parsing for package names in LLM output and looking for those that do not exist in reality—they can register a package “laced with malicious code” under the hallucinated name on public repositories, such as PyPI [for Python] or npm [for JavaScript]. Said Spracklen, “It’s very straightforward because these are open-source repositories, they’re anonymous.” Then the attacker would “just sit back and wait,” he continued.

Compromises occur when a developer, using an LLM as part of their normal workflow, unwittingly downloads the malicious package. “Similar to a phishing attack, they’re exploiting the trust that a user has in the model, and they are relying on the user not to do their due diligence,” Spracklen said.

Deploying Retrieval Augmented Generation (RAG) strategies, such as having an LLM check whether a package is on a master list, is not sufficient mitigation, said Spracklen. “The insidious thing about this vulnerability is that once a malicious actor publishes a malicious package, now that package is on the master list.” Instead, Spracklen proposes combining RAG with curated package lists and fine-tuning techniques. “You fine-tune your LLM based off of known good responses that it’s given in the past and known good packages,” he said.
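
As a hypothetical illustration of the curated-list half of that advice, a pre-install gate could refuse any LLM-suggested dependency that is not on a vetted allowlist and flag names that do not exist on PyPI at all, queried here via PyPI's public JSON endpoint; the allowlist contents and helper names are invented for the example.

```python
# Hypothetical pre-install gate for LLM-suggested dependencies:
# accept only packages on a curated allowlist, and flag names that do not
# even exist on PyPI (a strong hint of a package hallucination).
import urllib.error
import urllib.request

CURATED_ALLOWLIST = {"requests", "numpy", "pandas"}  # example vetted packages

def exists_on_pypi(name: str) -> bool:
    """Check PyPI's public JSON API for the package name."""
    try:
        with urllib.request.urlopen(f"https://pypi.org/pypi/{name}/json", timeout=5):
            return True
    except urllib.error.HTTPError:
        return False

def vet_package(name: str) -> str:
    if name in CURATED_ALLOWLIST:
        return "allowed: on the curated list"
    if not exists_on_pypi(name):
        return "blocked: not on PyPI at all, likely hallucinated"
    return "blocked: exists on PyPI but has not been vetted"

for pkg in ["requests", "totally-made-up-llm-package"]:
    print(pkg, "->", vet_package(pkg))
```

Blocking unknown-but-existing packages, rather than only nonexistent ones, is what matters here: by the time a hallucinated name is checked, an attacker may already have registered it.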

Hallucinations are increasingly viewed as a feature of LLMs rather than a bug, and they appear—so far—to be baked in. However, users are adapting. Just as some content producers have pivoted towards fact-checking generated text rather than creating their own, some developers may find themselves increasingly playing the role of hallucination-fixer.

Karen Emslie is a location-independent freelance journalist and essayist.
