TechCrunch News, December 4, 2024
AWS’ new service tackles AI hallucinations

Amazon Web Services (AWS) has released a new tool called Automated Reasoning checks, aimed at tackling the "hallucinations" produced by generative AI models. The tool validates the accuracy of a model's responses by cross-referencing customer-supplied information and presents the correct answer for reference. AWS also introduced Model Distillation and multi-agent collaboration, for distilling models and for dividing AI work across large projects, respectively. These new capabilities are meant to help customers use generative AI more effectively, but they come with limitations: Model Distillation only works with Anthropic and Meta models, and distilled models lose some accuracy.

🤔 **Automated Reasoning checks:** AWS's new tool for reducing AI "hallucinations" by validating a model's responses against customer-supplied information. It can flag likely errors and supply the correct answer, helping users gauge how reliable the model is.

🔄 **Model Distillation:** AWS's model distillation tool, which transfers the capabilities of a large model into a smaller, more cost-effective one, for example moving Llama 405B's abilities into Llama 8B. This lets users experiment with different models without bearing the full cost of the largest ones.

🤝 **Multi-agent collaboration:** A new Bedrock Agents feature that lets users assign AI to subtasks within a larger project, such as reviewing financial records or assessing global trends. Users can designate a "supervisor agent" to distribute tasks and manage the AIs so that they work together and their results are synthesized.

⚠️ **The nature of AI "hallucinations":** AI models don't actually "know" anything; they make predictions based on patterns in data, so their output always carries some margin of error. Eliminating hallucinations is like trying to remove hydrogen from water.

📊 **Bedrock customer growth:** AWS Bedrock's customer count grew 4.7x over the past year to tens of thousands of customers, a sign of growing demand for generative AI and of AWS's technical strength and market influence in the space.

Amazon Web Services (AWS), Amazon’s cloud computing division, is launching a new tool to combat hallucinations — that is, scenarios where an AI model behaves unreliably.

Announced at AWS’ re:Invent 2024 conference in Las Vegas, the service, Automated Reasoning checks, validates a model’s responses by cross-referencing customer-supplied info for accuracy. AWS claims in a press release that Automated Reasoning checks is the “first” and “only” safeguard for hallucinations.

But that’s, well… putting it generously.

Automated Reasoning checks is nearly identical to the Correction feature Microsoft rolled out this summer, which also flags AI-generated text that might be factually wrong. Google also offers a tool in Vertex AI, its AI development platform, to let customers “ground” models by using data from third-party providers, their own data sets, or Google Search.

In any case, Automated Reasoning checks, which is available through AWS’ Bedrock model hosting service (specifically the Guardrails tool), attempts to figure out how a model arrived at an answer — and discern whether the answer is correct. Customers upload info to establish a ground truth of sorts, and Automated Reasoning checks creates rules that can then be refined and applied to a model.

As a model generates responses, Automated Reasoning checks verifies them, and, in the event of a probable hallucination, draws on the ground truth for the right answer. It presents this answer alongside the likely mistruth so customers can see how far off-base the model might’ve been.
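AWS hasn't published the mechanics behind the feature, so the flow can only be sketched from the description above: customer documents establish a ground truth, each model response is checked against it, and a correction is shown next to a probable mistruth. The data structures and function below are hypothetical illustrations, not the Bedrock Guardrails API.

```python
# Hypothetical sketch of the workflow described above; NOT the Bedrock API.
# A "ground truth" is a set of customer-supplied facts; each model response
# is checked against it, and a correction is surfaced when they disagree.
from dataclasses import dataclass

@dataclass
class CheckResult:
    claim: str
    verified: bool
    correction: str | None  # the ground-truth answer shown next to a likely mistruth

def check_response(response_claims: dict[str, str],
                   ground_truth: dict[str, str]) -> list[CheckResult]:
    """Compare each claim extracted from a model response with the ground truth."""
    results = []
    for topic, claim in response_claims.items():
        expected = ground_truth.get(topic)
        if expected is None:
            # No rule covers this claim; it can't be verified either way.
            results.append(CheckResult(claim, verified=False, correction=None))
        elif claim.strip().lower() == expected.strip().lower():
            results.append(CheckResult(claim, verified=True, correction=None))
        else:
            # Probable hallucination: present the ground-truth answer alongside it.
            results.append(CheckResult(claim, verified=False, correction=expected))
    return results

# Example: a policy document says the refund window is 30 days,
# but the model answered "60 days".
ground_truth = {"refund_window": "30 days"}
model_claims = {"refund_window": "60 days"}
for result in check_response(model_claims, ground_truth):
    print(result)
```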

AWS says PwC is already using Automated Reasoning checks to design AI assistants for its clients. And Swami Sivasubramanian, VP of AI and data at AWS, suggested that this type of tooling is exactly what’s attracting customers to Bedrock.

“With the launch of these new capabilities,” he said in a statement, “we are innovating on behalf of customers to solve some of the top challenges that the entire industry is facing when moving generative AI applications to production.” Bedrock’s customer base grew by 4.7x in the last year to tens of thousands of customers, Sivasubramanian added.

But as one expert told me this summer, trying to eliminate hallucinations from generative AI is like trying to eliminate hydrogen from water.

AI models hallucinate because they don’t actually “know” anything. They’re statistical systems that identify patterns in a series of data, and predict which data comes next based on previously-seen examples. It follows that a model’s responses aren’t answers, then, but predictions of how questions should be answered — within a margin of error.
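That distinction is easy to see in miniature. The toy bigram model below is purely illustrative (nothing like a production LLM): it "answers" by predicting the statistically most likely next word from the text it has seen, which is exactly why a confident-looking output can still be wrong.

```python
# Toy illustration: a model "answers" by predicting the likeliest continuation
# it has seen, not by consulting facts. Purely illustrative.
from collections import Counter, defaultdict

training_text = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of spain is madrid ."
).split()

# Count which word tends to follow each word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(training_text, training_text[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the statistically most likely next word: a prediction, not a fact."""
    candidates = following.get(word)
    if not candidates:
        return "<unknown>"
    return candidates.most_common(1)[0][0]

# The model continues a prompt the way its data suggests, within a margin of error.
print(predict_next("is"))       # "paris": frequency, not knowledge
print(predict_next("germany"))  # "<unknown>": nothing to pattern-match against
```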

AWS claims that Automated Reasoning checks uses “logically accurate” and “verifiable reasoning” to arrive at its conclusions. But the company volunteered no data showing that the tool is itself reliable.

In other Bedrock news, AWS this morning announced Model Distillation, a tool to transfer the capabilities of a large model (e.g. Llama 405B) to a small model (e.g. Llama 8B) that’s cheaper and faster to run. An answer to Microsoft’s Distillation in Azure AI Foundry, Model Distillation provides a way to experiment with various models without breaking the bank, AWS says.

Image Credits: Frederic Lardinois/TechCrunch

“After the customer provides sample prompts, Amazon Bedrock will do all the work to generate responses and fine-tune the smaller model,” AWS explained in a blog post, “and it can even create more sample data, if needed, to complete the distillation process.”
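AWS doesn't detail the training recipe, but the quote maps onto the generic knowledge-distillation pattern: a large "teacher" model answers the customer's sample prompts, and a smaller "student" model is fine-tuned on those responses. The helper functions and model names below are placeholders for illustration, not Bedrock API calls.

```python
# Generic knowledge-distillation loop, sketched under the assumptions above.
# teacher_generate() and fine_tune() are placeholders standing in for whatever
# Bedrock does internally; they are not real API calls.

def teacher_generate(prompt: str) -> str:
    """Placeholder: a large teacher model (e.g. a Llama 405B-class model) answers."""
    return f"teacher answer for: {prompt}"

def fine_tune(student_model: str, examples: list[tuple[str, str]]) -> str:
    """Placeholder: fine-tune the smaller student model on (prompt, response) pairs."""
    print(f"fine-tuning {student_model} on {len(examples)} distilled examples")
    return student_model + "-distilled"

sample_prompts = [
    "Summarize this quarterly report.",
    "Draft a reply to a customer complaint about shipping delays.",
]

# 1. The teacher produces reference responses for the customer's sample prompts.
distillation_data = [(p, teacher_generate(p)) for p in sample_prompts]

# 2. More synthetic prompts could be added here to enlarge the set, mirroring
#    the "create more sample data, if needed" step in AWS's post.

# 3. The smaller, cheaper student is fine-tuned to imitate the teacher's outputs.
distilled = fine_tune("llama-8b-class-student", distillation_data)
print(distilled)
```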

But there are a few caveats.

Model Distillation only works with Bedrock-hosted models from Anthropic and Meta at present. Customers have to select a large and small model from the same model “family” — the models can’t be from different providers. And distilled models will lose some accuracy — “less than 2%,” AWS claims.

If none of that deters you, Model Distillation is now available in preview, along with Automated Reasoning checks.

Also available in preview is “multi-agent collaboration,” a new Bedrock feature that lets customers assign AI to subtasks in a larger project. A part of Bedrock Agents, AWS’ contribution to the AI agent craze, multi-agent collaboration provides tools to create and tune AI for jobs like reviewing financial records and assessing global trends.

Customers can even designate a “supervisor agent” to break up and route tasks to the AIs automatically. The supervisor can “[give] specific agents access to the information they need to complete their work,” AWS says, and “[determine] what actions can be processed in parallel and which need details from other tasks before [an] agent can move forward.”

“Once all of the specialized [AIs] complete their inputs, the supervisor agent [can pull] the information together [and] synthesize the results,” AWS wrote in the post.
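The supervisor pattern is straightforward to sketch: a coordinator routes independent subtasks to specialized workers, runs them in parallel where possible, and synthesizes the outputs. The worker "agents" below are plain functions standing in for hosted agents; none of this is the Bedrock Agents API.

```python
# Minimal supervisor/worker sketch of the multi-agent pattern described above.
# Worker "agents" are plain functions here; in Bedrock they would be hosted agents.
from concurrent.futures import ThreadPoolExecutor

def review_financial_records(context: str) -> str:
    return f"financial review of {context}"

def assess_global_trends(context: str) -> str:
    return f"trend assessment for {context}"

WORKERS = {
    "financials": review_financial_records,
    "trends": assess_global_trends,
}

def supervisor(project: str, subtasks: list[str]) -> str:
    """Route independent subtasks to workers in parallel, then synthesize the results."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(WORKERS[name], project) for name in subtasks}
        partial_results = {name: future.result() for name, future in futures.items()}
    # Synthesis step: the supervisor pulls the workers' outputs together.
    return " | ".join(f"{name}: {output}" for name, output in partial_results.items())

print(supervisor("ACME Corp Q4 analysis", ["financials", "trends"]))
```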

Sounds nifty. But as with all these features, we’ll have to see how well it works when deployed in the real world.
