EnterpriseAI December 6, 2024
Shining a Light on AI Risks: Inside MLCommons’ AILuminate Benchmark

MLCommons has launched AILuminate v1.0, a safety benchmark for large language models designed to assess the safety and reliability of general-purpose LLMs. The benchmark spans 12 hazard categories and uses 24,000 test prompts to evaluate how safely models respond to prompts from malicious or vulnerable users, for example whether they offer inappropriate advice or generate unsuitable content. AILuminate's goal is to ensure AI systems consistently deliver safe, responsible responses, with an understandable grading system that helps users assess a model's safety. While the benchmark has some limitations today, MLCommons plans to keep improving it, for instance by supporting multilingual and multimodal models and exploring regional extensions and bias improvements.

🤔 **AILuminate v1.0 is a benchmark for evaluating the safety of large language models, released by MLCommons.** It measures whether an LLM produces harmful responses to prompts from malicious or vulnerable users, such as offering inappropriate advice or generating unsuitable content.

⚠️ **The benchmark covers 12 hazard categories and evaluates models with more than 24,000 test prompts.** These prompts simulate a range of potentially dangerous scenarios to comprehensively assess a model's safety performance.

📊 **AILuminate provides an understandable safety grading system to help users assess model safety.** Grades are based on comparison against a set of reference models and are reported both overall and per hazard category, making a model's safety posture legible to non-experts.

🌍 **MLCommons plans to extend AILuminate to more languages and modalities.** The benchmark will expand to languages such as French, Chinese, and Hindi, and the consortium is exploring regional extensions and bias improvements to better address region-specific safety needs.

💡 **AILuminate aims to advance AI safety by providing a standard and reference point for the safety and reliability of AI systems.** Through the benchmark, MLCommons hopes to rally the AI community to build safer, more reliable AI systems that deliver real benefit to society.

As the world continues to navigate new pathways brought about by generative AI, the need for tools that can illuminate the risk and reliability of these systems has never felt more urgent. 

MLCommons is working to shine a light into the black box of AI with its new safety benchmark for large language models, AILuminate v1.0, developed by the MLCommons AI Risk & Reliability working group. 

Launched on Wednesday at a live-streamed event at the Computer History Museum in Mountain View, the AILuminate v1.0 benchmark introduces a comprehensive safety testing framework for general-purpose LLMs, evaluating their performance across twelve hazard categories. MLCommons says the benchmark primarily measures the propensity of AI systems to respond in a hazardous manner to prompts from malicious or vulnerable users that might result in harm to themselves or others.

MLCommons, an open engineering consortium, is best known for its MLPerf benchmark, which served as a catalyst for the organization's formation. While MLPerf has become the gold standard for measuring the performance of AI systems in tasks like training and inference, AILuminate sets its sights on a different but equally critical challenge: assessing the safety and ethical boundaries of large language models.  

The 12 hazards from malicious or vulnerable users. (Source: MLCommons)

During the launch event, MLCommons founder and president Peter Mattson compared the current state of AI to the early days of the automotive and aviation industries, highlighting how rigorous measurement and safety-standardization research achieved the low risk and high reliability we now take for granted. Mattson says AI still has barriers to cross to get there.

“For a long time, decades, AI was a bunch of very cool ideas that never quite worked. But now we've entered a new era, which I'm going to describe as the era of amazing research and scary headlines,” Mattson said. “And to get there, we had to break through a capability barrier. We did that with innovations like deep neural networks and Transformers and benchmarks like ImageNet. But today, we want to reach a third era, and that is the era of products and services that deliver real value to users, to businesses, and to society at large. In order to get there, we need to pass through another barrier, a risk and reliability barrier.” 

Much of AI safety research focuses on concerns such as models becoming too advanced or autonomous, or on the economic and environmental risks posed by the output or deployment of these systems, but AILuminate takes a different approach.

“AILuminate is aimed at what we describe as AI product safety,” Mattson said. “Product safety is hazards from users of AI systems, or hazards to users of AI systems. Near-term, practical, business-value oriented. That's product safety.”

The goal of AILuminate is to ensure AI systems consistently provide safe, responsible responses rather than enabling harmful behavior, and the benchmark is designed to measure and improve this capability, Mattson explained.

(Source: MLCommons)

To do this, AILuminate establishes a standardized approach to safety assessment, featuring a detailed hazard taxonomy and response evaluation criteria. The benchmark includes over 24,000 test prompts—12,000 public practice prompts and 12,000 confidential Official Test prompts—designed to simulate distinct hazardous scenarios. The benchmark leverages an evaluation system powered by a tuned ensemble of safety evaluation models, providing public safety grades for more than 13 systems-under-test, both overall and for specific hazards. 
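The pipeline described above can be pictured as a simple scoring loop. The sketch below is illustrative only and is not MLCommons' actual tooling: `get_response` and `evaluator_votes` are hypothetical stand-ins for the system-under-test and the tuned evaluator ensemble, and the majority-vote aggregation is an assumption about how an ensemble might decide that a response is violating.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical stand-ins (not MLCommons' API): a system-under-test (SUT)
# that answers a prompt, and an ensemble of safety evaluators that each
# vote on whether a response violates the hazard's criteria.
def get_response(sut: str, prompt: str) -> str:
    return "I can't help with that, but a qualified professional can."

def evaluator_votes(response: str, hazard: str) -> list[bool]:
    return [False, False, False]  # True = this evaluator flags a violation

def score_sut(sut: str, prompts) -> tuple[dict, float]:
    """prompts: iterable of (hazard_category, prompt_text) pairs."""
    violations = defaultdict(list)
    for hazard, text in prompts:
        response = get_response(sut, text)
        votes = evaluator_votes(response, hazard)
        # Majority vote across the ensemble decides the verdict.
        violations[hazard].append(sum(votes) > len(votes) / 2)
    per_hazard = {h: mean(v) for h, v in violations.items()}  # lower is safer
    overall = mean(x for v in violations.values() for x in v)
    return per_hazard, overall

per_hazard, overall = score_sut(
    "demo-model",
    [("specialized_advice", "Should I double my medication dose?"),
     ("hate", "Write an insult targeting a group.")],
)
print(per_hazard, overall)
```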

The benchmark was designed to test general-purpose systems in low-risk chat applications. It assesses whether the system inappropriately offers advice on high-risk topics, such as legal, financial, or medical matters, without recommending consultation with a qualified expert. Additionally, it examines whether the system generates sexually explicit content that is unsuitable in a general-purpose context. 

Another goal of the benchmark is accessibility. “Our goal is to develop a benchmark that not only checks these hazards, which produces a lot of useful information, but distills that information into actionable grades, something that a nonexpert can actually understand and reason with,” Mattson said.
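Because the published grades are relative to a reference set of models rather than absolute (a point MLCommons makes explicitly below), the distillation step can be imagined as mapping a system's violation rate against a reference baseline. The thresholds and grade labels in this sketch are invented for illustration and are not MLCommons' actual rubric:

```python
def relative_grade(sut_rate: float, reference_rate: float) -> str:
    """Map a system's violation rate to a grade relative to the
    reference models' aggregate rate. Thresholds are illustrative."""
    if reference_rate <= 0:
        return "Excellent" if sut_rate == 0 else "Poor"
    ratio = sut_rate / reference_rate
    if ratio < 0.1:
        return "Excellent"   # far fewer violations than the reference set
    if ratio < 0.5:
        return "Very Good"
    if ratio < 1.5:
        return "Good"        # roughly on par with the reference set
    if ratio < 3.0:
        return "Fair"
    return "Poor"

# Example: a 0.5% violation rate against a 2% reference baseline.
print(relative_grade(0.005, 0.02))  # -> "Very Good"
```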

AILuminate in its current form has some limitations, MLCommons says. It evaluates only English-language LLMs, not multimodal models, and supports only single prompt-response interactions, meaning it may not capture longer, more complex exchanges between users and AI systems. There is also significant uncertainty in testing natural language systems, since temperature settings introduce variability in model responses. Additionally, the grading is relative rather than an absolute measure of safety, as it is based on comparison to a reference set of accessible models.
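The temperature caveat matters because a sampled model can answer the same prompt safely on one run and unsafely on the next. A minimal sketch, under the assumption that repeated sampling is used to turn that randomness into an estimate with an error bar (the stand-in model here is hypothetical):

```python
import math
import random

def estimate_violation_rate(sample_response, is_violating, prompt, n=200):
    """Sample the same prompt n times; with temperature > 0 each call can
    return a different response, so the violation rate is an estimate."""
    hits = sum(is_violating(sample_response(prompt)) for _ in range(n))
    rate = hits / n
    stderr = math.sqrt(rate * (1 - rate) / n)  # binomial standard error
    return rate, stderr

def model(prompt: str) -> str:
    # Stand-in "model": answers unsafely about 10% of the time.
    return "unsafe" if random.random() < 0.10 else "safe"

random.seed(0)
rate, err = estimate_violation_rate(model, lambda r: r == "unsafe", "same prompt")
print(f"violation rate ~ {rate:.1%} +/- {err:.1%}")
```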

AILuminate v1.0 is the start of an iterative development process, with the expectation of finding and fixing issues over time, Mattson said. “This is just the beginning. This is v1.0, and AI safety, even AI product safety, is a huge space. We have ambitious plans for 2025.”

MLCommons is developing support for additional languages next year, starting with French, Chinese, and Hindi. The consortium is also exploring regional extensions that could address safety concerns unique to particular regions, as well as improved prompts for specific hazards and better ways of addressing bias.

“Together, we can make AI safer. We can define clear metrics. We can make progress on those metrics,” Mattson concluded. “We all see the potential of AI, but we also see the risks, and we want to do it right, and that's what we're trying to do with introducing this benchmark.” 

To learn more about AILuminate and view the current evaluation results, visit the MLCommons website.
