Communications of the ACM - Artificial Intelligence, March 4
Explaining AI Explainability

The EU AI Act requires that the conclusions of AI products be transparent, which is driving demand for methods that explain AI decision processes. Researchers at the University of Zagreb, Croatia, have proposed a new approach to explaining the decisions of black-box models by improving Layer-wise Relevance Propagation (LRP) into Relative Absolute Magnitude Layer-Wise Relevance Propagation (absLRP). The approach combines faithfulness, robustness, localization, and contrastiveness, using the Global Attribution Evaluation (GAE) metric to measure an input's contribution to the final output as a single traceable score, thereby improving the transparency and trustworthiness of AI systems. The research matters for reducing the risk of bias and stereotyping in AI applications.

🌍 The EU AI Act requires commercial AI products to be transparent, prompting researchers to tackle AI explainability so that AI systems can be trusted.

💡 Researchers at the University of Zagreb have proposed an improved Layer-wise Relevance Propagation (LRP) method, called Relative Absolute Magnitude Layer-Wise Relevance Propagation (absLRP), which uses normalization to eliminate cancellation effects and thus trace AI decision processes more accurately.

📊 The Global Attribution Evaluation (GAE) method combines faithfulness, robustness, localization, and contrastiveness into a single traceable metric of an input's contribution to the model's output, simplifying the evaluation of AI explainability.

Explaining how artificially intelligent (AI) neural networks (NNs) arrive at their conclusions—especially deep neural networks (DNNs)—is still a scientific challenge.

Nevertheless, the European Union AI Act of 2024 now requires that commercial AI-based products’ conclusions be made “transparent”—that is, able to explain how they arrived at those conclusions.

The first challenge, then, according to experts, is explaining AI explainability, a problem that has come closer to solution thanks to a recent paper by postdoctoral researcher Davor Vukadin and colleagues at the University of Zagreb, Croatia, published in ACM Transactions on Intelligent Systems and Technology.

“Explainability plays a crucial role in ensuring that AI systems are transparent and trustworthy,” according to Gerhard Schimpf, a co-author of “The EU AI Act and the Wager on Trustworthy AI.” Schimpf, who was not involved in the work of Vukadin et al., said in a recent CACM video that the EU AI Act “has made groundbreaking regulations in areas where AI interacts with humans.”

However, the EU law requiring transparency is just a prelude to a worldwide explainability requirement for trustworthy critical AI, which is also being fiercely sought by the U.S., U.K., Japan, Canada, China, and the Group of Seven (G7) advanced economies, according to Schimpf et al.

Said Klemo Vladimir at the University of Zagreb, who was not involved in either aforementioned paper, “As AI continues to play an increasingly important role in decision-making, it is vital to ensure model transparency to reduce the risk of biases and stereotypes that may erode trust in these systems. [Vukadin et al.] introduces an innovative method for comprehensively explaining the decisions made by black-box models, which are now widely used in AI applications.”

How So?

Before the recent “explainability” requirement for commercial AI products, engineers of DNNs had been undaunted in creating ever more complex black boxes, adding millions or even billions of parameters (neurons and synapses) in ever-deeper (more layered) neural networks. The results have been astoundingly successful, achieving ever more precise outputs. Nevertheless, these larger and larger DNNs have drifted further and further from explainability, since the more middle layers are added, the more obscure their multifaceted determinations become.

That was no problem for non-critical applications such as natural language understanding, where small errors are tolerated. However, the AI community has started using DNNs for applications that demand explainability, such as explaining to someone why an AI just denied their loan application. Early efforts to explain why the inner layers of a DNN forced the final layer to a particular conclusion were difficult to automate, and thus were usually handled by human-invented heuristics similar to the reasons a human decision maker might cite. Some of these fallback explanations did not even attempt to technically solve the “black box” explainability problem, while others tried to sidestep the issue by building transparency into DNN construction with layer-wise activation tracing, representation vectors, and attention tracking, all of which slow down DNN development, reduce performance, and often fail to make the DNNs sufficiently transparent anyway, according to Vukadin.

Now, after a decade of research, progress is being made in explaining AI explainability, according to the University of Zagreb researchers.

“Our method determines the importance and polarity [for or against] each input feature in relation to a model’s output. For instance, in the context of a DNN being used for personal loan decisions, our approach can effectively highlight key input features such as credit score, income, or debt-to-income ratio, and quantify the relative influence of each feature on the model’s final decision. This allows all parties to easily inspect and understand the reasoning behind the AI system’s outputs, fostering trust and transparency,” said Vukadin.
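As a concrete illustration (not Vukadin's absLRP implementation), the short PyTorch sketch below computes a signed, per-feature attribution for a toy loan-scoring model using plain gradient-times-input; the feature names and weights are invented, but the output has the shape Vukadin describes: a sign giving polarity and a magnitude giving importance for each input feature.

    import torch

    # Toy loan model: the approval logit is a weighted sum of three invented features.
    features = {"credit_score": 0.72, "income": 0.40, "debt_to_income": 0.55}
    x = torch.tensor(list(features.values()), requires_grad=True)
    w = torch.tensor([1.5, 0.8, -2.0])   # hypothetical weights: higher debt hurts approval

    approval_logit = (w * x).sum()
    approval_logit.backward()

    # Gradient-times-input attribution: sign = polarity, magnitude = importance.
    attribution = x.grad * x
    for name, score in zip(features, attribution.tolist()):
        print(f"{name:>15}: {score:+.3f}")

Here credit_score and income receive positive attributions while debt_to_income receives a negative one; in the paper the attributions come from relevance propagation rather than raw gradients, but the resulting signed per-feature scores serve the same inspection purpose.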

Using their explanation of explainability, Vukadin et al. proposed a new and improved version of Layer-wise Relevance Propagation (LRP), previously proposed by Technische Universität Berlin professor Sebastian Bach et al. in a 2015 paper. Bach et al. used Relative Attributing Propagation (RAP) as the Layer-wise Relevance Propagation rule (to trace explainability back from outputs to inputs), which was only partially successful, according to critics at Kyungpook National University (South Korea), namely Woo-Jeoung Nam et al. in “Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks.” According to Nam, the Relative Attributing Propagation algorithm fails when a neuron receives large activation values of conflicting polarity from the previous layer (that is, one large positive value from one previous-layer neuron and one large negative value from another), because the conflicting values tend to cancel out when the explainability factor is traced backwards.
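A toy numerical example of that cancellation effect (illustrative only, not the exact RAP or LRP formulas): when one large positive and one large negative contribution arrive at the same neuron, a signed-sum denominator nearly vanishes and the redistributed relevance becomes unstable.

    # Two upstream contributions of nearly equal magnitude but opposite sign.
    pos_contrib, neg_contrib = 5.0, -4.9

    signed_denom = pos_contrib + neg_contrib   # about 0.1: the terms nearly cancel
    ratio_pos = pos_contrib / signed_denom     # about +50: the relevance share blows up
    ratio_neg = neg_contrib / signed_denom     # about -49
    print(signed_denom, ratio_pos, ratio_neg)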

In response, Vukadin proposed a new Layer-wise Relevance Propagation calculation called Relative Absolute Magnitude Layer-Wise Relevance Propagation (absLRP), which uses normalization to eliminate the cancellation effect—an idea inspired by University of Munich senior research scientist Jindong Gu et al. in 2019’s “Understanding individual decisions of CNNs via contrastive backpropagation.” As such, absLRP uses the absolute final output of each neuron on a layer as a normalizing factor—a function which Vukadin claims can be efficiently implemented in any neural network library containing an automatic differentiation function (see code examples in GitHub). With normalization implemented, the cancelling effect is avoided when tracing explainability backwards through a complex DNN, according to Gu et al.
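A minimal sketch of that normalization idea for a single fully connected layer, written in PyTorch: each contribution is weighted by its absolute magnitude relative to the sum of absolute magnitudes, so opposite-signed terms cannot cancel the denominator. The function name is invented and the rule is a simplification of the principle, not the exact absLRP rules from the paper.

    import torch

    def abs_magnitude_relevance(x, weight, relevance_out, eps=1e-9):
        """Redistribute output relevance to the inputs of one linear layer,
        normalizing by absolute contribution magnitudes (illustrative only)."""
        contrib = weight * x                               # (n_out, n_in) signed contributions
        denom = contrib.abs().sum(dim=1, keepdim=True) + eps
        share = contrib.abs() / denom                      # each row sums to ~1, never cancels
        # Keep each contribution's sign (polarity) while distributing relevance.
        return (share * contrib.sign() * relevance_out.unsqueeze(1)).sum(dim=0)

    # Reusing the +5.0 / -4.9 example: the absolute denominator is 9.9 rather than 0.1,
    # so the redistributed relevance stays bounded.
    x = torch.tensor([1.0, 1.0])
    weight = torch.tensor([[5.0, -4.9]])                   # one output neuron, two inputs
    print(abs_magnitude_relevance(x, weight, torch.tensor([1.0])))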

According to Vukadin, no other research group has combined these secret-sauce ingredients into a single metric to measure explainability that is traceable through the layers of a DNN; in other words, to explain AI explainability. Other researchers have quantified individual factors when tracing explainability, such as faithfulness, robustness, and localization, but only as separate measurements, whereas Vukadin's evaluation metric, Global Attribution Evaluation (GAE), measures the input's contribution to the final output with a single traceable score.

In more detail, the GAE method evaluates absLRP by adding one more factor to the faithfulness, robustness, and localization measured by other researchers, namely contrastiveness, inspired by Anna Arias-Duart and colleagues at Spain's Barcelona Supercomputing Center in the 2022 paper “Focus! Rating XAI Methods and Finding Biases.” First, Vukadin uses a gradient-descent method to determine faithfulness, robustness, and localization, which are together normalized to between -1 and +1. The additional calculation, contrastiveness, makes the final measurement comprehensive over all DNN layers, according to Vukadin. The GAE method then obtains its final comprehensive score by simply multiplying the normalized value by the contrastiveness value, thus taking both factors into account.
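Read literally, that aggregation step amounts to multiplying two numbers; the hedged sketch below simply encodes that reading, assuming the faithfulness, robustness, and localization component has already been folded into a single score normalized to [-1, +1]. Function and variable names are invented for illustration.

    def gae_score(normalized_frl: float, contrastiveness: float) -> float:
        """Combine a normalized faithfulness/robustness/localization score with a
        contrastiveness score into a single GAE-style value (illustrative only)."""
        assert -1.0 <= normalized_frl <= 1.0, "expected a score normalized to [-1, +1]"
        return normalized_frl * contrastiveness

    print(gae_score(0.8, 0.9))   # e.g., 0.72 for a faithful, robust, and contrastive attribution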

Visualization of explainability traced back to the input layer of a DNN classifying a bird (left) and dog (right) using the Global Attribution Evaluation (GAE) method, which measures an input’s contribution to a model’s final output with a single traceable metric.

“Our new evaluation method—GAE—provides a fresh perspective on assessing the faithfulness, robustness, localization, and contrastiveness of attribution traces, whereas previous approaches employed each metric separately without a clear methodology for combining them into a single score. GAE, on the other hand, greatly simplifies the comparison by providing a single comprehensive score,” said Vukadin.

The researchers performed comparison experiments against competing attribution methods on three widely used architectures (VGG, ResNet, and ViT-Base) and provided extensive data showing their method's superiority, as measured by GAE, on two widely used image classification datasets.

R. Colin Johnson is a Kyoto Prize Fellow who has worked as a technology journalist for two decades.
