cs.AI updates on arXiv.org 07月15日 12:24
An Epistemic and Aleatoric Decomposition of Arbitrariness to Constrain the Set of Good Models
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

研究指出机器学习模型对训练过程中的微小变化敏感,导致预测的不稳定性和不一致性。本文通过分解稳定性为认知和随机成分,提出了一种评估模型稳定性的方法,并提出了一个包含认知和随机标准的模型选择程序,以减少系统的不确定性。

arXiv:2302.04525v2 Announce Type: replace-cross Abstract: Recent research reveals that machine learning (ML) models are highly sensitive to minor changes in their training procedure, such as the inclusion or exclusion of a single data point, leading to conflicting predictions on individual data points; a property termed as arbitrariness or instability in ML pipelines in prior work. Drawing from the uncertainty literature, we show that stability decomposes into epistemic and aleatoric components, capturing the consistency and confidence in prediction, respectively. We use this decomposition to provide two main contributions. Our first contribution is an extensive empirical evaluation. We find that (i) epistemic instability can be reduced with more training data whereas aleatoric instability cannot; (ii) state-of-the-art ML models have aleatoric instability as high as 79% and aleatoric instability disparities among demographic groups as high as 29% in popular fairness benchmarks; and (iii) fairness pre-processing interventions generally increase aleatoric instability more than in-processing interventions, and both epistemic and aleatoric instability are highly sensitive to data-processing interventions and model architecture. Our second contribution is a practical solution to the problem of systematic arbitrariness. We propose a model selection procedure that includes epistemic and aleatoric criteria alongside existing accuracy and fairness criteria, and show that it successfully narrows down a large set of good models (50-100 on our datasets) to a handful of stable, fair and accurate ones. We built and publicly released a python library to measure epistemic and aleatoric multiplicity in any ML pipeline alongside existing confusion-matrix-based metrics, providing practitioners with a rich suite of evaluation metrics to use to define a more precise criterion during model selection.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

机器学习 模型稳定性 不确定性评估 模型选择 数据公平性
相关文章