MarkTechPost@AI, October 8, 2024
From Fixed to Random Designs: Unveiling the Hidden Factor Behind Modern Machine Learning (ML) Phenomena

This article examines modern machine learning phenomena such as double descent and benign overfitting, which challenge traditional statistical intuition. The research notes that these phenomena arise not only in deep learning but also in simple models. A researcher at the University of Cambridge proposes a new perspective: the shift from fixed to random designs is the key factor. That shift alters the classical bias-variance tradeoff and yields valuable insight into learning and generalization in contemporary machine learning.

🧐 Modern machine learning phenomena challenge traditional statistical intuition: the strong performance of highly overparameterized ML models trained to zero loss contradicts conventional notions of model complexity and generalization.

📚 Various studies have tried to unravel the complexity of modern ML phenomena, including benign interpolation and double descent arising even in simple models, re-examinations of the bias-variance tradeoff, and taxonomies of interpolating models.

🔍 A University of Cambridge researcher proposes that the shift from fixed to random designs is the key factor, using k-NN estimators to show that surprising bias and variance behavior is not limited to complex modern ML methods, and that in random designs the classical intuition that "variance increases with model complexity, while bias decreases" does not necessarily hold.

💡 The researcher's analysis shows that the traditional bias-variance tradeoff intuition breaks down for out-of-sample prediction, even for simple estimators and data-generating processes, and that there are cases where both bias and variance fall as model complexity is reduced.

✨ The study finds that the shift from fixed to random designs fundamentally alters the classical bias-variance tradeoff, offering valuable insight into learning and generalization in contemporary machine learning.

Modern machine learning (ML) phenomena such as double descent and benign overfitting have challenged long-standing statistical intuitions, confounding many classically trained statisticians. These phenomena contradict fundamental principles taught in introductory data science courses, especially those concerning overfitting and the bias-variance tradeoff. The striking performance of highly overparameterized ML models trained to zero loss contradicts conventional wisdom about model complexity and generalization. This unexpected behavior raises critical questions about the continued relevance of traditional statistical concerns, and about whether recent developments in ML represent a paradigm shift or instead reveal previously overlooked approaches to learning from data.

Various researchers have attempted to unravel the complexities of modern ML phenomena. Studies have shown that benign interpolation and double descent are not limited to deep learning but also occur in simpler models like kernel methods and linear regression. Some researchers have revisited the bias-variance tradeoff, noting its absence in deep neural networks and proposing updated decompositions of prediction error. Others have developed taxonomies of interpolating models, distinguishing between benign, tempered, and catastrophic behaviors. These efforts aim to bridge the gap between classical statistical intuitions and modern ML observations, providing a more comprehensive understanding of generalization in complex models.

A researcher from the University of Cambridge has presented a note aimed at explaining the discrepancies between classical statistical intuitions and modern ML phenomena such as double descent and benign overfitting. While previous explanations have focused on the complexity of modern ML methods, overparameterization, and higher data dimensionality, this study explores a simpler yet often overlooked reason for the observed behaviors: statistics historically focused on fixed design settings and in-sample prediction error, whereas modern ML evaluates performance via generalization error on out-of-sample predictions.
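To make the distinction concrete, the two evaluation targets can be written side by side; this is standard notation rather than a quotation from the note. Given training data (x_i, y_i) generated as y_i = f(x_i) + noise:

```latex
% Fixed design: the inputs x_1, ..., x_n are treated as given, and error
% is measured at those same inputs (in-sample prediction error).
R_{\mathrm{fixed}}(\hat f) = \frac{1}{n} \sum_{i=1}^{n}
  \mathbb{E}\left[ \bigl( \hat f(x_i) - f(x_i) \bigr)^2 \right]

% Random design: a fresh input X ~ P is drawn at test time, so error is
% measured out of sample (generalization error).
R_{\mathrm{random}}(\hat f) = \mathbb{E}_{X \sim P}\,
  \mathbb{E}\left[ \bigl( \hat f(X) - f(X) \bigr)^2 \right]
```

Classical bias-variance results are statements about the first quantity; modern ML benchmarks measure the second, and the note's point is that intuitions established for one need not transfer to the other.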

The researcher explores how moving from a fixed to a random design setting affects the bias-variance tradeoff. k-nearest neighbor (k-NN) estimators serve as a simple example showing that surprising bias and variance behavior is not limited to complex modern ML methods. Moreover, in the random design setting, the classical intuition that "variance increases with model complexity, while bias decreases" does not necessarily hold, because bias no longer decreases monotonically as complexity increases. The key insight is that in a random design there is no perfect match between training points and new test points, so even models flexible enough to interpolate the training data may not achieve zero bias. This fundamental difference challenges the traditional understanding of the bias-variance tradeoff and its implications for model selection.
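A line of standard 1-NN algebra (notation mine, not quoted from the note) makes the "no perfect match" point precise. Writing x_(1) for the training input nearest to a test point x:

```latex
% 1-NN prediction and its bias at a point x, conditional on the training inputs:
\hat f(x) = y_{(1)} = f\bigl(x_{(1)}\bigr) + \varepsilon_{(1)},
\qquad
\mathrm{Bias}(x) = \mathbb{E}\bigl[\hat f(x)\bigr] - f(x)
                 = f\bigl(x_{(1)}\bigr) - f(x)

% In-sample, the test point is itself a training point, so x_{(1)} = x and
% the bias vanishes; out-of-sample, x_{(1)} differs from x in general, so
% even the most flexible k-NN model (k = 1) carries nonzero bias.
```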

The analysis shows that the traditional bias-variance tradeoff intuition breaks down for out-of-sample prediction, even for simple estimators and data-generating processes. While the classical notion that "variance increases with model complexity, and bias decreases" holds in in-sample settings, it does not necessarily apply out of sample. Moreover, there are scenarios where both bias and variance decrease as model complexity is reduced, contradicting conventional wisdom. This observation is crucial for understanding phenomena like double descent and benign overfitting. The researcher emphasizes that overparameterization and interpolation alone are not what overturns the textbook principles.
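A minimal numerical sketch of this effect can be written with k-NN directly; the setup below (a sine regression function, n = 50 training points, noise level sigma = 0.3) is my own illustrative assumption, not the paper's experiment. For k-NN with y = f(x) + noise, conditional on the training inputs, the expected prediction is the average of f over the k nearest training inputs and the variance is sigma^2/k, so squared bias and variance can be computed without simulating the noise:

```python
# Illustrative sketch (assumed setup, not the paper's experiment): compare the
# squared bias of k-NN regression in-sample (fixed design) and out-of-sample
# (random design). Conditional on the training inputs, E[f_hat(x)] is the mean
# of f over the k nearest training inputs and Var[f_hat(x)] = sigma^2 / k.
import numpy as np

rng = np.random.default_rng(0)
n, sigma, n_designs = 50, 0.3, 200
f = lambda x: np.sin(2 * np.pi * x)   # assumed ground-truth regression function

def avg_sq_bias(x_train, x_eval, k):
    """Mean squared bias of k-NN over the evaluation points, given the design."""
    d = np.abs(x_eval[:, None] - x_train[None, :])   # pairwise distances
    idx = np.argpartition(d, k - 1, axis=1)[:, :k]   # k nearest training inputs
    bias = f(x_train)[idx].mean(axis=1) - f(x_eval)  # E[f_hat(x)] - f(x)
    return (bias ** 2).mean()

x_fixed = np.linspace(0, 1, n)  # fixed design, evaluated in-sample at the inputs
for k in [1, 2, 5, 10, 25]:
    # Fixed design: each point is its own nearest neighbor, so bias^2 is exactly
    # zero at k = 1 and grows with k, while variance sigma^2/k shrinks.
    b_in = avg_sq_bias(x_fixed, x_fixed, k)
    # Random design: average over fresh designs and fresh test points; a new
    # test point never coincides with a training point, so bias^2 > 0 at k = 1.
    b_out = np.mean([
        avg_sq_bias(rng.uniform(0, 1, n), rng.uniform(0, 1, 400), k)
        for _ in range(n_designs)
    ])
    print(f"k={k:2d}  variance={sigma**2 / k:.4f}  "
          f"in-sample bias^2={b_in:.4f}  out-of-sample bias^2={b_out:.4f}")
```

In-sample, the printout follows the textbook pattern: zero bias at k = 1, growing bias and shrinking variance as k increases. Out-of-sample, bias is already nonzero at k = 1 and need not trace the textbook curve as k varies, which is the breakdown the note describes.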

In conclusion, the researcher from the University of Cambridge highlights a crucial yet often overlooked factor behind seemingly counterintuitive modern ML phenomena: the shift from evaluating model performance by in-sample prediction error to evaluating generalization to new inputs. This transition from fixed to random designs fundamentally alters the classical bias-variance tradeoff, even for simple k-NN estimators in under-parameterized regimes. The finding challenges the idea that high-dimensional data, complex ML estimators, and over-parameterization are solely responsible for these surprising behaviors, and it provides valuable insight into learning and generalization in contemporary ML landscapes.


Check out the Paper. All credit for this research goes to the researchers of this project.
