Why Can't We Hypothesize After the Fact?

This article examines why the scientific method demands hypothesizing before testing. The conventional explanation centers on preventing overfitting, but the author argues the deeper reason is this: even conditioned on all observed data, infinitely many explanations still fit, and most of them fail to generalize. Requiring an explanation to correctly generate unseen datapoints is an extremely stringent test, one that effectively filters for explanations with real generalization ability and so brings us closer to the truth. This is not because humans are untrustworthy; it is because human cognition is limited and cannot perfectly judge the simplicity of theories, so we must borrow the universe's verdict on generalization to weigh explanations.

🧪 The heart of the scientific method, "hypothesize first, test later," is not merely a guard against overfitting; it stems from a deep appreciation of the limits of human cognition.

🧠 Even with all the data in hand, infinitely many theories can explain it. Most of them account for the known phenomena yet fail to extend to the unknown.

🔥 Testing a hypothesis, especially against data it has never seen, acts as a brutal filter. Only a tiny fraction of theories pass, and those that do tend to carry deep insight about reality.

Published on February 26, 2025 10:41 PM GMT

When you have put a lot of ideas together to make an elaborate theory, you want to make sure, when explaining what it fits, that those things it fits are not just the things that gave you the idea for the theory; but that the finished theory makes something else come out right, in addition.

—Richard Feynman, "Cargo Cult Science"

Science as Not Trusting Yourself?

The first question I had when I learned the scientific algorithm in school was:

Why do we want to first hypothesize and only then collect data to test? Surely, the other way around—having all the data at hand first—would be more helpful than having only some of the data.

Later on, the reply I would end up giving to this question was:

It's a matter of "epistemic discipline." As scientists, we don't really trust each other very much; I, and you, need your theory to pass this additional blinded check.

(And here I'd also maybe gesture at the concept of overfitting.)

I think in school we get the sense that "discipline" is of course intellectually good. And we're exposed to the standard stylized epicycles story, highlighting how silly ex post explanations can end up being.

But I'm not sure that's a head-on answer to the original question. I don't want to overfit into a wrong hypothesis, but why do I need to blind myself to data to avoid that? Why can't I just be careful and be sensitive to how silly my epicycles are, and discount accordingly?

(A little later on, I might have said something like: "AIXI by its nature will always make exactly the requisite Bayesian update over hypotheses, but we don't fully trust our own code and hardware to not fool ourselves." But that's kind of the same thing: Bayes gives you an overarching philosophy of inference, but this whole theorize-before-test thing remains seen merely as a patch for our human foibles.)

A Counting Argument About Science

Here's a maybe more satisfying characterization of the same thing, cutting more centrally at the Bayes structure connecting your brain to its complement. (To be perfectly clear, this isn't a novel insight but just a restatement of the above in new language.)

Let an explanation be a mathematical object: a generator that spits out observations into a brain. Then, even when you condition on all your observations so far, most surviving generators still don't resemble reality. Like, if you ask for the degree-n polynomials that pass through n given points, there are infinitely many of them: n constraints against n+1 coefficients leaves a free parameter. Even if you insist that the polynomials don't wiggle "excessively," there are still infinitely many of them. So you take reasonable care about excessive wiggling, and you still end up choosing a polynomial that doesn't generalize. Almost all your options don't generalize.
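
To see the counting concretely, here's a minimal sketch in Python (the ground truth y = 2x + 1, the training points, and the probe point x = 3 are all assumptions for illustration). Three observed points pin down only three of a cubic's four coefficients, so an entire one-parameter family of cubics fits the data exactly, and its members disagree arbitrarily badly one step outside the data:

```python
import numpy as np

# Three observed points from an assumed ground truth y = 2x + 1.
xs = np.array([0.0, 1.0, 2.0])
ys = 2 * xs + 1

# Every polynomial p_t(x) = L(x) + t * q(x) passes through all three points:
# L interpolates the data, and q vanishes at every training x.
# One free real parameter t means infinitely many perfect fits.
L = np.polynomial.Polynomial.fit(xs, ys, deg=2)

def q(x):
    return (x - 0.0) * (x - 1.0) * (x - 2.0)  # zero on the training set

x_new = 3.0  # unseen point; the assumed ground truth says y = 7
for t in [0.0, 0.5, -2.0, 10.0, 1000.0]:
    pred = L(x_new) + t * q(x_new)
    print(f"t = {t:>8}: zero training error, predicts {pred:9.1f} at x = 3")
```

Every value of t achieves zero training error; only t near 0 generalizes, and nothing in the seen data distinguishes it.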

In contrast, it's super strenuous to require your generator to correctly generate a datapoint it was never shown. Not strenuous in a merely human sense: mathematically, almost all of the old generators fail this test and vanish. The test is so intense that it has burned some generalization ability into the survivors: every surviving generator is either thermodynamically lucky or is onto something about the ground-truth generator.
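
A quick Monte Carlo, continuing the assumed polynomial setup above (the sampling range and the 0.1 tolerance are likewise arbitrary choices for illustration), shows how violent that filter is. Sample a large crowd of generators, every one of them fitting the seen data perfectly, and demand they also hit one held-out point:

```python
import numpy as np

rng = np.random.default_rng(0)

# Same assumed setup: three seen points from y = 2x + 1.
xs = np.array([0.0, 1.0, 2.0])
ys = 2 * xs + 1
L = np.polynomial.Polynomial.fit(xs, ys, deg=2)

def q(x):
    return (x - 0.0) * (x - 1.0) * (x - 2.0)

x_held, y_held = 3.0, 7.0  # the unseen datapoint acting as the test

# 100,000 generators, each fitting the seen data exactly...
ts = rng.uniform(-100, 100, size=100_000)
preds = L(x_held) + ts * q(x_held)

# ...and almost all of them vanish under the held-out check.
survivors = ts[np.abs(preds - y_held) < 0.1]
print(f"{survivors.size} of {ts.size} generators survive")
print(f"surviving t values sit near 0, i.e. near the true line: {np.round(survivors[:5], 4)}")
```

On the order of one in ten thousand survive, and the survivors cluster around the generator that actually matches the ground truth.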

It's not that humans are particularly epistemically corrupt and are never to be trusted until proven otherwise. It's that humans aren't epistemically perfect. If we were all perfect judges of theoretical simplicity, we could do as AIXI does and update after-the-fact. But it's perfectly reasonable to let the universe judge explanations on generalization ability in place of us weighing explanations by complexity. We don't stuff the whole prior over all possible explanations into our head precisely enough to put it on a scale, and that's fine—most beings embedded inside large physical worlds don't either.
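
For contrast, here's what the perfect-judge route looks like in a toy setting: not AIXI, just a textbook Bayesian model-evidence computation standing in for "weighing explanations by complexity" (a Gaussian prior over polynomial coefficients plays the role of the complexity prior; the noise scale, prior scale, and data are all assumed for illustration). With the full prior actually on a scale, the after-the-fact update alone picks out the simple model, no held-out data required:

```python
import numpy as np

rng = np.random.default_rng(1)

# Nine noisy points from an assumed ground truth y = 2x + 1.
xs = np.linspace(0.0, 4.0, 9)
ys = 2 * xs + 1 + rng.normal(0.0, 0.1, xs.size)

sigma = 0.1   # assumed observation-noise scale
alpha = 10.0  # assumed prior scale on coefficients

# Log marginal likelihood of "y is a degree-d polynomial plus noise":
# integrating the coefficients out gives y ~ N(0, sigma^2 I + alpha^2 Phi Phi^T).
for d in range(6):
    Phi = np.vander(xs, d + 1, increasing=True)  # columns 1, x, ..., x^d
    C = sigma**2 * np.eye(xs.size) + alpha**2 * Phi @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    log_ev = -0.5 * (xs.size * np.log(2 * np.pi) + logdet + ys @ np.linalg.solve(C, ys))
    print(f"degree {d}: log evidence {log_ev:8.1f}")
```

Degree 1 wins outright: higher degrees fit the seen data at least as well, but the evidence charges them for the extra volume of hypotheses they spread over. The catch, and the point of the post, is that nobody runs this computation over the real space of explanations, so we let generalization on unseen data do the judging instead.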



