A pragmatic story about where we get our priors
Published on January 2, 2025 10:16 AM GMT

In his 2004 essay "An Intuitive Explanation of Bayes' Theorem", Eliezer puts forth that it's not clear where priors originally come from. Here's a dialogue from that essay poking fun at the difficulty.

Q. How can I find the priors for a problem?

A. Many commonly used priors are listed in the Handbook of Chemistry and Physics.

Q. Where do priors originally come from?

A. Never ask that question.

Q. Uh huh. Then where do scientists get their priors?

A. Priors for scientific problems are established by annual vote of the AAAS. In recent years the vote has become fractious and controversial, with widespread acrimony, factional polarization, and several outright assassinations. This may be a front for infighting within the Bayes Council, or it may be that the disputants have too much spare time. No one is really sure.

Q. I see. And where does everyone else get their priors?

A. They download their priors from Kazaa.

Q. What if the priors I want aren’t available on Kazaa?

A. There’s a small, cluttered antique shop in a back alley of San Francisco’s Chinatown. Don’t ask about the bronze rat.

The problem of where priors originally come from has caused significant philosophical confusion on LessWrong, but I think it actually has a pretty clear naturalistic solution. Our brains supply the answers to questions of probability (e.g. "How likely is Donald Trump to win the 2024 presidential election?"), and our brains were shaped by natural selection. That is to say, they were shaped by a process that generates cognitive algorithms and reproduces the ones that work in practice. It wasn't like we were being evaluated on how well our brains performed in every possible universe. They just had to produce well-calibrated expectations in our universe, or more accurately, in just the parts of the universe they actually had to deal with.
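
To make "well-calibrated" a bit more concrete: a forecaster is calibrated if, among the events it assigns roughly 70% probability, roughly 70% actually happen, and likewise across the rest of the probability range. Here's a minimal sketch of that check, using simulated forecasts rather than any real data:

```python
# A toy calibration check on simulated forecasts (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
predicted = rng.uniform(0, 1, size=10_000)              # forecast probabilities
outcomes = rng.uniform(0, 1, size=10_000) < predicted   # outcomes drawn so the forecaster is calibrated

# Bucket forecasts and compare the mean prediction in each bucket
# to the frequency with which those events actually occurred.
bins = np.linspace(0, 1, 11)
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (predicted >= lo) & (predicted < hi)
    if mask.any():
        print(f"forecasts in [{lo:.1f}, {hi:.1f}): "
              f"mean prediction {predicted[mask].mean():.2f}, "
              f"observed frequency {outcomes[mask].mean():.2f}")
```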

You can build some intuition for this topic by considering large language models. Prior to undergoing reinforcement learning and becoming chatbots, LLMs are pure next-word predictors. If such an LLM has good enough training data and a good enough architecture, its outputs will tend to be pretty good as predictions of the types of things that humans actually say in real life.[1] However, it's not hard to contrive situations where an LLM base model's predictions fail dramatically.

For instance, if you type

Once upon a

into a typical base model, the vast majority of its probability mass falls on the token "time". However, you could easily continue training the model on documents that follow up "once upon a" with random noise tokens, and its predictions would completely fail on this new statistical distribution.
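
As a concrete sketch of the first half of that claim, here's roughly how you could check where a base model's next-token probability mass goes, assuming the Hugging Face `transformers` library and GPT-2 as a stand-in base model (not necessarily the model the author has in mind):

```python
# Minimal sketch: inspect the next-token distribution of a base model
# after the prompt "Once upon a". Assumes `transformers` and GPT-2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("Once upon a", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, sequence_length, vocab_size)

# Probability distribution over the next token, given the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(5)
for prob, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode([int(token_id)])!r}: {prob.item():.3f}")
# For GPT-2, ' time' carries the overwhelming majority of the probability mass.
```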

In this analogy, humans are like the language models, in the sense that we're both designed to predict that the future will broadly behave the same way as the past (in a well-defined statistical learning sense). In practice, this has historically worked out well for both humans and LLMs. Both have proven well-calibrated as models of the world, giving them advantages which have resulted in their architectures being selected for replication and further improvement by natural selection and deep learning engineers, respectively.

However, if an LLM were exposed to a malicious distributional shift, or if the laws of physics a human lived under suddenly underwent a massive apparent shift, both systems would completely stop working. It's impossible to rule this possibility out; indeed, it's impossible to even prove that it's unlikely, except by deferring to the very probabilistic systems whose calibrations would be thrown off by such a cataclysm. The best each system can do is keep working with its current learning algorithm and hope for the best.

Anyway, all of that's to say that there's a good chance that human priors don't come from any bespoke mathematical process which provably achieves relatively good results in all possible universes, however we'd want to define that. There's just some learning algorithm the brain runs which results in us being able to make natural-language statements about the probabilities of future events, statements which have turned out to be reasonably well-calibrated in practice.

That this works at all comes down to the probably-inexplicable fact that we seem to live in a universe amenable enough to induction that natural selection could find an algorithm that has worked well enough historically, as well as to whatever the implementation details of the brain's learning algorithm actually are. I doubt that it's a reflection of some deep truth about the structure of probability theory.

(I have a more negative critique of why I don't think popular theories like "our brains approximate Solomonoff induction" are very enlightening or even coherent as explanations of where priors come from or ought to come from, but that seems like a topic for another post.)

  1. ^

    This can be rigorously quantified by using the LLM's loss function; see this video series if you're ignorant and curious about what that means. A rough sketch of the computation follows below.
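
Here's a minimal sketch of that quantification, again assuming the Hugging Face `transformers` library and GPT-2 rather than any particular model the author names: the loss is the average next-token cross-entropy (negative log-likelihood), and exponentiating it gives the perplexity.

```python
# Minimal sketch of the loss the footnote refers to: the mean next-token
# cross-entropy of a base model on some text. Lower loss means the model's
# predictive distribution matched the actual continuations better.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "Once upon a time, there was a princess who lived in a castle."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy over tokens.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"cross-entropy on English text: {loss.item():.3f} nats/token")
print(f"perplexity:                    {torch.exp(loss).item():.1f}")

# For contrast, a sequence of uniformly random tokens (the kind of
# distributional shift described earlier) yields a much higher loss.
noise_ids = torch.randint(0, model.config.vocab_size, inputs["input_ids"].shape)
with torch.no_grad():
    noise_loss = model(input_ids=noise_ids, labels=noise_ids).loss
print(f"cross-entropy on random tokens: {noise_loss.item():.3f} nats/token")
```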


