Crossposted from my Substack.
Intuitively, simpler theories are better, all else being equal. It also seems like finding a way to justify assigning higher prior probability to simpler theories is one of the more promising approaches to the problem of induction. In some places, Solomonoff induction (SI) seems to be considered the ideal way of encoding a bias towards simplicity. (Recall: under SI, hypotheses are computable functions that spit out observations. Hypothesis h gets prior probability proportional to 2^(−K(h; L)), where K(h; L) is the hypothesis’ Kolmogorov complexity in language L.)
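As a warm-up, here is a minimal sketch of the kind of length-based prior SI assigns. Since true Kolmogorov complexity is uncomputable, the sketch uses description length in a fixed toy language as a computable stand-in; the alphabet, length cutoff, and function name are my own illustrative choices, not part of SI proper.

```python
from itertools import product

# A toy, computable stand-in for the SI prior (illustrative only: true
# Kolmogorov complexity is uncomputable). We fix a tiny "language" with a
# two-symbol alphabet and use program *length* as the complexity measure,
# giving each program weight 2^(-length).

ALPHABET = "01"
MAX_LEN = 10  # truncate the (infinite) space of programs for the demo

def length_prior(max_len=MAX_LEN):
    """Normalized prior over program strings, with weight 2^(-len(p))."""
    weights = {}
    for n in range(1, max_len + 1):
        for prog in product(ALPHABET, repeat=n):
            weights["".join(prog)] = 2.0 ** (-n)
    z = sum(weights.values())
    return {p: w / z for p, w in weights.items()}

prior = length_prior()
print(prior["0"], prior["0101010101"])  # short programs get far more mass
```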
But I find SI pretty unsatisfying on its own, and think there might be a better approach (not original to me) to getting a bias towards simpler hypotheses in a Bayesian framework.
Simplicity via hierarchical Bayes
- I’m not sure to what extent we need to directly bake in a bias towards simpler hypotheses in order to reproduce our usual inductive inferences or to capture the intuition that simpler theories tend to be better. Maybe we could at least get a long way with a hierarchically-structured prior, where:
- At the highest level, different theories T specify fundamental ontologies. For example, maybe the fundamental ontology of Ptolemaic astronomy was something like “The Earth is at the center of the universe, and all other bodies move along circles”.
- Each theory T contains many specific, disjoint hypotheses, corresponding to particular “parameter values” for the properties of the fundamental objects. For example, Ptolemaic astronomy as a high-level theory allows for many different planetary orbits.
- More complicated theories are those that contain many specific hypotheses. Complicated theories must spread out prior mass over more hypotheses, and if prior mass is spread evenly over the high-level theories, any individual hypothesis in a complicated theory will get lower prior mass than individual hypotheses contained in simpler theories. I.e.:
- Let h1, h2 be hypotheses in T1, T2 respectively.
- Suppose T1 is simpler than T2. Then, generally, we will have P(h1 | T1) > P(h2 | T2), because T2 has to spread out prior mass more thinly than T1.
- If P(T1) = P(T2), then P(h1) = P(h1 | T1)P(T1) > P(h2 | T2)P(T2) = P(h2). (The first and last equalities hold because each hypothesis is contained in exactly one theory.) The numerical sketch below makes this concrete.
- A natural worry is that it’s unclear how to individuate the high-level theories. Intuitively, this doesn’t bother me a huge amount: even if it ends up being underdetermined how to do this, my guess is that reasonable ways of individuating high-level theories will still constrain our inferences a lot. But maybe not; I haven’t thought about it much.
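To make the arithmetic above concrete, here is a minimal numerical sketch (the theory names and hypothesis counts are hypothetical): a uniform prior over two high-level theories, then a uniform prior over each theory’s specific hypotheses. Hypotheses in the smaller theory automatically end up with more prior mass.

```python
# A minimal sketch (all names and numbers are illustrative) of the
# hierarchical prior described above: uniform mass over high-level
# theories, then uniform mass over each theory's specific hypotheses.

def hypothesis_prior(num_hypotheses_per_theory):
    """Return P(h) for a single hypothesis in each theory, given a uniform
    prior over theories and over hypotheses within a theory."""
    num_theories = len(num_hypotheses_per_theory)
    p_theory = 1.0 / num_theories                  # P(T), uniform
    return {
        name: p_theory * (1.0 / n)                 # P(h) = P(h | T) P(T)
        for name, n in num_hypotheses_per_theory.items()
    }

# T1 is "simpler": it contains 10 specific hypotheses (parameter settings);
# T2 is more complicated, with 1000.
priors = hypothesis_prior({"T1": 10, "T2": 1000})
print(priors)  # {'T1': 0.05, 'T2': 0.0005}: h1 gets 100x the mass of h2
```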
Syntax vs. ontology
- SI assigns prior probabilities according to the syntax (in an arbitrary language) used to specify a theory. Setting aside the other problems for SI (e.g., see this post), I think this is pretty unsatisfactory as an attempt to capture our intuitive preference for simplicity, for a few reasons:
- First of all, I’d like to avoid just specifying by fiat that simpler hypotheses get higher prior probability and instead have this be a consequence of more solid principles. I think the principle of indifference is solid, if we can find a privileged parameterization of the hypothesis space to which we can apply the principle. The approach sketched above is attractive to me in this respect: We can try to apply a principle of indifference at the level of fundamental ontological commitments, which has the consequence that hypotheses contained in more complex theories get lower prior mass.
- So I would say: if you do want to directly assign prior probabilities to hypotheses according to their simplicity, you should start by looking at what the hypothesis actually says about the world and figuring out how to measure the simplicity of that.
- One might object that a hypothesis’ ontology is much harder to extract and formalize than its syntax. My reply: this sounds like the streetlight effect. The reason that SI has a nice formalism is that it only looks at an easily-extracted property of a hypothesis (its syntax), and doesn’t attempt to extract the thing we should directly care about: what the hypothesis actually says about the world.
- Moreover, thinking in ontological terms may help make progress on one of the (IMO) serious problems for SI: the apparently arbitrary choice of language. Perhaps, for example, we will in the end decide that the best we can do is SI using a language that makes it easy to specify a hypothesis in terms of its ontology.
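To illustrate the language-dependence worry, here is a toy example of my own construction (the “languages” and description strings are made up): the same hypothesis receives very different 2^(−length) weights depending on which language we happen to describe it in.

```python
# A toy illustration (my construction, not from the post) of the
# language-dependence of syntax-based priors: the same hypothesis gets a
# very different 2^(-description length) weight depending on which
# "language" we happen to describe it in.

def length_weight(description: str) -> float:
    """Unnormalized length-based prior weight for a description string."""
    return 2.0 ** (-len(description))

# Hypothetical descriptions of the *same* hypothesis in two languages.
# Language A has a primitive for it; language B must spell it out.
desc_in_A = "epicycles(5)"
desc_in_B = "circle(circle(circle(circle(circle(orbit)))))"

print(length_weight(desc_in_A))  # 2^-12
print(length_weight(desc_in_B))  # 2^-45: same hypothesis, far less mass
```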