Occam’s Razor/Ockham’s Razor

Occam’s razor (also spelled Ockham’s razor) is a law of parsimony that gives precedence to simplicity: of two competing theories, the simpler explanation is to be preferred. The principle is also expressed as ‘Entities are not to be multiplied beyond necessity’.

There is no free lunch (Wolpert 1996) for Occam’s razor, so it cannot be proved from first principles. However, if, when choosing priors, one applies the ‘principle of insufficient reason’ to models (such as ax + b) rather than to functions (such as 3x + 4), then Bayesian inference inherently prefers simpler models, because the likelihood, P(data|hypothesis), of a more complex model is spread over more possible outcomes and so assigns less probability to any one of them. For example, suppose that I throw n dice and tell you that the spots sum to 3. How many dice did I throw? P(3 spots|1 die) = 1/6 = 0.1667, P(3 spots|2 dice) = 2×(1/6)×(1/6) = 0.0556, P(3 spots|3 dice) = (1/6)×(1/6)×(1/6) = 0.0046. Plugging these into Bayes’ formula shows that, even with equal priors, the simplest model (one die) is to be preferred: its posterior probability is 36/49, or about 0.73.
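The dice example can be checked numerically. The following is a minimal sketch in Python, not part of the original entry: the function names `likelihood` and `posterior` are illustrative, the dice are assumed to be fair and six-sided, and the candidate hypotheses are restricted to one, two, or three dice (four or more dice cannot sum to 3, so their likelihood is zero).

```python
from fractions import Fraction
from itertools import product


def likelihood(total, n_dice, sides=6):
    """P(spots sum to `total` | n_dice fair dice), by exhaustive enumeration."""
    outcomes = product(range(1, sides + 1), repeat=n_dice)
    hits = sum(1 for roll in outcomes if sum(roll) == total)
    return Fraction(hits, sides ** n_dice)


def posterior(total, candidates):
    """Posterior over the number of dice, assuming equal priors (an assumption)."""
    prior = Fraction(1, len(candidates))
    joint = {n: prior * likelihood(total, n) for n in candidates}
    evidence = sum(joint.values())  # P(data), the normalising constant
    return {n: p / evidence for n, p in joint.items()}


post = posterior(3, [1, 2, 3])
for n, p in post.items():
    print(f"{n} dice: likelihood={float(likelihood(3, n)):.4f}  posterior={float(p):.4f}")
```

Because the priors are equal, they cancel in Bayes’ formula and the posterior is simply the likelihood renormalised, so the ranking of the three hypotheses matches the ranking of the likelihoods given above.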

