A simple probability problem - Biased and Inefficient

Amy Hogan, a stats and maths teacher who blogs at A Little Stats, posted the following quiz on Twitter:

(Assuming fair dice) which has the highest probability:

1 six from 6 dice

2 sixes from 12 dice

3 sixes from 18 dice

The calculations aren’t too hard even by hand, and we have pbinom() available (if we remember to check $<$ vs $\leq$ conditions). In that sense the question is easy, but I was looking for an intuitive argument.
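As a rough sketch of the brute-force calculation, assuming the quiz means "at least $n$ sixes" (the reading the rest of this post uses), here is a Python stand-in for `pbinom()`. The helper names `binom_pmf`, `binom_cdf` and `at_least` are made up for illustration; the off-by-one in `at_least` is exactly the $<$ vs $\leq$ trap mentioned above.

```python
from math import comb

def binom_pmf(x, size, prob):
    # P[X = x] for X ~ Binomial(size, prob)
    return comb(size, x) * prob**x * (1 - prob)**(size - x)

def binom_cdf(q, size, prob):
    # P[X <= q], the same quantity R's pbinom(q, size, prob) returns
    return sum(binom_pmf(x, size, prob) for x in range(q + 1))

def at_least(n, size, prob):
    # P[X >= n] = 1 - P[X <= n - 1]; passing n instead of n - 1 would give P[X > n]
    return 1 - binom_cdf(n - 1, size, prob)

for n in (1, 2, 3):
    print(n, round(at_least(n, 6 * n, 1 / 6), 4))
# 1 0.6651
# 2 0.6187
# 3 0.5973
```

Under that reading, the first option (one six from 6 dice) has the highest probability.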

Obviously, the probability of exactly $n$ sixes from $6n$ dice is decreasing in $n$, because the distribution is becoming less discrete. On the other hand, the probability of more than $n$ sixes is increasing towards 1/2, since the distribution is becoming more symmetric. It isn’t obvious to me which one wins.
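The two competing trends can be checked numerically. This is only an illustrative sketch in Python with hypothetical helper names (`pmf`, `exactly`, `more_than`); it splits "at least $n$ sixes" into "exactly $n$" plus "more than $n$" for the first few $n$.

```python
from math import comb

def pmf(x, size, prob):
    # P[X = x] for X ~ Binomial(size, prob)
    return comb(size, x) * prob**x * (1 - prob)**(size - x)

def exactly(n):
    # P[exactly n sixes from 6n fair dice]
    return pmf(n, 6 * n, 1 / 6)

def more_than(n):
    # P[more than n sixes from 6n fair dice]
    return sum(pmf(x, 6 * n, 1 / 6) for x in range(n + 1, 6 * n + 1))

for n in range(1, 7):
    # The last column is P[at least n sixes], the sum of the other two
    print(n, round(exactly(n), 4), round(more_than(n), 4),
          round(exactly(n) + more_than(n), 4))
```

The first column of probabilities falls and the second rises, so the behaviour of their sum depends on which one moves faster.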

Although I’d never encountered this before, it turns out to be a real classic. Isaac Newton answered it for Samuel Pepys, and got the brute-force calculations right, but then came up with an incorrect heuristic argument. Stephen Stigler has a paper on it; Joe Blitzstein pointed me to it before I wasted too much time.

The neatest relevant fact is that the difference between the median and mean of a Binomial distribution is strictly less than 1, and so when the mean is an integer the two are equal. That implies the sequence $P[\mathrm{Bin}(nk, 1/k) \geq n]$ will tend to decrease with increasing $n$ for any $k$, but even that doesn’t quite prove the sequence is strictly monotone: we only know the probability is between $0.5$ and $0.5 + P[\mathrm{Bin}(nk, 1/k) = n]$. Also, there’s apparently no simple intuition behind the bound on the difference between mean and median.
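To spell out where that interval comes from, here is a short sketch, writing $X \sim \mathrm{Bin}(nk, 1/k)$ as shorthand and using only the fact that the median of $X$ is $n$. The definition of a median gives

$$
P[X \geq n] \geq \tfrac{1}{2} \quad\text{and}\quad P[X \leq n] \geq \tfrac{1}{2},
$$

and splitting off the point mass at $n$ gives

$$
P[X \geq n] = P[X = n] + \bigl(1 - P[X \leq n]\bigr) \leq P[X = n] + \tfrac{1}{2}.
$$

So each term lies in an interval whose lower end is fixed at $1/2$ and whose width $P[X = n]$ shrinks as $n$ grows. That squeezes the sequence towards $1/2$ from above, but it says nothing about the order of two consecutive terms.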

In the end, it turns out to be true that $P[\mathrm{Bin}(nk, 1/k) \geq n]$ is decreasing in $n$ for any integer $k$, but (pretty obviously) $P[\mathrm{Bin}(nk, p) \geq n]$ doesn’t have to be decreasing in $n$ for general $p$. Any valid intuition has to take advantage of $p = 1/k$. Stigler seems to think that’s an important barrier; I’m not convinced. Perhaps more off-putting, any valid intuitive argument would probably have to make it obvious that the mean and median were equal when the mean is an integer.
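A quick numerical check of both halves of that claim, as a sketch rather than a proof. The helper `tail_at_least` and the choice of $p = 1/2$ with $k = 6$ as the "general $p$" example are assumptions made for illustration.

```python
from math import comb

def tail_at_least(n, size, prob):
    # P[X >= n] for X ~ Binomial(size, prob), via 1 - P[X <= n - 1]
    below = sum(comb(size, x) * prob**x * (1 - prob)**(size - x)
                for x in range(n))
    return 1 - below

# p = 1/k: the sequence decreases in n for each small k tried here
for k in (2, 3, 6):
    print(k, [round(tail_at_least(n, n * k, 1 / k), 4) for n in range(1, 6)])

# p = 1/2 with k = 6: the mean 3n is well above n, so the sequence increases
print([round(tail_at_least(n, 6 * n, 0.5), 5) for n in range(1, 6)])
```

With $p > 1/k$ the mean $nkp$ runs ahead of $n$, so the tail probability climbs towards 1 instead of falling towards 1/2.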