Another update on non-transitive dice

I’ve mentioned before that mathematician Tim Gowers had run a ‘polymath’ (massively collaborative maths research) project on non-transitive dice. There’s an arXiv preprint. There’s also a detailed write-up in Quanta, which is a magazine devoted to popular explanations of maths.

As I’ve said before, this is statistically interesting (as well as being just interesting) because any instance of non-transitive dice is also an instance of a non-transitive Wilcoxon/Mann-Whitney test. So what do we now know about the Wilcoxon test?
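To make the connection concrete, here is a minimal sketch in Python. The three dice are a standard textbook non-transitive set, not taken from the polymath paper, and `win_prob` is a name made up for this illustration. It computes the Mann-Whitney $$U$$ statistic divided by the number of pairs, which is the probability that one die beats the other (counting ties as half).

```python
from itertools import product

def win_prob(x, y):
    """Estimate P(X > Y) + 0.5 * P(X == Y) over all pairs of faces.

    This is the Mann-Whitney U statistic for x divided by len(x) * len(y).
    """
    wins = sum((a > b) + 0.5 * (a == b) for a, b in product(x, y))
    return wins / (len(x) * len(y))

# A classic non-transitive set (an illustrative assumption, not from the paper).
A = [2, 2, 4, 4, 9, 9]
B = [1, 1, 6, 6, 8, 8]
C = [3, 3, 5, 5, 7, 7]

for name, (x, y) in {"A vs B": (A, B), "B vs C": (B, C), "C vs A": (C, A)}.items():
    print(name, win_prob(x, y))
```

Each comparison comes out at 20/36, so $$A$$ beats $$B$$, $$B$$ beats $$C$$, and $$C$$ beats $$A$$. All three dice have mean 5, so a comparison of means sees no difference at all.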

The mathematicians looked at dice with $$n$$ faces whose values were sampled (with replacement) from $$1,\dots,n$$. That is, they looked at a specific class of roughly uniform distributions. For these distributions, there were basically two cases:

• if the means of three distributions were different, then the dice/Wilcoxon tests were ordered the same way as the means (i.e., in agreement with the $$t$$-test), with high probability
• if the means were all the same, there was almost as much non-transitivity as possible: $$A$$ beats $$B$$ and $$B$$ beats $$C$$ gave almost no information about whether $$A$$ beats $$C$$
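The two cases can be explored with a small simulation. This is a rough sketch under my own assumptions, not the polymath project’s code: the function names are made up, equal means are imposed by rejection sampling on the face total $$n(n+1)/2$$, and "beats" means winning strictly more face pairs than losing.

```python
import random
from itertools import product

def random_die(n, rng, total=None):
    """Sample n faces with replacement from 1..n.

    If total is given, resample until the faces sum to it (rejection sampling).
    """
    while True:
        faces = [rng.randint(1, n) for _ in range(n)]
        if total is None or sum(faces) == total:
            return faces

def beats(x, y):
    """True if x wins strictly more face pairs against y than it loses."""
    score = sum((a > b) - (a < b) for a, b in product(x, y))
    return score > 0

def transitive_fraction(n, trials, rng, equal_means=True):
    """Among triples where A beats B and B beats C, the fraction where A beats C."""
    total = n * (n + 1) // 2 if equal_means else None
    chains = closed = 0
    for _ in range(trials):
        a, b, c = (random_die(n, rng, total) for _ in range(3))
        if beats(a, b) and beats(b, c):
            chains += 1
            closed += beats(a, c)
    return closed / chains if chains else float("nan")

rng = random.Random(0)
print("equal means:       ", transitive_fraction(10, 2000, rng, equal_means=True))
print("unconstrained means:", transitive_fraction(10, 2000, rng, equal_means=False))
```

If the results described above carry over to small $$n$$, one would expect the equal-means fraction to sit not far from 1/2 and the unconstrained fraction to be noticeably higher. The simulation is only a sanity check, and rejection sampling gets slow as $$n$$ grows.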

The shape of the distributions is relevant because the distribution of ranks is uniform: exactly, for a single sample, and approximately, for a set of samples from the same distribution. The Wilcoxon test is usually described as a comparison of mean ranks. Another way of phrasing that statement is to say that the Wilcoxon test is a test of the mean if the data have the sort of roughly uniform distribution that ranks do under the null hypothesis that all the distributions are nearly the same.
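The "comparison of the mean rank" description can be checked directly. With samples of sizes $$n_x$$ and $$n_y$$, pooled size $$N=n_x+n_y$$, and $$p$$ the probability that an $$x$$ beats a $$y$$ (ties counting half), the difference in mean pooled ranks is $$N(p-1/2)$$. The sketch below uses hypothetical helper names and pure Python rather than a statistics library, and verifies the identity on a small example.

```python
from itertools import product

def midranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mean_rank_gap(x, y):
    """Mean pooled rank of x minus mean pooled rank of y."""
    ranks = midranks(list(x) + list(y))
    rx, ry = ranks[:len(x)], ranks[len(x):]
    return sum(rx) / len(x) - sum(ry) / len(y)

def win_prob(x, y):
    """P(X > Y) + 0.5 * P(X == Y) over all pairs, i.e. U / (n_x * n_y)."""
    wins = sum((a > b) + 0.5 * (a == b) for a, b in product(x, y))
    return wins / (len(x) * len(y))

x = [2, 2, 4, 4, 9, 9]
y = [1, 1, 6, 6, 8, 8]
N = len(x) + len(y)
print(mean_rank_gap(x, y), N * (win_prob(x, y) - 0.5))
```

The two printed numbers agree, so the sign of the mean-rank difference is the sign of $$p-1/2$$: ranking samples by mean pooled rank and asking which die wins are the same question.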

The disadvantage of this formulation is that it’s less precise; the advantage is that it is in terms of single-sample summary statistics rather than summaries of the combined samples.