Friday, November 18, 2016

Testing Dice for Fairness

Are Chessex dice fair; are Koplow dice fair; how to test dice.

Testing dice for fairness is tedious but simple: roll the dice a large number of times and tally how often each face comes up.

Here is an example where each die in a Chessex set was rolled 100 times:


We don't expect the numbers to be exactly the same, even if the die is fair. Any outcome is possible from a fair die, though some outcomes, such as seeing a 6 a hundred times and the other faces not at all, are vanishingly unlikely. How do we recognize implausible results from a fair die?

My time in the statistics department at Ohio State acquainted me with a test statistic which can be used to answer the question.

Let n be the number of sides the die has. Let O_i be the number of times we observe face i come up. Let E_i be the number of times we expect face i to come up, assuming the die is fair; for N rolls of a fair die, E_i = N/n for every face. The test statistic is:

$$ \chi^2 = \sum_{i=1}^n \frac{(O_i - E_i)^2}{E_i} $$

The test statistic has approximately a chi-squared distribution with n - 1 degrees of freedom. It is used to assign a p-value to the result, which is the chance that a fair die would produce results at least as extreme as what we observed. If you are interested, here is some code for making the calculation. The closer the p-value is to zero, the stronger the evidence that the die is not fair.

I took a set of Koplow dice and a set of Chessex dice and rolled each die 100 times. The only die which had a p-value less than 0.05 was the Koplow d6. However, if we compute p-values for 5 fair dice, there is a 22.6% chance (1 - 0.95^5) that at least one of the dice will have a p-value less than 0.05. To account for this, I applied the Bonferroni correction, which multiplies each p-value by the number of tests; this raised the p-value of the Koplow d6 to 0.11.
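The arithmetic behind the multiple-testing adjustment can be sketched in a few lines of Python. The function names are made up for this illustration, and the raw p-value of 0.022 is a hypothetical value chosen only because it is consistent with the adjusted 0.11 reported above; the raw value is not stated in this post.

```python
def chance_of_false_alarm(alpha, num_tests):
    """Chance that at least one of num_tests independent fair dice has p < alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests

def bonferroni(p_value, num_tests):
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * num_tests)

print(round(chance_of_false_alarm(0.05, 5), 3))  # 0.226

raw_p = 0.022  # hypothetical raw p-value (assumption, not from the post)
print(round(bonferroni(raw_p, 5), 2))            # 0.11
```

Bonferroni is conservative: it can only make a p-value larger, so a die that still looks suspicious after the adjustment deserves a second look.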

So as far as I can tell, my Koplow dice and my Chessex dice are fair. It doesn't mean yours are too, but you can test them.