I have a d20 that seems to be, well, remarkably lucky.* How can I determine whether it's really just luck, or whether the die is in fact unfairly biased?

^{*) Well, I don't, really. This is actually a spin-off from this question, which is specifically about determining whether a die is loaded. This one is intended as a more general question about how to detect any kind of bias in dice, since we apparently don't have one yet. I've posted my own answer below, but feel free to add more.}

## Best Answer

## What kinds of bias can dice have?

Lots of kinds, actually. Perhaps the most common accidentally occurring types of bias are:

- "Shaved" dice, which are not quite symmetrical, but slightly wider or narrower on one axis than on others. A shaved d6 with, say, the 1–6 axis longer than the others will roll those sides less often, making it "less swingy" than a fair d6 should be (but leaving the average roll unchanged). The name comes from cheaters actually shaving or sanding down dice to flatten them, but cheap dice may have this kind of bias simply due to being poorly made. Other similar biases due to asymmetric shape are also possible, especially in dice with many sides.

- Uneven (concave / convex) faces may be more or less likely to "stick" to the table, favoring or disfavoring the opposite side. The precise effect may depend on the table material, and on how the dice are rolled. Again, cheap plastic dice can easily have this kind of bias, e.g. due to the plastic shrinking unevenly as it cools after molding. Uneven edges can also create bias, particularly if the edge is asymmetric (i.e. sharper on one side).

- Actual "loaded" dice, i.e. dice with a center of gravity offset from their geometric center, may occur accidentally due to either bubbles trapped inside the plastic or, more commonly, simply due to the embossed numbers on the sides of the die affecting the balance. In fact, almost all dice, with the exception of high-quality casino dice deliberately balanced to avoid this kind of bias, will likely have it to some small extent.

## How do I find out whether a die is fair?

Obviously, you need to roll it. Preferably, you should do this the same way, on the same kind of table, as you'd use in a game; while truly fair dice should be fair on any surface, some types of bias may show up only on some surfaces.

Keep rolling the same die several times, and count how many times each side comes up. If you've got a friend to help you, you can have them tally up the rolls as you call them out, so you don't have to switch between rolling and marking the results all the time. Once your arm gets tired of rolling dice, switch roles.
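If you'd rather type the rolls into a computer than tally them on paper, the counting step might look something like the Python sketch below. The function name `tally_rolls` and the sample rolls are made up for illustration; any way of counting works just as well.

```python
from collections import Counter


def tally_rolls(rolls, sides):
    """Return a list of counts; index k-1 holds how often side k came up."""
    counts = Counter(rolls)
    # Catch typos such as a 21 entered for a d20.
    unexpected = set(counts) - set(range(1, sides + 1))
    if unexpected:
        raise ValueError(f"rolls outside 1..{sides}: {sorted(unexpected)}")
    return [counts.get(k, 0) for k in range(1, sides + 1)]


# Twelve recorded rolls of a d6 (invented data, far too few for a real test).
example_rolls = [3, 6, 1, 6, 2, 5, 6, 4, 3, 6, 1, 2]
print(tally_rolls(example_rolls, sides=6))  # [2, 2, 2, 1, 1, 4]
```

Sides that never came up get an explicit count of zero, which matters for the math later on.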

## How many times do you need to roll?

For the type of statistical test described below (Pearson's \$\chi^2\$ test), a common rule of thumb is to have at least five times as many rolls as there are sides on the die. Thus, for a d20, you need at least 100 rolls for the test to be valid. (There are other statistical tests that can be used with fewer rolls, but they require slightly more complicated math.) Obviously, more rolls won't hurt if you have the patience for it, and the more rolls you tally up, the better the test will detect subtle biases.

(Note: If you've, say, bought a large bunch of cheap d6's for rolling large dice pools, it can be OK to just roll them all together and tally up the number of times each face comes up. Sure, this way you won't detect if one of the dice is, say, slightly more likely to roll a 6, while another one is slightly less likely to roll it, but you'll still detect any systematic biases due to, say, all the dice being unsymmetrical the same way.)

## OK, I've rolled the die 100 times. Now what?

Now it's time to do some math.

First, look up the tally of how many times each side came up. Below, I'll call the number of times side 1 came up \$n_1\$, the number of times side 2 came up \$n_2\$, and so on up to \$n_{20}\$ for a d20. I'll also use \$N\$ to denote the total number of rolls, i.e. \$N = n_1 + n_2 + \dots + n_{20}\$.

Next, calculate the expected number of times each side should have come up for a fair die, i.e. the total number of rolls divided by the number of sides.^{1} (It's OK for this to be a fractional number.) Call this number \$n_{\exp}\$. For example, for \$N = 100\$ rolls of a d20, \$n_{\exp} = \frac{N}{20} = 5\$.

Now, for each side \$k\$ (from 1 to 20, for a d20), calculate the difference between the actual and the expected count of times the side came up, square it (i.e. multiply it by itself), and divide it by the expected count. That is, calculate:

$$\chi^2_k = \frac{ \left( n_k - n_{\exp} \right) ^2}{n_{\exp}}$$

for each possible number \$k\$ of your die (i.e. from \$k = 1\$ to \$k = 20\$, for a d20).^{2}

Finally, add up all the results from the previous step to obtain the test statistic

$$\chi^2 = \chi^2_1 + \chi^2_2 + \dots + \chi^2_{20} = \sum_{k=1}^{20} \frac{ \left( n_k - n_{\exp} \right) ^2}{n_{\exp}}.$$
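If you'd rather let a computer do the arithmetic, the steps above can be sketched in a few lines of Python. The function name `chi_squared_statistic` and the tally are my own invention for illustration (the counts are made-up numbers, not real rolls); the formula is the one given above.

```python
def chi_squared_statistic(counts):
    """Pearson's chi-squared statistic for a die assumed to be fair.

    counts[k-1] is n_k, the number of times side k came up.
    """
    sides = len(counts)
    total = sum(counts)        # N, the total number of rolls
    expected = total / sides   # n_exp, may be fractional
    return sum((n - expected) ** 2 / expected for n in counts)


# Hypothetical tally of 100 rolls of a d20 (sides 1 to 20, in order).
tally = [3, 5, 4, 6, 5, 7, 4, 5, 6, 3, 5, 4, 6, 5, 4, 5, 6, 5, 4, 8]
print(sum(tally))                              # 100
print(round(chi_squared_statistic(tally), 3))  # 6.0
```

Here the squared differences add up to 30, and dividing by \$n_{\exp} = 5\$ gives \$\chi^2 = 6\$.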

## OK, I've got this \$\chi^2\$ figure. What do I do with it?

The \$\chi^2\$ value you've calculated is a measure of how biased the die appears to be, based on the numbers you've rolled with it. But what counts as a reasonable value of \$\chi^2\$, and where's the threshold at which you should start getting suspicious?

For that, you either need to do some more math, or, more easily, just look it up in a table.
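For the curious, here is roughly what the "more math" option looks like: a sketch that computes the probability of a fair die producing a \$\chi^2\$ value at least as large as yours, using only Python's standard library. It assumes the test has \$d - 1\$ degrees of freedom for a \$d\$-sided die, and it evaluates the tail probability with the standard power series for the incomplete gamma function. The function name `chi2_sf` is my own choice, not a library call.

```python
import math


def chi2_sf(x, df):
    """Probability that a chi-squared variable with df degrees of freedom exceeds x."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, z) = z^a e^(-z) * sum_{n>=0} z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# A d20 has 19 degrees of freedom; 30.144 is the 5% critical value.
print(round(chi2_sf(30.144, 19), 3))  # about 0.05
```

This is fine for the moderate values you get from dice tests, but because it computes one minus the lower tail, it loses precision for extremely small probabilities.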

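To use the table, you first need to know how many "degrees of freedom" our test has. That's simpler than it sounds: for a \$d\$-sided die, the test has \$\nu = d - 1\$ degrees of freedom (i.e. \$\nu = 19\$ for a d20).^{3} This will tell you which row in the table to look at.

In the table above, row 19 looks like this (only the columns quoted in this answer are shown; the header gives the probability that a fair die exceeds the value):

| \$\nu\$ | 0.10 | 0.05 | 0.01 | 0.001 |
|---|---|---|---|---|
| 19 | 27.204 | 30.144 | 36.191 | 43.820 |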

What does this mean? Well, it means that, if the die is actually fair, then \$\chi^2\$ will be less than 27.204 in 90% of all tests, less than 30.144 in 95% of all tests, and so on. Only once in a thousand tests will a fair d20 actually produce a \$\chi^2\$ value higher than 43.820.

Thus, by comparing \$\chi^2\$ to the critical values in the table, you can estimate how likely the die is to be biased.^{4} If \$\chi^2 \le 27\$, the die probably has no bias, or at least you haven't counted enough rolls to detect it; around \$\chi^2 \ge 30\$ or so, you might want to be concerned, and maybe set the die aside for further testing; if \$\chi^2 \ge 40\$, you can declare the die biased with pretty high confidence.

Note that the chi-squared test does not say anything about how the die is biased: a die that, say, rolls 10 more often and 11 less often than it should is just as likely to fail the test as one that rolls 20 more often and 1 less often. Of course, if the chi-squared test does detect bias, you can just look at the tally counts yourself to see which ones occur more often than you'd expect.

P.S. For convenience, here are the table rows for a few other commonly used types of dice:

^{5}

Footnotes:

^{1) For an ordinary fair die, the expected number of times each side comes up is obviously the same, but we could use the chi-squared test also for dice that we don't expect to roll each number equally often (like, say, dice where the same number appears several times). In that case, we'd just have a different \$n_{\exp}\$ for each possible roll of the die.}

^{2) I'm not aware of a conventional symbol for these intermediate values, but \$\chi^2_k\$ seems like a reasonable choice, given both that they add up to the test statistic \$\chi^2\$, and that each of them is the square of an (approximately) normally distributed random variable, and thus is itself \$\chi^2\$-distributed. Your favorite statistics text, if it bothers to give them a symbol at all, may use something else.}

^{3) The number of degrees of freedom is essentially the number of values in our measurements that can vary independently. Here, we're measuring 20 values, \$n_1\$ to \$n_{20}\$, but they're not quite independent: we know that \$n_1 + n_2 + \dots + n_{20} = N\$, so once we know 19 of the values, we can calculate the last one based on the other 19. Hence, 19 degrees of freedom.}

^{4) Note that the numbers in the table header give the probability that a perfectly fair die will produce a \$\chi^2\$ value higher than the critical value in that column. This is not the same as the probability that a die with \$\chi^2\$ less than the critical value is fair, or that a die with \$\chi^2\$ higher than the critical value is biased; to calculate those probabilities, you'd first have to know the a priori frequency of bias among your dice. Indeed, in some sense, these questions are not even meaningful to ask: truly fair dice only exist in the platonic realm of ideas, and every real die almost certainly has some bias, if you measure it carefully enough. Thus, in a sense, any claim that a given die is fair is false; all we can really say is that it's close enough to fair that we can't tell the difference.}

^{5) A "d2" is, of course, a coin. Use the "d3" column (\$\nu = 2\$) e.g. for Fudge dice.}

Addendum:

So, just how many rolls do we need to actually detect biased dice? Well, I did some quick simulation tests, using an extremely biased virtual d20 that never rolls a 1, and rolls 20 twice as often as it should. Using the different \$\chi^2\$ thresholds given in the table above, and various numbers of test rolls, from the minimum of 100 up to 400, here's the fraction of runs on which the \$\chi^2\$ value exceeded the threshold:

In each case, the probability of falsely detecting bias in a fair die is essentially independent of the number of rolls; this is a deliberate feature of the \$\chi^2\$ test. The probability of correctly detecting the biased die, however, increases significantly with more rolls.

From the table above, we can see that 100 rolls (the minimum number for the \$\chi^2\$ test to even be valid) is way too little to detect even such an egregious bias: even if we set the \$\chi^2\$ threshold so low that we end up rejecting 10% of all fair dice, we still catch only about 50% of the biased ones, and it only gets worse as we increase the threshold.
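If you want to try this kind of experiment yourself, a simulation along these lines can be sketched in Python as below. This is not the code I used for the numbers above, just a minimal reconstruction under the stated assumptions: the biased die never rolls 1 and rolls 20 twice as often as a fair die would, and the threshold 27.204 is the 10% critical value for 19 degrees of freedom. The function names, seed, and run count are arbitrary choices.

```python
import random


def chi_squared_statistic(counts):
    """Pearson's chi-squared statistic against a fair die."""
    expected = sum(counts) / len(counts)
    return sum((n - expected) ** 2 / expected for n in counts)


def simulate_detection_rate(weights, rolls, threshold, runs, rng):
    """Fraction of simulated test runs whose chi-squared value exceeds threshold."""
    sides = len(weights)
    detected = 0
    for _ in range(runs):
        counts = [0] * sides
        # Draw all rolls for this run; index 0 stands for side 1, and so on.
        for face in rng.choices(range(sides), weights=weights, k=rolls):
            counts[face] += 1
        if chi_squared_statistic(counts) > threshold:
            detected += 1
    return detected / runs


fair_d20 = [1] * 20
biased_d20 = [0] + [1] * 18 + [2]   # never 1, and 20 twice as often

rng = random.Random(12345)          # fixed seed so reruns are repeatable
for name, die in [("fair", fair_d20), ("biased", biased_d20)]:
    rate = simulate_detection_rate(die, rolls=100, threshold=27.204, runs=1000, rng=rng)
    print(f"{name} d20 flagged in {rate:.1%} of runs")
```

The exact percentages wobble from run to run, but with 100 rolls you should see the fair die flagged roughly one time in ten and the biased die about half the time. Raising `rolls` to 400 shows how much the detection rate improves.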

On the other hand, with 400 rolls, things look a lot better: setting the threshold at \$\chi^2 \le 36.191\$, 99% of all fair dice will pass this test, while about 98.5% of all the biased dice in this test will fail it. (Of course, we're still talking about very strongly biased dice here; more subtle bias will be harder to detect.)

OK, but surely a die that never rolls 1 should be easy to spot? After all, with a fair d20, the probability of rolling 100 times and never seeing a 1 is only \$\left(\frac{19}{20}\right)^{100} \approx 0.006\$. Shouldn't that be plenty of reason to consider the die biased? What gives?

Well, one reason why the \$\chi^2\$ test seems so ineffective here is that it's looking for any kind of bias. Sure, if we rolled a d20 a hundred times, and never saw a 1, we might be justifiably suspicious. But what if we never saw a 7, or a 15, or any of the other possible rolls? Would those also be reason to call the die biased?

Well, it turns out that, even though the probability of never rolling a 1 in 100 rolls on a d20 is only about 0.6%, the probability of never rolling some number is about 20 times that, or about 12%. So if we rejected all 20-sided dice that never rolled some number in 100 rolls, we'd end up rejecting about 12% of all fair dice, too. And, of course, there also are many other kinds of possible biases that the \$\chi^2\$ test will also detect; thus, with just 100 rolls, it's actually quite likely to detect some bias even in a d20 that's perfectly fair, and so we need to set the threshold value quite high to compensate.

If we were only interested in bias affecting the most extreme rolls (1 and 20), we could modify the \$\chi^2\$ test to e.g. lump all the rolls between 2 and 19 into a single category, with \$n_{\exp} = \frac{18 \times N}{20}\$, and use the \$\chi^2\$ threshold for two degrees of freedom (since we now have only three possible outcomes: 1, 20, or something else). Such a modified \$\chi^2\$ test is a lot better at detecting this particular form of bias, with more than half of the biased dice failing the test at the 1% false positive rate even with just 100 rolls, and over 99.99% of them failing it with 200 rolls.

Of course, the price we pay for that extra discriminatory power is that this modified test will be completely oblivious to most other kinds of bias: for example, it will happily pass a die that never rolls a 2, and that rolls 19 twice as often as it should.
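To make the modified test concrete, here's a small Python sketch of the lumped version. The function name `grouped_chi_squared` and both tallies are hypothetical, chosen to match the two kinds of bias discussed above. For two degrees of freedom the tail probability has the closed form \$e^{-\chi^2/2}\$, so the 1% threshold can be computed directly instead of looked up.

```python
import math


def grouped_chi_squared(counts):
    """Chi-squared statistic with rolls lumped into: lowest side, highest side, the rest."""
    sides = len(counts)
    total = sum(counts)
    observed = [counts[0], counts[-1], total - counts[0] - counts[-1]]
    expected = [total / sides, total / sides, total * (sides - 2) / sides]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# With 2 degrees of freedom, P(chi2 > x) = exp(-x / 2), so the 1% threshold is:
threshold = -2 * math.log(0.01)           # about 9.21

# Idealized 100-roll tallies (invented, exactly matching each bias):
never_1_double_20 = [0] + [5] * 18 + [10]
never_2_double_19 = [5, 0] + [5] * 16 + [10, 5]

for tally in (never_1_double_20, never_2_double_19):
    stat = grouped_chi_squared(tally)
    print(round(stat, 3), "biased" if stat > threshold else "passes")
```

The first tally scores 10, above the threshold of about 9.21, so the modified test flags it. The second tally scores 0 and sails through, since the lumped test never looks at sides 2 and 19 separately.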