Basic D&D Statistics: Dice Rolling with Advantage

D&D Statistics Series: Parts 1, 2

In the previous part of this series, we explored how to build the probability distributions for sums of dice rolls. That is sufficient for a large fraction of dice rolling in D&D, but a critical extra case to consider is rolls made with advantage or disadvantage. In this article, we investigate how to extend our definitions to include die rolls with (dis)advantage.


The probability distribution of throwing a single die is well known and easy to write down: for a fair die, every face is equally likely. As we saw in the previous article, we can then write down the probability distribution for multiple die rolls summed together using convolutions, where the base case was always the simple uniform distribution of a single die.

Dungeons and Dragons doesn’t always use fair dice, though. There are situations or abilities where you are given better-than-normal chances by taking the better of two rolls, and similarly times when you are penalized and have to take the worse of two rolls. Only a single value is still taken in these cases, so the uniform base case is replaced by a new probability distribution that describes the better/worse of two rolls.

In this article, we will derive the probability distributions for these biased rolls — typically called rolling with advantage and disadvantage.

Rolling with advantage

Specifically, we define rolling with advantage (denoted \(M\dd N\adv\) for \(M\) advantage rolls of an \(N\)-sided die) to be the result of taking the maximum of two dice rolls: \begin{align} \Prob{r \in \dd N\adv} \equiv \Prob{ \left\{ r = \operatorname{max}(r_1, r_2) : r_1 \in \dd N, r_2 \in \dd N \right\} } \label{eqn:advdie_statement_defn} \end{align} and likewise, \(M \dd N\dis\) will be rolling with disadvantage, wherein the minimum of two rolls is used instead.

Breaking down this definition to an explicit distribution requires precisely defining and interpreting the statement. Let us define the problem in terms of two identifiable dice, corresponding to the two possible values \(r_1\) and \(r_2\), respectively. In the case of advantage rolls, the maximum over two possibilities can be expanded to two distinct cases: \begin{align} r = \operatorname{max}(r_1, r_2) = \begin{cases} r_1 & \text{if } r_2 \le r_1 \\ r_2 & \text{otherwise} \end{cases} \label{eqn:max_verbose} \end{align} Thinking in terms of these cases, we can start building up the probability statement.

First, let us work with the first case where \(r_1 \ge r_2\). We want to write the probability in terms of the result \(r\), so the first requirement is that \(r = r_1\), which occurs with probability \begin{align*} \Prob{r = r_1 \in \dd N} &= \frac{1}{N} \quad\text{assuming } 1 \le r \le N \end{align*} The first case then requires that \(r_2 \le r_1\), which occurs only if \(r_2\) takes one of the values \(\{1, 2, \dots, r_1\}\). The probability of \(r_2\) being equal to any given value is again \(1/N\), but now there are \(r_1 = r\) options; therefore \begin{align*} \Prob{r_2 \le r \in \dd N} &= \sum_{i = 1}^r \Prob{r_2 = i \in \dd N} = \sum_{i = 1}^r \frac{1}{N} \\ {} &= \frac{r}{N} \quad\text{assuming } 1 \le r \le N \end{align*} Together, these two statements (which multiply because the two dice are independent) allow us to write that the probability of rolling two dice and finding that the first die is at least as large as the second (\(r_2 \le r_1\)) and equal to our target value (\(r_1 = r\)) is \begin{align} \Prob{r = r_1 \in \dd N} \, \Prob{r_2 \le r_1 \in \dd N} &= \frac{r}{N^2} \label{eqn:advdie_max1} \end{align}

By symmetry, we can easily write down the probability for the second case in the maximum where \(r_1 \le r_2\) by just swapping identification of \(r_1 \leftrightarrow r_2\) in Eqn. \(\ref{eqn:advdie_max1}\): \begin{align} \Prob{r = r_2 \in \dd N} \, \Prob{r_1 \le r_2 \in \dd N} &= \frac{r}{N^2} \label{eqn:advdie_max2} \end{align}

The sum of Eqns. \(\ref{eqn:advdie_max1}\) and \(\ref{eqn:advdie_max2}\) is almost the answer. The subtle issue is that in Eqn. \(\ref{eqn:max_verbose}\) we wrote the conditions as “\(r_2 \le r_1\)” and “otherwise”, which implies \(r_1 < r_2\). After the symmetric exchange of labels, though, we have “\(\le\)” conditions in both probability statements rather than one “\(\le\)” condition and one strict “\(<\)” condition. This means that we’ve double-counted the case where \(r_1 = r_2\), and therefore the sum overestimates the probability. The solution is simple, though: we subtract a correction term for the double-counted tie, whose probability we know is \begin{align*} \Prob{r_1 = r_2 = r \in \dd N} = \Prob{r_1 = r \in \dd N} \, \Prob{r_2 = r \in \dd N} = \frac{1}{N^2} \end{align*}
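The double-counting of the tie can be seen concretely by listing the pairs for a small example. The following is only an illustrative sketch; the choice of a six-sided die and a target of \(r = 4\) is arbitrary, and the variable names are not taken from DiceRolling.jl.

```julia
N, r = 6, 4   # illustrative choice: a d6 and a target result of 4

# Case 1: the first die shows r and the second is no larger.
case1 = [(r, r2) for r2 in 1:N if r2 <= r]
# Case 2: the same condition with the labels swapped.
case2 = [(r1, r) for r1 in 1:N if r1 <= r]

# The tie (r, r) satisfies both conditions, so it appears in both lists.
overlap = intersect(case1, case2)

# Direct enumeration of every ordered pair whose maximum is r.
direct = [(a, b) for a in 1:N for b in 1:N if max(a, b) == r]

length(case1) + length(case2)                     # 8: the tie is counted twice
length(case1) + length(case2) - length(overlap)   # 7: agrees with length(direct)
```

Each ordered pair has probability \(1/N^2\), so the 7 distinct pairs here correspond to a probability of \(7/36\) for an advantaged d6 to show a 4.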

Putting all of the pieces together: because our choice of \(r_1\) vs \(r_2\) is arbitrary, we can combine the two separate conditions into a single statement with double the probability and then subtract off the correction term for double-counting. The probability of rolling a die with advantage is therefore given by: \begin{align} \Prob{r \in \dd N\adv} &\equiv 2 \, \Prob{r_1 = r \in\dd N} \, \Prob{r_2 \le r_1 \in\dd N} - \Prob{r_1 = r_2 = r \in\dd N} \label{eqn:dieadv_defn} \end{align} or expanded explicitly into functional form, \begin{align} \Prob{r \in \dd N\adv} &= \begin{cases} \frac{2r - 1}{N^2} & 1 \le r \le N \\ 0 & \text{otherwise} \end{cases} \label{eqn:dieadv_function} \end{align} with corresponding definition as a Julia function:

using OffsetArrays

# Probability of each result 1:N when rolling a dN with advantage
function prob_dNadv(N)
    return OffsetVector([(2i-1)/N^2 for i in 1:N], 1:N)
end
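Before turning to simulation, a couple of exact sanity checks are possible using rational arithmetic. This is a minimal sketch: `advantage_pmf` is a hypothetical name used only here, and it returns a plain `Vector` (rather than an `OffsetVector`) since the results already start at 1.

```julia
# Exact rational version of the advantage distribution.
# (`advantage_pmf` is a hypothetical helper for this sketch only.)
advantage_pmf(N) = [(2r - 1) // N^2 for r in 1:N]

p = advantage_pmf(20)

sum(p)                         # 1//1, so the distribution is normalized
sum(r * p[r] for r in 1:20)    # 553//40 = 13.825, vs 10.5 for a plain d20
p[20]                          # 39//400, vs 1//20 for a plain d20
```

In other words, advantage raises the average d20 result by a little over three points and nearly doubles the chance of rolling a 20.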

Thankfully, we can also check this result through numerical experiments. We can easily draw a hundred thousand pairs of random die throws, take the larger of each pair, and then histogram the resulting numbers to build a frequency distribution. The observed frequency distribution is expected to converge to a scaled version of the probability distribution as the number of draws approaches infinity (with some small residual random fluctuation at any finite number of draws). Doing just that:

using Plots, Random

Nrolls = 100_000
rollpairs = rand(1:20, Nrolls, 2)
advrolls  = maximum(rollpairs, dims = 2)

stephist(advrolls, bins = 0.5:20.5, normalize = :probability,
         label = "simulation")
sticks!(prob_dNadv(20), label = "d20adv theory")
xlabel!("die roll")
title!("Advantage d20 die rolling, simulation vs theory")
A comparison of the expected probability distribution for an advantaged \(\dd20\) roll (orange sticks) versus the histogram of \(100\,000\) simulated advantage rolls (blue stairs, normalized to unity). The simulation matches well with the expected probability distribution, clearly showing the same upward-stepping of the distribution toward higher values.

We see that the calculated probability distribution (the orange sticks) agrees well with the simulated distribution (blue stairs). (We will quantify what “agrees well” means in a future article.)

Rolling with disadvantage

Following a very similar set of derivations, we can also work out the probability distribution for disadvantage rolls. With a little bit of intuition, it should be no surprise that the distribution “slopes” in the opposite direction, with the greatest probability at \(1\) and the lowest at \(N\). The analytic distribution is given by \begin{align} \Prob{r \in \dd N\dis} &= \begin{cases} \frac{2(N - r) + 1}{N^2} & 1 \le r \le N \\ 0 & \text{otherwise} \end{cases} \label{eqn:diedis_function} \end{align}
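The only change from the advantage derivation is that the other die must now be at least \(r\), which happens with probability \((N - r + 1)/N\); doubling and subtracting the tie gives \([2(N - r + 1) - 1]/N^2\), which is the expression above. The sketch below checks this against a brute-force enumeration. The helper names (`adv_pmf`, `dis_pmf`, `dis_by_enumeration`) are hypothetical and are not taken from DiceRolling.jl.

```julia
# Hypothetical helpers for this sketch; plain Vectors indexed by the roll.
adv_pmf(N) = [(2r - 1) // N^2 for r in 1:N]
dis_pmf(N) = [(2(N - r) + 1) // N^2 for r in 1:N]

# Brute-force check: tally the minimum over all N^2 equally likely ordered pairs.
function dis_by_enumeration(N)
    counts = zeros(Int, N)
    for r1 in 1:N, r2 in 1:N
        counts[min(r1, r2)] += 1
    end
    return counts .// N^2
end

N = 20
dis_pmf(N) == dis_by_enumeration(N)   # true
dis_pmf(N) == reverse(adv_pmf(N))     # true: disadvantage mirrors advantage
sum(dis_pmf(N))                       # 1//1
```

The mirror relation means that the disadvantage distribution for result \(r\) equals the advantage distribution for result \(N + 1 - r\).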


As mentioned in the introduction, the advantage and disadvantage probability distributions are new base cases which can be summed with other dice rolls as discussed in the previous article. Therefore, I have updated the attached DiceRolling.jl module to include new advantage (e.g. 1d20adv) and disadvantage (1d20dis) fundamental dice which can be used in dice roll expressions.

Because we’ve already seen the 1d20adv case (which is often used in saving throws and attack rolls), let’s use a less common example to demonstrate how advantage and disadvantage bias the entire distribution over many rolls. Instead of accumulating hit points using straight rolls, we’ll consider a character whose HP is taken from all advantage rolls, comparing \(9\dd10 + 10\) vs \(9\dd10\adv + 10\) for a 10th-level paladin:

Comparison of the sum of \(9\dd10 + 10\) rolled without and with advantage. As expected, the center of the distribution shifts upward when rolling with advantage compared to the standard distribution.
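The interface of DiceRolling.jl is not shown here, so the following is an independent minimal sketch of how the two distributions could be built by repeated convolution. All names (`convolve`, `sum_of`, `mean_total`) are assumptions made for illustration.

```julia
# Base-case distributions as plain vectors: index r holds the probability of rolling r.
d10    = fill(1 / 10, 10)
d10adv = [(2r - 1) / 10^2 for r in 1:10]

# Discrete convolution: distribution of the sum of two independent rolls.
function convolve(p, q)
    out = zeros(length(p) + length(q) - 1)
    for (i, a) in enumerate(p), (j, b) in enumerate(q)
        out[i + j - 1] += a * b
    end
    return out
end

# Distribution of M summed rolls; index k corresponds to a total of k + M - 1.
sum_of(p, M) = reduce(convolve, fill(p, M))

# Mean of M summed rolls plus a constant modifier.
function mean_total(p, M, modifier)
    dist = sum_of(p, M)
    return sum((k + M - 1 + modifier) * dist[k] for k in eachindex(dist))
end

mean_total(d10, 9, 10)      # ≈ 59.5
mean_total(d10adv, 9, 10)   # ≈ 74.35
```

Under these assumptions, rolling every hit die with advantage shifts the expected total by roughly 15 hit points.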

Unlike the base-case \(1\dd10\adv\) distribution, which monotonically increases from \(1\) through \(10\), the sum over many such distributions is approximately Gaussian\(^1\) again, in accordance with the Central Limit Theorem.

  1. If you compare the distributions carefully, though, you should find that the advantage distribution is perceptibly less Gaussian than the standard (no-advantage) distribution: each term being asymmetric lowers the rate of convergence toward the Gaussian distribution.