[RPG] Struggling with rolling for stats probability calculation

anydice statistics

I tried to find a suitable statistics calculator, and even tried playing with anydice.com, with no success. My friends love to roll dice, but we also want a level playing field when starting a campaign, so we came up with this idea:
Each player rolls a set of starting stats, and those are the available arrays to pick from (the assumption is that everyone picks the same best array rolled by one of the players). Of course we can't use 4d6 drop lowest for this, as that would be too strong compared to the regular method.

So how can I find a method whose results are statistically as close as possible to the original 4d6 drop lowest? Can we just roll straight 3d6? If there are 4 players, each rolling 3d6 six times, and everyone picks the best array, that feels OK-ish.

I've read this article and tried to use the commands to simulate my result. When I tried it out with 3d6, it showed up to 4 times lower odds of rolling an 18 compared to 4d6 drop lowest, so if we roll four arrays, it should in theory be equal, or slightly better?

I understand that the more players there are, the better the result of this variant would be, so I'm assuming 4 players rolling.

Best Answer

4 arrays of 3D6 is slightly worse than 4D6d1

In essence this question is asking what the roll mechanic should be, such that when able to choose from a set of four arrays the expected power is approximately the same as the 4D6D1 method. This question sparked my interest so I set about writing a program to simulate it.

I chose to evaluate the quality of an array by the number of points it would cost to purchase under the point buy system (more details further down).

I'm going to put the results up front because the methodology is a little complex. Simply put, here is the probability distribution for the proposed system.

[Figure: distributions of point buy totals for the four methods]

The average point buy values of each system are:

  • Standard 4D6d1: 31.00
  • Simple 3d6: 16.00
  • Choose 4x 4d6d1: 42.00
  • Choose 4x 3d6: 27.00

So we can quickly conclude that 4 arrays of 3D6 is slightly worse than the standard 4D6d1 method, but is closer to it than either of the other methods.

Methodology

I used a script to simulate 10,000 iterations per method. However, I needed a way to objectively assess which array was the 'best' among the available options. Luckily, D&D 5e already has a method for determining the value of a set of stats: point buy.

Why point buy?

The point buy system attaches a value to each ability score. A single high score is often more valuable than a few medium scores, and the point buy costs reflect this. Normally point buy is limited to a maximum of 15 and a minimum of 8; however, since dice rolls can fall outside this range, I ignored that limit and used the point costs from this calculator for higher and lower values.

The reason this is a better metric for the quality of an array can be demonstrated with the following example from OP's comment on another answer. Take 2 arrays:

  • Array 1: 18, 18, 18, 10, 10, 10
  • Array 2: 15, 14, 14, 14, 14, 14

As experienced players we can tell that the first array is distinctly better. However, a simplistic summation of the scores gives 84 and 85 respectively. This would indicate that Array 2 is, objectively, better. This is incorrect.

Using the point buy system instead, Array 1 has a point-buy cost of 63 while Array 2 costs only 44, indicating that Array 1 is significantly better than Array 2. This aligns with our subjective view of these stats as players, and is therefore a better metric.
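
The comparison of these two arrays can be reproduced in a few lines. This is only a minimal sketch: the cost table copies the values used in the code later in this answer, and the names `POINT_BUY_COST` and `point_buy_total` are made up here for illustration.

```python
# Point cost per ability score, copied from the table used in this answer
# (extended beyond the usual 8-15 point buy range).
POINT_BUY_COST = {
    3: -9, 4: -6, 5: -4, 6: -2, 7: -1, 8: 0, 9: 1, 10: 2,
    11: 3, 12: 4, 13: 5, 14: 7, 15: 9, 16: 12, 17: 15, 18: 19,
}

def point_buy_total(scores):
    """Sum the point cost of every score in an array."""
    return sum(POINT_BUY_COST[score] for score in scores)

array_1 = [18, 18, 18, 10, 10, 10]
array_2 = [15, 14, 14, 14, 14, 14]

for name, scores in (("Array 1", array_1), ("Array 2", array_2)):
    print(name, "sum:", sum(scores), "point buy:", point_buy_total(scores))
# Array 1 sum: 84 point buy: 63
# Array 2 sum: 85 point buy: 44
```

The plain sum ranks Array 2 one point higher, while the point buy total ranks Array 1 well ahead, because the cost per point rises steeply above 13.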

Other systems (e.g. Pathfinder) use different values for point buy and would therefore give slightly different results. However, I believe that using point buy instead of total score would still identify the 'better' array more often.

Code details

To figure this out I wrote a Python program to simulate the various scenarios. First I simulate a dice roll, then the 2 attribute generation methods.

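import numpy as np

def RollD6():
    # Roll a single d6 (the upper bound of randint is exclusive)
    return np.random.randint(1, 7)

def RollNDice(n):
    # Roll n d6 and return them as a NumPy array
    return np.random.randint(1, 7, n)

def Roll3D6():
    # Sum of three d6
    return sum(RollNDice(3))

def Roll4D6D1():
    # Roll four d6 and drop the lowest one
    rolls = RollNDice(4)
    rolls = np.delete(rolls, rolls.argmin())  # delete lowest value
    return sum(rolls)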
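
As a cross-check on the two rolling mechanics, the exact distribution of a single score can be enumerated instead of simulated, since there are only 216 and 1,296 possible rolls. The sketch below is not part of the original script; `exact_distribution` is a hypothetical helper that uses only the standard library.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def exact_distribution(n_dice, keep=3):
    """Exact distribution of the sum of the `keep` highest of `n_dice` d6."""
    counts = Counter()
    for dice in product(range(1, 7), repeat=n_dice):
        counts[sum(sorted(dice)[-keep:])] += 1
    total = 6 ** n_dice
    return {score: Fraction(count, total) for score, count in sorted(counts.items())}

dist_3d6 = exact_distribution(3)
dist_4d6d1 = exact_distribution(4)

for label, dist in (("3d6", dist_3d6), ("4d6d1", dist_4d6d1)):
    mean = sum(score * p for score, p in dist.items())
    print(f"{label}: mean {float(mean):.2f}, P(18) = {dist[18]} ({float(dist[18]):.2%})")
# 3d6: mean 10.50, P(18) = 1/216 (0.46%)
# 4d6d1: mean 12.24, P(18) = 7/432 (1.62%)
```

An 18 is therefore exactly 3.5 times as likely on a single 4d6d1 roll as on a single 3d6 roll, which is in line with the "up to 4 times lower odds" noted in the question.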

Next I created functions to generate arrays of stats based on the two mechanics, and used a dictionary to look up the point-buy value of each array.

PointBuyValue = {
    3: -9,
    4: -6,
    5: -4,
    6: -2,
    7: -1,
    8: 0,
    9: 1,
    10: 2,
    11: 3,
    12: 4,
    13: 5,
    14: 7,
    15: 9,
    16: 12,
    17: 15,
    18: 19
}

def RollStats4D6D1():
    stats = []
    for i in range(6):
        stats.append(Roll4D6D1())
    return stats

def RollStats3D6():
    stats = []
    for stat in range(6):
        stats.append(Roll3D6())
    return stats

def CalculatePointBuy(stats):
    pointBuyTotal = 0
    for stat in stats:
        pointBuyTotal = pointBuyTotal + PointBuyValue[stat]
    return pointBuyTotal

I verified that my normal versions were working as expected, with 4D6d1 performing better than 3d6, then created functions to answer the question. The two functions below calculate the maximum point buy value from 4 generated sets of attributes. This assumes that all players would choose the 'optimal' array according to this metric.

def Choose4D6D1():
    # Best point-buy total among 4 arrays. max() is used rather than a running
    # maximum that starts at 0, because an array's total can be negative.
    return max(CalculatePointBuy(RollStats4D6D1()) for i in range(4))


def Choose3D6():
    # Same as above, but each array is rolled with plain 3d6.
    return max(CalculatePointBuy(RollStats3D6()) for i in range(4))

Finally I wrapped it all up, chucked it in a Jupyter notebook, and ran 10,000 iterations to generate some distributions. I used a fixed seed so others should be able to replicate my results.

import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(42)

n_sims = 10000
record_4d6d1 = []
record_3d6 = []
record_pick4d6d1 = []
record_pick3d6 = []

for runs in range(n_sims):
    record_4d6d1.append(CalculatePointBuy(RollStats4D6D1()))
    record_3d6.append(CalculatePointBuy(RollStats3D6()))
    record_pick4d6d1.append(Choose4D6D1())
    record_pick3d6.append(Choose3D6())

print ("Average Point buy (4d6d1): %.2f" % (np.sum(record_4d6d1)/n_sims))
print ("Average Point buy (3d6): %.2f" % (np.sum(record_3d6)/n_sims))
print ("Average Point buy (Choose 4d6d1): %.2f" %
        (np.sum(record_pick4d6d1)/n_sims))
print ("Average Point buy (Choose 3d6): %.2f" %
        (np.sum(record_pick3d6)/n_sims))

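# sns.distplot is deprecated in recent seaborn releases; histplot with a KDE
# overlay draws the equivalent normalised histogram.
records = {'4d6d1': record_4d6d1, '3d6': record_3d6,
           'Choose 4d6d1': record_pick4d6d1, 'Choose 3d6': record_pick3d6}
fig, ax = plt.subplots()
for label, record in records.items():
    sns.histplot(record, kde=True, stat="density", label=label, ax=ax)
ax.set_title("Histogram of %d simulated rolls" % n_sims)
ax.set_xlabel("Point Buy Total")
ax.set_ylabel("Density")  # the bars are normalised, so this is not a raw count
ax.legend()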

I've posted the full code here so that you can see that the calculations are not actually that complex. There are just a lot of steps involved.

Detailed Results

Looking at the distributions in the chart and the averages given above, we can see that the Choose 3D6 method and the standard 4D6d1 method are the closest in terms of average, with plain 3D6 being much lower and Choose 4D6d1 much higher. Therefore we can conclude that your system is a reasonable approximation and shouldn't cause any issues during play.

However, the choosing method also results in a tighter distribution, meaning the arrays from this method will vary less than those from the traditional method. The smaller standard deviation and lower average mean that you can expect slightly fewer 'great' arrays to arise from this system than from the standard one, but also that the worst arrays will not be quite so bad.
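
The spread of each method can be checked numerically by comparing means and standard deviations. The sketch below is a vectorised re-implementation rather than the script from this answer: `COST` and `simulate_costs` are names invented here, and it uses NumPy's `default_rng` with its own seed, so it will not reproduce the exact draws of the original notebook.

```python
import numpy as np

# Point costs for scores 3..18 (index 0 corresponds to a score of 3),
# taken from the table used in this answer.
COST = np.array([-9, -6, -4, -2, -1, 0, 1, 2, 3, 4, 5, 7, 9, 12, 15, 19])

def simulate_costs(rng, n_dice, n_sims, n_arrays=1):
    """Point-buy total of the best of `n_arrays` six-score arrays, per simulation."""
    dice = rng.integers(1, 7, size=(n_sims, n_arrays, 6, n_dice))
    scores = np.sort(dice, axis=-1)[..., -3:].sum(axis=-1)  # keep the three highest dice
    totals = COST[scores - 3].sum(axis=-1)                  # cost of each array
    return totals.max(axis=-1)                              # best array in each simulation

rng = np.random.default_rng(0)
standard = simulate_costs(rng, n_dice=4, n_sims=10_000)
choose_3d6 = simulate_costs(rng, n_dice=3, n_sims=10_000, n_arrays=4)

for label, data in (("4d6d1", standard), ("Choose 4x 3d6", choose_3d6)):
    print(f"{label}: mean {data.mean():.2f}, std {data.std():.2f}")
```

Taking the maximum of several draws narrows the spread, which is why the choosing method should show the smaller standard deviation here.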

So while you could over-analyse this data and fiddle with the exact mechanics to better approximate the 4D6d1 distribution, your system of choosing from 4 arrays of 3d6 will likely result in similar enough arrays that you can safely use it at the table.

Scaling to more players

Out of interest I modified my script to work for more players. When modelling eight players, the average point buy value is 31, the same as the traditional 4D6d1.

[Figure: point buy distributions when modelling eight players]
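
For other group sizes, the expected value of the best array can also be computed exactly rather than simulated. This is a sketch under the same assumptions as above (the extended point cost table, six scores per array, everyone picking the array with the highest point buy total). The function names are hypothetical and this is not the modified script mentioned above. It relies on the fact that for n independent arrays, P(best <= c) = P(one array <= c)^n.

```python
from collections import defaultdict
from itertools import product

POINT_BUY_COST = {
    3: -9, 4: -6, 5: -4, 6: -2, 7: -1, 8: 0, 9: 1, 10: 2,
    11: 3, 12: 4, 13: 5, 14: 7, 15: 9, 16: 12, 17: 15, 18: 19,
}

def stat_cost_distribution(n_dice, keep=3):
    """Exact distribution of the point cost of one score (sum of `keep` highest d6)."""
    dist = defaultdict(float)
    for dice in product(range(1, 7), repeat=n_dice):
        score = sum(sorted(dice)[-keep:])
        dist[POINT_BUY_COST[score]] += 1 / 6 ** n_dice
    return dict(dist)

def array_cost_distribution(stat_dist, n_stats=6):
    """Convolve the single-score distribution to get the total cost of a full array."""
    dist = {0: 1.0}
    for _ in range(n_stats):
        new = defaultdict(float)
        for total, p in dist.items():
            for cost, q in stat_dist.items():
                new[total + cost] += p * q
        dist = dict(new)
    return dist

def expected_best_of(array_dist, n_arrays):
    """Mean of the best of n independent arrays, via P(max <= c) = P(cost <= c) ** n."""
    mean, cdf, cdf_prev = 0.0, 0.0, 0.0
    for cost in sorted(array_dist):
        cdf += array_dist[cost]
        mean += cost * (cdf ** n_arrays - cdf_prev ** n_arrays)
        cdf_prev = cdf
    return mean

baseline = expected_best_of(array_cost_distribution(stat_cost_distribution(4)), 1)
dist_3d6 = array_cost_distribution(stat_cost_distribution(3))

print(f"4d6d1, one array: {baseline:.2f}")
for n_players in range(1, 9):
    print(f"best of {n_players} x 3d6: {expected_best_of(dist_3d6, n_players):.2f}")
```

Running this for a range of player counts shows where the best-of-n 3d6 average crosses the single-array 4d6d1 average, without any sampling noise.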