Using the NORMDIST function to find probabilities

SOBEted

New Member
Joined
Mar 19, 2015
Messages
23
So, I have the per game average of four bowlers and their per game standard deviations:

John - Avg. 200; SD 15 (variance 225)
Paul - Avg. 180; SD 16 (variance 256)
George - Avg. 175; SD 18 (variance 324)
Ringo - Avg. 170; SD 18 (variance 324)

I want to know the probabilities of each one winning the match. Can I use the NORMDIST function in excel for this problem?

I can use it for two players: =1-NORMDIST(0, the mean difference between the two players averages, add the players total variance and take sqrt. to get the Stand deviation, TRUE)

I just don't know how to use this function for more than 2 players.

Thank You.
 
Last edited:

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
Interesting question. One approach is to perform a Monte Carlo simulation, as shown in the small example below. The inputs are in the blue cells, and the results of the match simulations are shown in the green cells. The simulations rely on generating a uniformly distributed random number, and then feeding that number into the NORMINV function to generate a normally distributed bowling score given the player's mean and standard deviation. This is done for each player for each match, and the maximum value is then determined and awarded the victory. The results block simply counts up the number of victories for each player and divides by the number of matches for a win percentage. RAND is a volatile function, so the sheet will regularly update each time edits are made or F9 is hit. You can click on the clipboard icon in the upper left (intersection of row and column labels) to copy this for easy pasting into cell A1 of a blank worksheet . You should consider extending the number of matches down the sheet to perform several hundred simulations to develop a better idea of the result statistics...just update the cell references to reflect the total range of the simulation table.
Cell Formulas
RangeFormula
B5:E5B5=COUNTIF($F$9:$F$28,B2)/COUNTA($F$9:$F$28)
B9:E28B9=NORMINV(RAND(),B$3,B$4)
F9:F28F9=INDEX($B$8:$E$8,MATCH(MAX(B9:E9),B9:E9,0))
 
Upvote 0
Thanks very much for your reply. The Monte carlo seems to way to go....I was just curious as to whether there was a way to do it numerically.

As an aside....you recommend running a sample of a hundred million trials which is a good idea....but how can I do that in excel where the max number of rows is slightly over a million?
 
Last edited:
Upvote 0
You're welcome. I recommend several hundred (may 300 to 500) simulations...possibly a few thousand. As the number of simulations increases, you shouldn't see as much variation in the results. I should have mentioned that this could be tweaked a bit to round the scores to integers, but then you would likely have tie-breaks to deal with.
 
Upvote 0
Here is a version with the formula dragged down to row 2050 and the user can input the number of simulations desired in cell A7. If you extend the formulas down the sheet even further, just update the range referred to in the formulas in B5:E5.
Cell Formulas
RangeFormula
B5:E5B5=COUNTIF($F$9:$F$2050,B2)/$A$7
A9:A29A9=IF(ROWS(A$9:A9)<=$A$7,ROWS(A$9:A9),"")
B9:E29B9=IF(ISNUMBER($A9),NORMINV(RAND(),B$3,B$4),"")
F9:F29F9=IFERROR(INDEX($B$8:$E$8,MATCH(MAX(B9:E9),B9:E9,0)),"")
 
Upvote 0
A quick follow-up. I was curious about how many simulation trials would be needed before the results became fairly consistent. My previous suggestion underestimated the number of trials. The bowling score distributions are fairly broad, which means that a larger number of simulation trials are necessary. I pulled the formulas down to row 50008 (to accommodate up to 50000 trials). When the user specifies 40000 to 50000 simulations the results tend to be fairly consistent with each new recalculation (forced with F9)...with John's win percentage varying by about one percentage point, perhaps a little less, and the other bowlers' win percentages varying by a few tenths of a percentage point. You could pull the formulas down even further and update the formula in B5:E5 to capture the total expanded range if you want to narrow these results even further. So in this case, better guidance would be to run several tens of thousands to perhaps a hundred thousand simulations.
MrExcel20200915.xlsx
ABCDEF
1Inputs:
2JohnPaulGeorgeRingo
3mean200180175170
4stdev15161818
5Wins68.99%14.00%10.53%6.48%
6
750000<-- Number of Monte Carlo Simulations
8MatchJohnPaulGeorgeRingoWinner
91174.5466171.9341169.788206.0567Ringo
102196.1404175.1209192.5911152.0164John
113216.6413206.0277185.1446186.9642John
SOBEted
Cell Formulas
RangeFormula
B5:E5B5=COUNTIF($F$9:$F$50008,B2)/$A$7
A9:A11A9=IF(ROWS(A$9:A9)<=$A$7,ROWS(A$9:A9),"")
B9:E11B9=IF(ISNUMBER($A9),NORMINV(RAND(),B$3,B$4),"")
F9:F11F9=IFERROR(INDEX($B$8:$E$8,MATCH(MAX(B9:E9),B9:E9,0)),"")
 
Upvote 0
Even with 2,000,000 runs the results are not stable, if I am not mistaken:
MrExcel_Using the NORMDIST function to find probabilities.xlsm
BCDE
169,20%13,96%10,38%6,46%
2JohnPaulGeorgeRingo
Tabelle1

Ringo's percentage is 6,45% or 6,46%

My code:
VBA Code:
Sub monte()
Dim i As Long, j As Long, k As Long, p As Long, n As Long
Dim d As Double, dmaxi As Double
Dim m(1 To 10) As Double 'mean
Dim s(1 To 10) As Double 'stdev
Dim w(1 To 10) As Long 'wins

Randomize
k = [A7]
j = 2
Do While Not (IsEmpty(Cells(2, j)))
    m(j - 1) = Cells(3, j).Value
    s(j - 1) = Cells(4, j).Value
    j = j + 1
Loop
j = j - 2
For i = 1 To k
    dmaxi = -1E+300
    For p = 1 To j
        d = Rnd(): If d = 0# Then d = 0.5
        d = Application.WorksheetFunction.NormInv(d, m(p), s(p))
        If d > dmaxi Then dmaxi = d: n = p
    Next p
    w(n) = w(n) + 1
Next i
For i = 1 To j
    Cells(1, i + 1) = w(i) / k
Next i
End Sub
 
Upvote 0
BTW: With 100,000,000 runs I get:
MrExcel_Using the NORMDIST function to find probabilities.xlsm
BCDE
169,23%13,94%10,38%6,46%
2JohnPaulGeorgeRingo
Tabelle1
 
Upvote 0
@Sulprobil, thanks for taking this analysis further. You're not mistaken...one of my 50,000-run simulations shown above gave (68.99%, 14.00%, 10.53%, 6.48%). I mentioned that multiple re-runs of the 50k simulation showed that the first bowler's win percentage (John) tended to have a variation no greater than one percentage point, and the others varied by no more than a few tenths of a percentage point from the values shown. Depending on needs, that may or may not have been close enough. The rate at which answers converge is dependent on the number, positions, and shapes of the probability distributions. In this case, it appears that a considerable number of trials are necessary in order to obtain better convergence. Your conversion of these simulations to code is a much better and more practical way to tackle the problem. I appreciate seeing the refined and converged answers.
 
Upvote 0

Forum statistics

Threads
1,214,918
Messages
6,122,241
Members
449,075
Latest member
staticfluids

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top