Using the NORMDIST function to find probabilities

SOBEted

New Member
Joined
Mar 19, 2015
Messages
23
So, I have the per game average of four bowlers and their per game standard deviations:

John - Avg. 200; SD 15 (variance 225)
Paul - Avg. 180; SD 16 (variance 256)
George - Avg. 175; SD 18 (variance 324)
Ringo - Avg. 170; SD 18 (variance 324)

I want to know the probabilities of each one winning the match. Can I use the NORMDIST function in excel for this problem?

I can use it for two players: =1-NORMDIST(0, the mean difference between the two players averages, add the players total variance and take sqrt. to get the Stand deviation, TRUE)

I just don't know how to use this function for more than 2 players.

Thank You.
 
Last edited:
I ran a simulation in @Risk, and got results very close to Sulprobil's. With 500K iterations (which was the limit of my patience) I got:
John: 69.27%
Paul: 13.90%
George: 10.39%
Ringo: 6.44%

It seemed to me that we should be able to solve this analytically: the score difference between any two players (assuming the input distribution are normal) will be a normal distribution with a mean equal to the difference in the mean of the two players, and St Dev equal to the square root of the sum of the the squares of the std deviations of the two player's input distributions.

That would men the distribution of John's score - Paul's score should be normal((200 - 180), (sqrt(15^2 + 16^2)) = normal(20, 21.9). Using Normdist I calculate that the cumulative distribution for x=0 is ~18% - so John has an 82% (100%-18%) chance of beating Paul. Similarly, he has an 86% chance of beating George, and a 90% chance of beating Ringo.

For John to win, he has to beat all three, and 82% * 86% * 90% = 63% (but my simulation (and Subprobil's) came to ~69%. The other hint that I have the wrong result comes from the total probabilities of each player winning, which only come to 75%. Obviously, I'm missing something, but I'm not sure what - I don't think we have to add the probabilities of John winning with the different second, third, and fourth-place finishes...
 
Upvote 0

Excel Facts

How can you turn a range sideways?
Copy the range. Select a blank cell. Right-click, Paste Special, then choose Transpose.
Dean,
Thanks for the follow-up on this question. I too wondered about solving it analytically. I've encountered mathematically similar situations in reliability engineering where the interference between load (stress) and strength probability distributions is of interest to estimate expected reliability. What sets this problem apart is that we are interested in overlaying four distributions simultaneously (rather than two) and then determining the probability that each bowler finishes first. For this discussion, I've changed the names of the four players to 1, 2, 3, and 4. We can determine the difference between any two distributions (as though only two players were competing head-to-head), as you've shown. The problem is that the resulting difference distribution is for all instances where one specific player outscores another specific player. For example, when player 1 finishes in 1st place, he/she outscores everyone...this is the obvious finding. But there are still other cases where player 1 does not finish in 1st place, yet he/she still outscores some of the other players. To illustrate this more clearly, see the worksheet below.

At the top, we have the four players and their score distribution characteristics. Below that, we have the computation of the head-to-head distribution characteristics using the inverse of the cumulative normal distribution...just as you've described. For example, P(12) in this table is the probability that team 1 achieves a higher score than team 2, and Phi(SM) (or F(SM)...looks like XL2BB doesn't like Greek letters) is that probability based on the safety margin or z score of the difference distribution. As you've described above, P(12)=0.8191, and as expected P(21)=1-P(12)=0.1809. But now we have to consider the various ways that player 1 might outscore player 2. A separate table shows the possible outcomes, in terms of finishing 1st, 2nd, 3rd, and 4th in the four-way competition. The cells shaded yellow show the six ways that team 1 finishes first, but there are still six other ways that team 1 outscores team 2, when team 1 does not finish 1st. To illustrate this more clearly, a table constructed to the right gives a breakdown of the all instances where team 1 outscores team 2 (see the orange column with heading "1>2". Anywhere that a "1" appears means that team 1 has outscored team 2, and if we could replace the 1's with actual probabilities then we should find that the sum of those probabilities equals P(12). At the moment, the sum is 12, but if the twelve probability components were known, they should sum to P(12)=0.8191 shown in the upper orange cell.

I starting playing around with conditional probability expressions (shown at right), but I'm not sure how to tease out some on them...e.g., P(23|34)...the probability that 2 outscores 3 given that 3 outscores 4. If these probability components could be determined, then the product going across each row gives the joint probability for the combination on that row. Summation of the six purple cells (when team 1 finishes 1st) should then give the total probability that team 1 finishes 1st in the four-way competition, which we would expect to be about 0.6923. If anyone has ideas about how this might be done, I would appreciate hearing more.
MrExcel20200915.xlsx
HIJKLMNOPQRSTUVWXYZAAABAC
11234
2JohnPaulGeorgeRingo
3mean200180175170
4stdev15161818
5
6Head-to-Head
7Prob(x>y)team xteam yDmsdF(SM)
8P(12)122021.931710.819095
9P(13)132523.430750.857008
10P(14)143023.430750.899792
11P(23)23524.083190.582235
12P(24)241024.083190.661012
13P(34)34525.455840.577859
14P(21)21-2021.931710.180905
15P(31)31-2523.430750.142992
16P(41)41-3023.430750.100208
17P(32)32-524.083190.417765
18P(42)42-1024.083190.338988
19P(43)43-525.455840.422141
20
21P(12)P(13)P(14)P(23)P(24)P(34)P(21)P(31)P(41)P(32)P(42)P(43)
22PossibleKnown0.81910.85700.89980.58220.66100.57790.18090.14300.10020.41780.33900.4221
23OutcomesSum121212121212121212121212
241st2nd3rd4th1>21>31>42>32>43>42>13>14>13>24>24>3Expressions for Components of Joint Distribution from Conditional ProbabilitiesSummaryJoint Prob1st Place %
251234111111      P(12^23^34) = P(12|23^34) * P(23^34) = P(12|23^34) * P(23|34)* P(34) P(12|23^34) * P(23|34)* P(34) 16
26124311111      1P(12^24^43) = P(12|24^43) * P(24^43) = P(12|24^43) * P(24|43)* P(43) P(12|24^43) * P(24|43)* P(43) 1
271324111 11   1  P(13^32^24) = P(13|32^24) * P(32^24) = P(13|32^24) * P(32|24)* P(24) P(13|32^24) * P(32|24)* P(24) 1
281342111  1   11 P(13^34^42) = P(13|34^42) * P(34^42) = P(13|34^42) * P(34|42)* P(42) P(13|34^42) * P(34|42)* P(42) 1
2914231111      11P(14^42^23) = P(14|42^23) * P(42^23) = P(14|42^23) * P(42|23)* P(23) P(14|42^23) * P(42|23)* P(23) 1
301432111      111P(14^43^32) = P(14|43^32) * P(43^32) = P(14|43^32) * P(43|32)* P(32) P(14|43^32) * P(43|32)* P(32) 1
312134 111111     P(21^13^34) = P(21|13^34) * P(13^34) = P(21|13^34) * P(13|34)* P(34)  16
322143 1111 1    1P(21^14^43) = P(21|14^43) * P(14^43) = P(21|14^43) * P(14|43)* P(43)  1
332314  111111    P(23^31^14) = P(23|31^14) * P(31^14) = P(23|31^14) * P(31|14)* P(14)  1
342341   111111   P(23^34^41) = P(23|34^41) * P(34^41) = P(23|34^41) * P(34|41)* P(41)  1
352413 1 11 1 1  1P(24^41^13) = P(24|41^13) * P(41^13) = P(24|41^13) * P(41|13)* P(13)  1
362431   11 111  1P(24^43^31) = P(24|43^31) * P(43^31) = P(24|43^31) * P(43|31)* P(31)  1
3731241 1 11 1 1  P(31^12^24) = P(31|12^24) * P(12^24) = P(31|12^24) * P(12|24)* P(24) P(31|12^24) * P(12|24)* P(24) 16
3831421 1  1 1 11 P(31^14^42) = P(31|14^42) * P(14^42) = P(31|14^42) * P(14|42)* P(42) P(31|14^42) * P(14|42)* P(42) 1
393214  1 1111 1  P(32^21^14) = P(32|21^14) * P(21^14) = P(32|21^14) * P(21|14)* P(14)  1
403241    111111  P(32^24^41) = P(32|24^41) * P(24^41) = P(32|24^41) * P(24|41)* P(41)  1
4134121    1 1111 P(34^41^12) = P(34|41^12) * P(41^12) = P(34|41^12) * P(41|12)* P(12) P(34|41^12) * P(41|12)* P(12) 1
423421     111111 P(34^42^21) = P(34|42^21) * P(42^21) = P(34|42^21) * P(42|21)* P(21)  1
43412311 1    1 11P(41^12^23) = P(41|12^23) * P(12^23) = P(41|12^23) * P(12|23)* P(23) P(41|12^23) * P(12|23)* P(23) 16
44413211      1111P(41^13^32) = P(41|13^32) * P(13^32) = P(41|13^32) * P(13|32)* P(32) P(41|13^32) * P(13|32)* P(32) 1
454213 1 1  1 1 11P(42^21^13) = P(42|21^13) * P(21^13) = P(42|21^13) * P(21|13)* P(13)  1
464231   1  111 11P(42^23^31) = P(42|23^31) * P(23^31) = P(42|23^31) * P(23|31)* P(31)  1
4743121      11111P(43^31^12) = P(43|31^12) * P(31^12) = P(43|31^12) * P(31|12)* P(12) P(43|31^12) * P(31|12)* P(12) 1
484321      111111P(43^32^21) = P(43|32^21) * P(32^21) = P(43|32^21) * P(32|21)* P(21)  1
SOBEted
Cell Formulas
RangeFormula
K8:K19K8=INDEX($I$3:$L$3,MATCH($I8,$I$1:$L$1,0))-INDEX($I$3:$L$3,MATCH($J8,$I$1:$L$1,0))
L8:L19L8=SQRT(INDEX($I$4:$L$4,MATCH($I8,$I$1:$L$1,0))^2+INDEX($I$4:$L$4,MATCH($J8,$I$1:$L$1,0))^2)
M8:M19M8=NORM.S.DIST(K8/L8,TRUE)
H8:H19H8="P("&I8&J8&")"
N21:Y21N21="P("&LEFT(N24,1)&RIGHT(N24,1)&")"
N22:Y22N22=INDEX($M$8:$M$19,MATCH(N21,$H$8:$H$19,0))
N23:Y23N23=SUM(N25:N48)
N25:Y48N25=IF(MATCH(VALUE(LEFT(N$24,1)),$I25:$L25,0)<MATCH(VALUE(RIGHT(N$24,1)),$I25:$L25,0),1,"")
Z25:Z48Z25="P("&I25&J25&"^"&J25&K25&"^"&K25&L25&") = P("&I25&J25&"|"&J25&K25&"^"&K25&L25&") * P("&J25&K25&"^"&K25&L25&") = P("&I25&J25&"|"&J25&K25&"^"&K25&L25&") * P("&J25&K25&"|"&K25&L25&")* P("&K25&L25&") "
AA25:AA48AA25=IF(N25=1,RIGHT(Z25,30),"")
AB25:AB48AB25=PRODUCT(N25:Y25)
AC25,AC43,AC37,AC31AC25=SUM(AB25:AB30)
 
Upvote 0
If bowling scores were actually normal, it could be solved exactly by numerical integration:

The probability of John winning is the sum of probabilities of ...

John getting a 1 and all others getting fewer
John getting a 2 and all others getting fewer
...
John getting a 300 and all others getting fewer
 
Upvote 0
Thanks Kirk - it will take me a while to work through that.
 
Upvote 0

A​
B​
C​
D​
E​
1​
Name
Avg
SD
Pwin
(Workbook) Pins Refers To: =ROW(INDIRECT("1:300"))
2​
John​
200​
15​
69.231%​
D2: =SUMPRODUCT(NORM.DIST(Pins, $B$2, $C$2, FALSE), NORM.DIST(Pins, $B$3, $C$3, TRUE), NORM.DIST(Pins, $B$4, $C$4, TRUE), NORM.DIST(Pins, $B$5, $C$5, TRUE))
3​
Paul​
180​
16​
13.930%​
D3: {=SUMPRODUCT(NORM.DIST(Pins, $B$2, $C$2, TRUE), NORM.DIST(Pins, $B$3, $C$3, FALSE), NORM.DIST(Pins, $B$4, $C$4, TRUE), NORM.DIST(Pins, $B$5, $C$5, TRUE))}
4​
George​
175​
18​
10.381%​
D4: {=SUMPRODUCT(NORM.DIST(Pins, $B$2, $C$2, TRUE), NORM.DIST(Pins, $B$3, $C$3, TRUE), NORM.DIST(Pins, $B$4, $C$4, FALSE), NORM.DIST(Pins, $B$5, $C$5, TRUE))}
5​
Ringo​
170​
18​
6.458%​
D5: {=SUMPRODUCT(NORM.DIST(Pins, $B$2, $C$2, TRUE), NORM.DIST(Pins, $B$3, $C$3, TRUE), NORM.DIST(Pins, $B$4, $C$4, TRUE), NORM.DIST(Pins, $B$5, $C$5, FALSE))}
6​
100.000%
 
Last edited:
Upvote 0
shg - the first formula is shown as a normal sumproduct, while the other three are entered as array functions. I recreated them and entered them all as normal functions, and they provide the same results. Is there are reason shy they were array functions?
 
Upvote 0
Operator error. I was using a different formula that did need to be array-entered and didn't change the last three.

Also,

o The formula gives the probability of each bowler winning or tying; a slight change would calculate the probability of an outright win

o The sum is (slightly) less than 100% because if the distributions were truly normal they could have scores < 0 or > 300
 
Last edited:
Upvote 0

Forum statistics

Threads
1,215,035
Messages
6,122,791
Members
449,095
Latest member
m_smith_solihull

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top