Is this formula possible?


New Member
Sep 5, 2002
I am trying to create a random number generator that is based on, I think, a standard deviation. My basic RNF is "rand(int()*X+Y", where X equals the seed number, and y equals any number, usually 1, to return values greater than zero. What I want to do is have X reference the mean (median?) of a range so that the random number generated remains in the range. I don't even know if this is the correct formula or not. My actual example will be sports statistics, specifically, the number of attempts of a certain player based on previous data, i.e., John Doe attempts 500 passes a season, or 31.25 per game. My desired formula will generate a random number that will fall between 18-55 passes per game, but will be weighted by the attempts per game average (whereas the weight would be lower if the numebr of pass attempts was 20 per game). I hope this isn't too confusing, and I appreciate any help I can get. Thank you in advance.

Excel Facts

Will the fill handle fill 1, 2, 3?
Yes! Type 1 in a cell. Hold down Ctrl while you drag the fill handle.
Hi Steve,

You didn't mention it in your question, but I get the impression that you would like your random numbers to have a Normal distribution (i.e., "Gaussian"). The RAND function produces a Uniform distribution (NOT the Normal distribution's bell-shaped curve). If random numbers from a Normal distribution is what you are after, then an extra step must be used to convert the numbers from the RAND function to Gaussian, then modify them to have the desired mean and standard deviation.

If this is what you are after, post a reply letting me know and I'll follow up with a user-defined function (UDF) that you can use to generate a random number from a Normal distribution. There also is a "Random Number Generation analysis tool" in Excel (search helps under this heading) that can be used to fill a range of cells with Normally-distributed numbers, or numbers from several other distributions.
Upvote 0
Many Thanks! I'm not sure I can explain it with the correct mathematical terms, but perhaps this example can help. Let's assume that I want to produce random numbers within a defined range, say, 1-10. Then, I want to control that randomization with a second factor, so that while the range stays absolute, the values are returned based on the second factor. Assume the range from my first post, 18-55. Now, if a passer averages 35 passes per game, and you try to create a random number of passes within the above range, the returns should be higher on average than a passer who only averaged 22 passes per game. I can see the curve on a graph in my mind; the peak is greater for the first passer than the second, but the same results are possible, but to obviously different degrees. I hope this clarified my point, and thanks again, very much!
Upvote 0
Hi Steve,

I hope you will bear with me a bit on this. The issue of which distribution is an important one. If you use the output of the RAND function directly you will get a uniformly distributed number. You can easily get this number to be a whole number that falls between 18 and 55 by multiplying RAND by 37, adding 19, and truncating to an integer. This will not give you what you want, however, because it will be equally likely to produce all numbers in the range, so it will be just as likely to produce an 18 or 55 as is is to produce a 36 or 37 value. In reality the number of passes is more likely to be near the average and less likely to be at the high or low extremes. The Normal distribution DOES have this characteristic (as do a number of others, but the Normal distribution is the most popular because an enormous number of natural processes are by their very nature distributed in this exact way), and I believe that is the one that you should probably use. If you have a set of actual data (such as league or individual player stats) there are statistical tests that can be performed on it to determine if it is really the right distribution to use.

But assuming that the Normal distribution IS the way to go, you need two pieces of information: the mean (which you already have--each player's average), and the standard deviation (which I am not sure you have). The standard deviation is the square root of the average variance of the data values from the mean. It is a measure of the spread of the values about the mean, like your range of 18-55 representing a maximum variance of 18.5 from the mean of 36.5. So there are two issues here:

1. is the 18.5 a maximum possible variance, meaning that there is no physical way a number beyond this range could ever occur? or is it something like a 3-sigma value, meaning that 99.7% of the data values fall within this range?

2. a normal distribution will OCCASIONALLY produce numbers beyond even the 3-sigma range. It has no hard limits on the values it will produce. Would it be acceptable for the random number generator to occasionally produce a number less than 18 or greater than 55? Of course, if this were to occur your code could reject the number and "roll the dice" again, but it is a question that must be addressed.

If you can come up with a value for the standard deviation, and have an answer to what to do with values lower than 18 and higher than 55, then you can use the Normal distribution. And again, I would be happy to provide a UDF, or you could use the built-in capability to fill a range with these numbers.

Upvote 0
Thank you again for your help! The 3-sigma variance is fine, and I would adjust the standard deviation based on the number of attempts-while I would never use a bottom number less than 10, passers with very high averages would have high top numbers, for the odd chance of a reult greater than 50. Is that the right information? I know what I'm looking for, but I lack the mathematical acumen to describe it correctly.
Upvote 0

Damon was referring to the Random Number Generator in the Analysis ToolPak in one part of his discussion. It will allow you to choose many different distributions.

For your purposes it would be OK, I believe, but I have read that there are problems with the RNG, although that is for serious statistical work.

With the ATP loaded, Tools>Data Analysis>Random Number Generator and choose from a bunch of options.

Another option would be to use the NORMINV function with a random probability...

=NORMINV(Rand(),mean,standard deviation)

The recommendation from the experts (Mike Middleton, Jerry W. Lewis and Dave Braden among others -- three stats gurus who are also Excel wizards) is that NORMINV has problems in versions prior to XP. That said, these guys need a degree of accuracy and randomness far beyond what you or I would likely need.

If you don't have XP and want a "better" function, the following UDF should work...<pre>Function RandomNormal(Optional Mean As Double = 0, _
Optional SD As Double = 1) As Double

' posted by Myrna Larson
' microsoft.public.excel.programming
' 24 Aug 1999

'returns a normally distributed random number
'distribution mean = Mean, sigma = SD

Static Initialized As Boolean
Static TwoPi As Double
Static Have2nd As Boolean
Static X1 As Double, X2 As Double


If Initialized = False Then
TwoPi = 8 * Atn(1)
Have2nd = False
Randomize Timer
Initialized = True
End If

If Have2nd = False Then
X1 = TwoPi * Rnd
X2 = Sqr(-2 * Log(Rnd))
RandomNormal = Cos(X1) * X2 * SD + Mean
Have2nd = True
'return 2nd value from previous call
RandomNormal = Sin(X1) * X2 * SD + Mean
Have2nd = False
End If
End Function</pre>

Damon, if you have a UDF already coded, would you please post it? I, for one, would like to see it as it is probably very nice.

Finally, it took a while, but I found one post I was looking for...$b5e98940$

That should help with a bounded Normal.


EDIT: One other RandomNormal routine I found. The author is Dave Braden.<pre>Function RandomNormal(Optional Mean As Double = 0, _
Optional SD As Double = 1) As Double

Static HaveX1 As Boolean
Dim V1 As Double, V2 As Double, S As Double
Static X1 As Double
Dim X2 As Double

If HaveX1 = False Then
V1 = Rnd() * 2 - 1
V2 = Rnd() * 2 - 1
S = V1 * V1 + V2 * V2
Loop While S >= 1# Or S = 0#

S = Sqr(-2 * Log(S) / S)
X1 = V1 * S
X2 = V2 * S

HaveX1 = True
RandomNormal = X2 * SD + Mean
HaveX1 = False
RandomNormal = X1 * SD + Mean
End If
End Function 'RandomNormal</pre>
This message was edited by Jay Petrulis on 2002-09-16 17:31
Upvote 0
Hi Jay,

I can see from the code you posted that it is based on the same algorithm I would have used, so I don't think I have anything additional to offer here. I believe you "have the waterfront covered." Nice job.

Keep Excelling.

Upvote 0
Many thanks to you both! I used the NORMINV function as written above, and it works fine. My next quest will be to tweak the standard deviation to limit the return values so as not to fall outside my min/max range. I really appreciate it!
Upvote 0

Forum statistics

Latest member

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back