Results 1 to 5 of 5
Like Tree1Likes
  • 1 Post By JackBean

Generating a skewed normal distribution of random numbers

This is a discussion on Generating a skewed normal distribution of random numbers within the Excel Questions forums, part of the Question Forums category; Hi, I need to randomly generate a set of 5000 numbers using just the following three inputs: >Min >Mean >Max ...

  1. #1
    New Member
    Join Date
    Jun 2010
    Posts
    2

    Default Generating a skewed normal distribution of random numbers

    Hi,

    I need to randomly generate a set of 5000 numbers using just the following three inputs:
    >Min
    >Mean
    >Max
    They have to be normally distributed, but as the mean is never exactly half way between the min and max, the distribution will be skewed.

    I am using =(NORMSINV(RAND())*0.13)+0.5 to give me 5000 random numbers that are normally distributed with a mean of 0.5 and (almost always) have a min of 0 and a max of 1.

    I am then mapping these to the distribution that I require using
    If random number < 0.5, f(x)=min+2*rand*(mean-min)
    If random number = 0.5, f(x)=mean
    If random number > 0.5, f(x)=mean+2*(rand-0.5)*(max-mean)

    As expected this is giving me numbers within the predefined bounds, however the mean is way off what I was expecting.

    For example, a min of 50,000, mean of 225,000 and a max of 1,000,000 give the following output:
    >Min ~ 50,000-60,000 (this is close enough)
    >Mean ~ 288,000-290,000 (this is no where near)
    >Max ~ 950,000-1,000,000 (this is close enough)

    I am not sure if there is a problem with my mapping formula, if it is to do with how I am generating the random numbers in the first place, or something else, but I am now stuck.

    If anyone has any experience of this, or can offer any advice, then I would be most grateful!

    Many thanks in advance,

    Graham

  2. #2
    New Member
    Join Date
    Jun 2010
    Posts
    2

    Default Re: Generating a skewed normal distribution of random numbers

    Update:
    So, I've worked out that because of the skew of the distribution, the "mean" that I have been specifying is actually the mode. When I grouped the random numbers generated in the example and grouped them, the mode is 225,000 or there abouts.

    So, all I now need to do, I think(!), is generate a formula that will take my desired mean, and convert it into a mode that when plugged in to the mapping that I stated, give me a set of data with my "desired mean" as the mean...

  3. #3
    Board Regular
    Join Date
    Nov 2007
    Location
    USA
    Posts
    403

    Default Re: Generating a skewed normal distribution of random numbers

    Generating numbers randomly based on the normal distribution is a common question.

    Random Number based on Normal Distribution
    http://www.mrexcel.com/forum/showthread.php?t=465705
    Also, this has info on:
    Named Formulas for chart ranges
    Conditional Formatting Formulas

    This has also come up with the lognormal distribution.
    Re: Lognormal Distribution being applied to IQR
    http://www.mrexcel.com/forum/showthread.php?t=473530

    I am not familiar with your method of making a skewed normal distribution by 'mapping' the numbers from the normal to the skewed normal.

    Here is how I would generate numbers randomly based on the skewed normal distribution.

    Use the Probability Distribution Function (PDF) for the skewed normal distribution, with the layout as shown below and in the above reference.

    =NORMDIST(D5,B$2,B$1,0)*NORMDIST(D5*skew,B$2,B$1,1)*2

    The skew can range from plus or minus infinity and if skew = 0 then it is the plain normal distribution, as shown below for STDin = 4 and AVGin = 0, and 'Select' 100 at 20 points, or bins.

    Input normal PDF
    skew = 0


    Now change skew from 0 to 3:

    ABCDEF
    1STDin4Start-5STDout2.3928564
    2AVGin0End15AVGout3.0526316
    3Select100# Bins20NormSum0.9499057
    4NormDistNormalizeCumulativeBinCountDraw
    58.075E-068.5E-060-501.3157895
    60.00018820.00019828.5E-06-3.94736805.5263158
    70.00229720.00241830.0002067-2.89473702.3684211
    80.01498910.01577960.002625-1.84210501.3157895
    90.05416620.05702270.0184045-0.78947458.6842105
    100.11509090.12116040.07542730.2631579113.4210526
    110.15837980.16673220.19658771.3157895170.2631579
    120.16106380.16955760.36331982.3684211237.6315789
    130.13765790.14491750.53287753.4210526160.2631579
    140.10668070.11230660.67779494.473684283.4210526
    150.07680520.08085560.79010155.526315856.5789474
    160.05157770.05429770.87095726.578947482.3684211
    170.03231850.03402280.92525497.631578934.4736842
    180.01889570.01989220.95927778.684210531.3157895
    190.01030860.01085220.97916999.736842111.3157895
    200.00524760.00552430.990022110.78947404.4736842
    210.00249260.0026240.995546411.84210501.3157895
    220.00110470.0011630.998170512.89473701.3157895
    230.00045690.0004810.999333413.94736802.3684211
    240.00017630.00018560.99981441504.4736842
    250.2631579
    262.3684211
    271.3157895
    284.4736842
    292.3684211
    30-0.789474
    314.4736842
    322.3684211
    332.3684211
    341.3157895
    353.4210526
    362.3684211
    373.4210526
    380.2631579
    392.3684211
    402.3684211
    413.4210526
    422.3684211
    435.5263158
    444.4736842

    The Cell Formula in Cell F5 is copied down through F104.
    Worksheet Formulas
    CellFormula
    A5=NORMDIST(D5,B$2,B$1,0)*NORMDIST(D5*3,B$2,B$1,1)*2
    B5=A5/F$3
    C5=SUM(B4:B$5)*(ROW()>5)
    D5=D$1+(ROW()-5)*(D$2-D$1)/(D$3-1)
    E5=COUNTIF(F$5:INDEX(F:F,B$3+4),D5)
    F5=VLOOKUP(RAND(),C$5:INDEX(D:D,D$3+4),2)

    Worksheet Formulas
    CellFormula
    F1=STDEVP(F5:INDEX(F:F,B3+4))
    F2=AVERAGE(F5:INDEX(F:F,B3+4))
    F3=SUM(A5:INDEX(A:A,D3+4))

    Input skew normal PDF
    skew = 3


    Here is a set of output numbers with 'Select' = 100


    Here is a set of output numbers with 'Select' = 5000


    Keep in mind that if you generate a set of numbers at random, the mean of the output will vary with each new set of numbers.
    But with a large set of numbers the mean may not change much.


    I found that I got strange results, like a cut-off skewed distribution, if AVGin <> 0.

    It seems to me that you should be able to create a distribution that you want, and map it by shifting it horizontally to give the desired average.

    References for skew normal distribution:
    Wikipedia
    And also:
    http://azzalini.stat.unipd.it/SN/

    WinzarH likes this.

  4. #4
    MrExcel MVP tusharm's Avatar
    Join Date
    May 2002
    Posts
    11,007

    Default Re: Generating a skewed normal distribution of random numbers

    A standard normal distribution goes, *by definition,* from -infinity to +infinity. Multiplying by 0.13 changes the variability from 1 to 0.13 and adding 0.5 shifts the mean from 0 to 1 but you still get values between +/- infinity!

    Using your mapping function, 1/2 of the times (when your normal random number < 0.5) your formula min+2*rand*(mean-min) will generate a *uniformly* distributed random number between min and 2*mean-min. Similarly, you will generate a different random number that too will be uniformly distributed when your first normal random variable is > 0.5

    When you generate random numbers from a specified distribution, the distribution represents the population and the resulting numbers represent a sample. The sample statistics (like mean, variance, etc.) are *representative* of the population statistics and the two do not necessarily match. In fact, if they were to match, I would be suspicious of how the sample was created.

    There are ways to generate a normally distributed sample with a specified min. and max. It's called a truncated normal. You can search google or bing for more on how to generate random variables from a truncated normal distribution.
    Quote Originally Posted by Graham.Johnson View Post
    Hi,

    I need to randomly generate a set of 5000 numbers using just the following three inputs:
    >Min
    >Mean
    >Max
    They have to be normally distributed, but as the mean is never exactly half way between the min and max, the distribution will be skewed.

    I am using =(NORMSINV(RAND())*0.13)+0.5 to give me 5000 random numbers that are normally distributed with a mean of 0.5 and (almost always) have a min of 0 and a max of 1.

    I am then mapping these to the distribution that I require using
    If random number < 0.5, f(x)=min+2*rand*(mean-min)
    If random number = 0.5, f(x)=mean
    If random number > 0.5, f(x)=mean+2*(rand-0.5)*(max-mean)

    As expected this is giving me numbers within the predefined bounds, however the mean is way off what I was expecting.

    For example, a min of 50,000, mean of 225,000 and a max of 1,000,000 give the following output:
    >Min ~ 50,000-60,000 (this is close enough)
    >Mean ~ 288,000-290,000 (this is no where near)
    >Max ~ 950,000-1,000,000 (this is close enough)

    I am not sure if there is a problem with my mapping formula, if it is to do with how I am generating the random numbers in the first place, or something else, but I am now stuck.

    If anyone has any experience of this, or can offer any advice, then I would be most grateful!

    Many thanks in advance,

    Graham

  5. #5
    New Member
    Join Date
    Jul 2012
    Posts
    1

    Default Re: Generating a skewed normal distribution of random numbers

    I tried out the following, and the results are good. If you have larger numbers the mean is usually just a few percentage points off.

    $B$1 = max value
    $B$2 = min value
    $B$3 = desired mean

    =$B$2+($B$1-$B$2)*POWER(MIN(MAX(-3,STANDNORMINV(RAND())),3)/6+0.5,POWER(($B$1-B$3)/(B$3-$B$2),0.7))

    How this works:
    - In the middle is the Random Number from excel, evenly distributed
    - From this I get the inverse of the standard normal distribution (result from minus to plus infinity)
    - this I then reduce to an interval from -3 to +3 by using Excel MIN/MAX
    - then I divide by six to get to an interval from -0.5 to +0.5
    - (Result #1) then I add 0.5 to get to an interval from 0 to 1
    - for the skew I use the second POWER term in where I divide the upper interval from my mean to my max by the lower interval from my min to my mean; the result is the relation between these two intervals
    - I tried this out for the mean being somewhere between 25% and 75% of the interval between my max and my min
    - (Result #2) the relation is then my base number; from trial-and-error I use 0.7 as the exponent; that brought me a small difference between the desired mean and the mean from the distribution
    - (Result #3) finally I use Result #1 as base and Result #2 as exponent in another POWER statement
    - this Result #3 I then use to calculate a part of the difference between my max and my min and add this part to my min

    Here is how the distribution looks like:
    https://docs.google.com/open?id=0Bwe...XlHMkFuakM5Rjg

    And here are the results if I draw 10000 random numbers.
    https://docs.google.com/open?id=0Bwe...E5rdlhIZ0R4MWc


    Quote Originally Posted by Graham.Johnson View Post
    Hi,

    I need to randomly generate a set of 5000 numbers using just the following three inputs:
    >Min
    >Mean
    >Max
    They have to be normally distributed, but as the mean is never exactly half way between the min and max, the distribution will be skewed.

    I am using =(NORMSINV(RAND())*0.13)+0.5 to give me 5000 random numbers that are normally distributed with a mean of 0.5 and (almost always) have a min of 0 and a max of 1.

    I am then mapping these to the distribution that I require using
    If random number < 0.5, f(x)=min+2*rand*(mean-min)
    If random number = 0.5, f(x)=mean
    If random number > 0.5, f(x)=mean+2*(rand-0.5)*(max-mean)

    As expected this is giving me numbers within the predefined bounds, however the mean is way off what I was expecting.

    For example, a min of 50,000, mean of 225,000 and a max of 1,000,000 give the following output:
    >Min ~ 50,000-60,000 (this is close enough)
    >Mean ~ 288,000-290,000 (this is no where near)
    >Max ~ 950,000-1,000,000 (this is close enough)

    I am not sure if there is a problem with my mapping formula, if it is to do with how I am generating the random numbers in the first place, or something else, but I am now stuck.

    If anyone has any experience of this, or can offer any advice, then I would be most grateful!

    Many thanks in advance,

    Graham

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


DMCA.com