How to generate weighted random numbers?

JenniferMurphy

Well-known Member
Joined
Jul 23, 2011
Messages
2,535
Office Version
  1. 365
Platform
  1. Windows
I know a couple of ways to generate weighted random numbers.

If the weights are all integers and there are not too many of them, I can generate a list and randomly select an element from that list. For example, if the weights are
2 4 5 7 9
then the list would be 2 1s, 4 2s, 5 3s, 7 4s, and 9 5s:
1 1 2 2 2 2 3 3 3 3 3 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5
Then I just generate a random integer on [1,27]. If that number is 17, for example, the random value would be 4.

If the weights are not integers, I can generate a cumulative list. For example, if the weights are
0.9 0.7 0.5 0.2 0.1
The cumulative list would be
0.9 1.6 2.1 2.3 2.4
I can generate a random number on [0, 2.4] and then loop through the list until I find the upper limit. For example, if the random number is 1.937562, that would fall between 1.6 and 2.1, so the random number would be 2.

Both of these approaches have limitations. It seems to me that some time ago I read about another method that uses a 2-dimensional array that removed most of these limitations. Can anyone provide a link to that method? Or is there another method that will work with large numbers of non-integer weights?

Thanks
 
JenniferMurphyWeightedDrawings.xlsm
BCDEFGHIJKLMNOPQRSTUVWX
1
2
340Number of results
4
51234567891011121314151617181920
6Weights0.07550.01790.22220.01540.15380.15380.02220.01540.20000.01110.02220.02940.04940.00720.01180.01530.00830.00600.03900.0052
7Multiplier10000
8Weights as Integers75517922221541538153822215420001112222944947211815383603905210811
9% Expected7.0%1.7%20.6%1.4%14.2%14.2%2.1%1.4%18.5%1.0%2.1%2.7%4.6%0.7%1.1%1.4%0.8%0.6%3.6%0.5%100.0%
10StringsTally
11Actual % Received3 9 9 13 1 9 3 6 3 12 2 16 5 1 5 11 7 13 5 3 5 5 9 3 5 3 5 12 9 9 3 6 1 9 9 3 5 5 3 123 1 9 0 9 2 1 0 8 0 1 3 2 0 0 1 0 0 0 0
12Actual % Received1 1 5 19 1 9 3 6 9 10 3 1 3 1 3 5 6 3 7 3 18 9 6 19 3 9 13 9 3 6 5 6 13 12 6 19 10 3 19 65 0 9 0 3 7 1 0 5 2 0 1 2 0 0 0 0 1 4 0
13Actual % Received6 6 8 5 6 6 3 1 1 2 13 13 3 9 8 9 9 6 9 13 1 3 3 3 4 3 3 11 1 17 9 5 5 5 9 9 9 11 3 64 1 8 1 4 6 0 2 8 0 2 0 3 0 0 0 1 0 0 0
14Actual % Received9 9 9 9 1 1 5 9 13 3 6 19 13 3 5 3 9 9 9 9 5 16 3 6 19 6 5 3 5 1 5 9 9 9 6 8 3 9 3 53 0 7 0 7 4 0 1 13 0 0 0 2 0 0 1 0 0 2 0
15Actual % Received1 6 3 19 5 13 3 1 5 6 9 5 5 6 5 13 6 5 6 9 9 6 6 6 5 15 13 5 6 7 2 6 1 9 5 9 9 6 6 63 1 2 0 9 13 1 0 6 0 0 0 3 0 1 0 0 0 1 0
16Actual % Received19 1 6 13 5 9 19 9 6 9 6 3 3 13 3 9 9 1 6 3 3 6 3 3 6 9 5 3 13 5 5 6 6 20 8 5 9 3 5 22 1 9 0 6 8 0 1 7 0 0 0 3 0 0 0 0 0 2 1
17Actual % Received9 3 5 3 3 1 10 9 6 3 6 19 9 5 7 14 3 19 3 5 18 3 5 6 3 9 9 20 6 13 5 3 2 3 9 16 9 11 8 91 1 10 0 5 4 1 1 8 1 1 0 1 1 0 1 0 1 2 1
18Actual % Received7 3 10 1 3 11 6 3 19 9 5 5 19 3 6 6 16 11 3 5 8 6 1 3 3 13 3 1 1 10 3 6 9 5 5 1 4 5 3 195 0 10 1 6 5 1 1 2 2 2 0 1 0 0 1 0 0 3 0
19Actual % Received17 3 3 9 3 9 3 5 9 12 5 13 8 13 12 13 12 9 9 3 3 11 12 3 5 12 19 5 6 3 5 9 19 9 9 5 15 6 9 30 0 9 0 6 2 0 1 9 0 1 5 3 0 1 0 1 0 2 0
20Actual % Received9 7 3 1 9 3 5 1 3 1 9 3 12 6 13 5 3 9 9 5 9 3 19 14 10 9 3 1 3 5 9 6 1 5 3 3 3 9 14 35 0 12 0 5 2 1 0 9 1 0 1 1 2 0 0 0 0 1 0
21
Sheet2
Cell Formulas
RangeFormula
C8C8=ROUND(C6*C7+0.000001,0)
D8D8=ROUND(D6*C7+0.000001,0)
E8E8=ROUND(E6*C7+0.000001,0)
F8F8=ROUND(F6*C7+0.000001,0)
G8G8=ROUND(G6*C7+0.000001,0)
H8H8=ROUND(H6*C7+0.000001,0)
I8I8=ROUND(I6*C7+0.000001,0)
J8J8=ROUND(J6*C7+0.000001,0)
K8K8=ROUND(K6*C7+0.000001,0)
L8L8=ROUND(L6*C7+0.000001,0)
M8M8=ROUND(M6*C7+0.000001,0)
N8N8=ROUND(N6*C7+0.000001,0)
O8O8=ROUND(O6*C7+0.000001,0)
P8P8=ROUND(P6*C7+0.000001,0)
Q8Q8=ROUND(Q6*C7+0.000001,0)
R8R8=ROUND(R6*C7+0.000001,0)
S8S8=ROUND(S6*C7+0.000001,0)
T8T8=ROUND(T6*C7+0.000001,0)
U8U8=ROUND(U6*C7+0.000001,0)
V8V8=ROUND(V6*C7+0.000001,0)
W8:W9W8=SUM(C8:V8)
C9C9=C6/SUM(C6:V6)
D9D9=D6/SUM(C6:V6)
E9E9=E6/SUM(C6:V6)
F9F9=F6/SUM(C6:V6)
G9G9=G6/SUM(C6:V6)
H9H9=H6/SUM(C6:V6)
I9I9=I6/SUM(C6:V6)
J9J9=J6/SUM(C6:V6)
K9K9=K6/SUM(C6:V6)
L9L9=L6/SUM(C6:V6)
M9M9=M6/SUM(C6:V6)
N9N9=N6/SUM(C6:V6)
O9O9=O6/SUM(C6:V6)
P9P9=P6/SUM(C6:V6)
Q9Q9=Q6/SUM(C6:V6)
R9R9=R6/SUM(C6:V6)
S9S9=S6/SUM(C6:V6)
T9T9=T6/SUM(C6:V6)
U9U9=U6/SUM(C6:V6)
V9V9=V6/SUM(C6:V6)
 
Upvote 0

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
Interesting. Without going to the trouble of posting the code. Can you briefly describe the algorithm?

I see that you converted the fractions to integers by multiplying by 10,000. That is the equivalent of 4 decimal places of accuracy, which is plenty. But then what? Did you generate a cumulative list, like this?

Sorting.xlsx
ABCDEFGHIJKLMNOPQRSTUV
8Multiplier10000
9ItemABCDEFGHIJKLMNOPQRSTSum
10Value0.07550.01790.22220.01540.15380.15380.02220.01540.20000.01110.02220.02940.04940.00720.01180.01530.00830.00600.03900.00521.0811
11Prob6.981%1.652%20.555%1.423%14.230%14.230%2.055%1.423%18.499%1.028%2.055%2.721%4.568%0.665%1.088%1.412%0.771%0.557%3.604%0.482%100.00%
12Cum Prob755934315633104848638666086762876288739095938998839955100731022610309103691075910811
Sheet3
Cell Formulas
RangeFormula
V10:V11V10=SUM(B10:U10)
B11:U11B11=B10/$V$10
B12B12=ROUND(B10*Multiplier,0)
C12:U12C12=ROUND(C10*Multiplier,0)+B12
Named Ranges
NameRefers ToCells
Multiplier=Sheet3!$B$8B12:U12


Then did you generate a random number on [1,10811] =rnd()*10811, then loop through the cum list until you found one that was smaller than you random number?
 
Upvote 0
Ok, I finally got a chance to get back at this. I made some changes to attempt to handle the further information you have provided. The code will calculate much of the info for you.

This is the sheet that I start with:

JenniferMurphyWeightedDrawings.xlsm
ABCDEFGHIJKLMNOPQRSTUVWX
1
2
3Number of results40
4
51234567891011121314151617181920
6Weights0.07550.01790.22220.01540.15380.15380.02220.01540.20000.01110.02220.02940.04940.00720.01180.01530.00830.00600.03900.0052
7Multiplier10000
8Weights as Integers755179222215415381538222154200011122229449472118153836039052
9% Expected7.0%1.7%20.6%1.4%14.2%14.2%2.1%1.4%18.5%1.0%2.1%2.7%4.6%0.7%1.1%1.4%0.8%0.6%3.6%0.5%
10
11StringsTallysActual %s Received
123 6 9 5 16 1 9 19 9 3 5 3 9 19 2 6 5 12 6 3 3 12 3 1 1 3 5 9 9 1 8 9 5 3 5 9 3 3 6 34 1 11 0 6 4 0 1 8 0 0 2 0 0 0 1 0 0 2 010.00%2.50%27.50%0.00%15.00%10.00%0.00%2.50%20.00%0.00%0.00%5.00%0.00%0.00%0.00%2.50%0.00%0.00%5.00%0.00%
13
Sheet2


The heavily commented code that goes with it is:
VBA Code:
Sub TestDecimalWeights()
'
    Dim ClearPreviousResults                As Boolean
    Dim SumOfWeights                        As Double
    Dim ArrayRow                            As Long, WeightRow                  As Long
    Dim RowCount                            As Long, RunCount                   As Long
    Dim ColumnNumber                        As Long, InnerColumnNumber          As Long
    Dim Iterations                          As Long, Multiplier                 As Long
    Dim RandomInteger                       As Long, SumOfIntegers              As Long
    Dim Dict                                As Object
    Dim ActualPercentageColumn              As String, LookupValueResultsColumn As String, TallyStringColumn    As String
    Dim WeightLastColumn                    As String, WeightStartColumn        As String
    Dim LookupValueResultColumnStartAddress As String, MultiplierAddress        As String, RunCountAddress      As String
    Dim LookupValueString                   As String, TallyString              As String
    Dim IntegerArray                        As Variant, WeightArray             As Variant
    Dim LookupArray                         As Variant, PercentExpectedArray    As Variant, TallyPercentArray   As Variant
'
    ClearPreviousResults = False                                                                        ' <--- Set this to True if you want clear the previous results each time, otherwise False
    WeightRow = 6                                                                                       ' <--- Set this to the row # that the weights will be stored on
    LookupValueResultsColumn = "B"                                                                      ' <--- Set this to the column to display the LookupValueResults
           TallyStringColumn = "C"                                                                      ' <--- Set this to the column letter to display the TallyString into
      ActualPercentageColumn = "D"                                                                      ' <--- Set this to the column to display the ActualPercentages received to
           WeightStartColumn = "D"                                                                      ' <--- Set this to the column letter that the weights will start on
    LookupValueResultColumnStartAddress = "B12"                                                         ' <--- Set this to the start address of the LookupValueResults
                      MultiplierAddress = "D7"                                                          ' <--- Set this to the address of the Multiplier
                        RunCountAddress = "C3"                                                          ' <--- Set this to the column that contains the # of iterations to execute
'
'-------------------------------------------------------------------------------------------------------
'
' Gather data needed from the sheet
    Iterations = Range(RunCountAddress).Value                                                           ' Get the # of iterations
    Multiplier = Range(MultiplierAddress).Value                                                         ' Get the Multiplier value
'
    WeightLastColumn = Split(Cells(Rows(WeightRow).Find("*", , xlFormulas, , xlByColumns, _
            xlPrevious).Column).Address, "$")(1)                                                        ' Calculate the column letter of the last weight on the sheet
'
    WeightArray = Range(WeightStartColumn & WeightRow & ":" & WeightLastColumn & WeightRow)             ' Save the weights into 2D 1 Based WeightArray
'
'-------------------------------------------------------------------------------------------------------
'
' Establish some arrays that will be needed
            ReDim IntegerArray(1 To UBound(WeightArray, 1), 1 To UBound(WeightArray, 2))                ' Set IntegerArray to the same dimensions as WeightArray
       ReDim TallyPercentArray(1 To UBound(WeightArray, 1), 1 To UBound(WeightArray, 2))                ' Set TallyPercentArray to the same dimensions as WeightArray
    ReDim PercentExpectedArray(1 To UBound(WeightArray, 1), 1 To UBound(WeightArray, 2))                ' Set PercentExpectedArray to the same dimensions as WeightArray
'
' Convvert the weights to integer equivalent, Sum the weights, Sum the integer equivalents
    For ColumnNumber = 1 To UBound(WeightArray, 2)                                                      ' Loop through the weights
        IntegerArray(1, ColumnNumber) = Round(WeightArray(1, ColumnNumber) * Multiplier, 0)             '   Convert the weight to an integer & save it into IntegerArray
        SumOfWeights = SumOfWeights + WeightArray(1, ColumnNumber)                                      '   Add the weight to SumOfWeights
        SumOfIntegers = SumOfIntegers + IntegerArray(1, ColumnNumber)                                   '   Add the integer to SumOfIntegers
    Next                                                                                                ' Loop back
'
' Calculate the expected percentages
    For ColumnNumber = 1 To UBound(WeightArray, 2)                                                      ' Loop through the weights
        PercentExpectedArray(1, ColumnNumber) = WeightArray(1, ColumnNumber) / SumOfWeights             '   Calculate the percent expected & save it into PercentExpectedArray
    Next
'
' Write the integer equivalents to the sheet, Write the expected percentages to the sheet
    Range(WeightStartColumn & WeightRow).Offset(2).Resize(UBound(IntegerArray, 1), _
            UBound(IntegerArray, 2)) = IntegerArray                                                     ' Write the integers to the sheet
    Range(WeightStartColumn & WeightRow).Offset(3).Resize(UBound(PercentExpectedArray, 1), _
            UBound(PercentExpectedArray, 2)) = PercentExpectedArray                                     ' Write the PercentExpectedArray to the sheet
'
'-------------------------------------------------------------------------------------------------------
'
' Create the LookupArray, This will be a 'weighted' array of x amount of 1's, x amount of 2's, x amount of 3's, etc.
    ReDim LookupArray(1 To SumOfIntegers, 1 To 1)                                                       ' Establish the row/column size of LookupArray
'
'    RowCount = 0                                                                                        ' Initialize RowCount
'
    For ColumnNumber = 1 To UBound(IntegerArray, 2)                                                     ' Loop through columns of IntegerArray ... 1 to 5 for example
        For InnerColumnNumber = 1 To IntegerArray(1, ColumnNumber)                                      '   Loop to add loops of values to add to LookupArray
            RowCount = RowCount + 1                                                                     '       Increment RowCount
'
            LookupArray(RowCount, 1) = ColumnNumber                                                     '       Save ColumnNumber to LookupArray
        Next                                                                                            '   Loop back
    Next                                                                                                ' Loop back
'
'-------------------------------------------------------------------------------------------------------
'
' Get results
    LookupValueString = ""                                                                              ' Initialize LookupValueString
'
    Set Dict = CreateObject("Scripting.Dictionary")                                                     ' Establish the dictionary we will use
'
    Randomize                                                                                           ' Randomize the random number generator
'
    If ClearPreviousResults Then
        Range(LookupValueResultColumnStartAddress & ":" & Split(Cells(Cells.Find("*", , xlFormulas, , _
                xlByColumns, xlPrevious).Column).Address, "$")(1) & Cells.Find("*", , xlFormulas, , _
                xlByRows, xlPrevious).Row).ClearContents                                                ' Clear previous results
    End If
'
' PreLoad Dictionary values to expect
    For ArrayRow = 1 To UBound(IntegerArray, 2)                                                         ' Loop through columns of IntegerArray
        Dict.Add ArrayRow, 0                                                                            '   Initalize Key and Item
    Next                                                                                                ' Loop back
'
' Calculate random integer, save the random integer to LookupValueString, increment the counter for that random integer in the dictionary
    For RunCount = 1 To Iterations                                                                      ' Loop to get # of results determined by RunCountAddress value
        RandomInteger = Int((UBound(LookupArray, 1) - LBound(LookupArray, 1) _
                + 1) * Rnd + 1)                                                                         '   Calculate a random integer between 1 & MaxArrayRowSize
'
        LookupValueString = LookupValueString & LookupArray(RandomInteger, 1) & " "                     '   Save the lookup value to LookupValueString
'
        Dict.Item(LookupArray(RandomInteger, 1)) = _
                Dict.Item(LookupArray(RandomInteger, 1)) + 1                                            '   Increment the count of that Lookup value in the dictionary
    Next                                                                                                ' Loop back
'
' Save the tally counts to TallyString, Calculate the percentage those tally counts represent & save into TallyPercentArray
    For ArrayRow = 0 To Dict.Count - 1                                                                  ' Loop through the Tally values in the dictionary
        TallyString = TallyString & Dict.Items()(ArrayRow) & " "                                        '   Save the counter for the Lookup value to TallyString
        TallyPercentArray(1, ArrayRow + 1) = Format(Dict.Items()(ArrayRow) / Iterations, "0.0%")        '   Save tally percentage into TallyPercentArray
    Next                                                                                                ' Loop back
'
' Display the LookupValueString, TallyString, & TallyPercentArray to the sheet
    Range(LookupValueResultsColumn & Range(LookupValueResultsColumn & _
            Rows.Count).End(xlUp).Row + 1) = Trim(LookupValueString)                                    ' Display the LookupValueString to the sheet
    Range(TallyStringColumn & Range(TallyStringColumn & _
            Rows.Count).End(xlUp).Row + 1) = Trim(TallyString)                                          ' Display the TallyString to the sheet
    Range(ActualPercentageColumn & Range(ActualPercentageColumn & _
            Rows.Count).End(xlUp).Row + 1).Resize(UBound(TallyPercentArray, 1), _
            UBound(TallyPercentArray, 2)) = TallyPercentArray                                           ' Display the TallyPercentArray to the sheet
'
' Autofit the width of the columns on the sheet
    ActiveSheet.UsedRange.EntireColumn.AutoFit                                                          ' Autofit the width of the used columns
End Sub

You asked to explain how it works, if after reading the commented code, you still have questions, please ask.
 
Upvote 0
Ok, I finally got a chance to get back at this. I made some changes to attempt to handle the further information you have provided. The code will calculate much of the info for you.

You asked to explain how it works, if after reading the commented code, you still have questions, please ask.
Wow! It looks like you spent a lot of time on that. Incredible.

I looked it over quickly, but I will have to spend time to really understand it. I won't be able to get to it right away. My implementation is just about ready to go into "production" (sort of). I have to finish that first and then deal with whatever problems arise as it is used. I'll try to take a look as soon as I can.

Cheers
 
Upvote 0
Ok. Like I said previously, it should do the same thing you are attempting to do without all of those extras you are trying to incorporate.

Any questions, ask away.
 
Upvote 0
Johnny Boy,

I already learned something from reading your code that could make my code better. If the weights are not integers, your application of a multiplier plus rounding to convert them all to integers will eliminate any comparison problems related to floating point numbers that differ only in the last couple of digits. The user can choose the multiplier that will preserve the appropriate number of significant digits. And it's a fairly simple addition or it could be an option.

Your algorithm may require integers. I didn't get that far into it. Mine doesn't require integers, but it may benefit from them. I think I remember from my long ago comp sci classes that fixed point arithmetic is faster that floating point.

Maybe I'll add it as an option and then do some timings both ways with the same data.

Thanks 👍👏🙂
 
Upvote 0
I just completed my time testings for 20 weights & 100,000 iterations. Average time to complete was 1.2296875 seconds.

JenniferMurphyWeightedDrawings.xlsm
ABCDEFGHIJKLMNOPQRSTUVWX
1
2
3Number of results100000
4
51234567891011121314151617181920
6Weights0.11150.08880.06510.05670.08270.00530.00380.06330.05750.04700.07090.01880.03430.01770.06480.03320.07860.03310.02570.0412
7Multiplier10000
8Weights as Integers11158886515678275338633575470709188343177648332786331257412
9% Expected11.2%8.9%6.5%5.7%8.3%0.5%0.4%6.3%5.8%4.7%7.1%1.9%3.4%1.8%6.5%3.3%7.9%3.3%2.6%4.1%
10
11StringsTallysActual %s Received
1211060 8783 6507 5752 8293 558 357 6400 5751 4615 7238 1916 3480 1736 6423 3313 7818 3339 2483 417811.10%8.80%6.50%5.80%8.30%0.60%0.40%6.40%5.80%4.60%7.20%1.90%3.50%1.70%6.40%3.30%7.80%3.30%2.50%4.20%
13
Sheet2


You may notice the actual percentages received are pretty spot on to the expected percentages. :)
 
Upvote 0
I just completed my time testings for 20 weights & 100,000 iterations. Average time to complete was 1.2296875 seconds.
Was that 100,000 selections or 100,000 iterations of 40 selections?

You may notice the actual percentages received are pretty spot on to the expected percentages. :)
If that was for 100,000 selections, they d*mn well better be spot on! 😉
 
Upvote 0
I changed the 40 to 100,000 to yield 100,000 results.
 
Upvote 0

Forum statistics

Threads
1,215,241
Messages
6,123,823
Members
449,127
Latest member
Cyko

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top