How to generate weighted random numbers?

JenniferMurphy

Well-known Member
Joined
Jul 23, 2011
Messages
2,532
Office Version
  1. 365
Platform
  1. Windows
I know a couple of ways to generate weighted random numbers.

If the weights are all integers and there are not too many of them, I can generate a list and randomly select an element from that list. For example, if the weights are
2 4 5 7 9
then the list would be 2 1s, 4 2s, 5 3s, 7 4s, and 9 5s:
1 1 2 2 2 2 3 3 3 3 3 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5
Then I just generate a random integer on [1,27]. If that number is 17, for example, the random value would be 4.

If the weights are not integers, I can generate a cumulative list. For example, if the weights are
0.9 0.7 0.5 0.2 0.1
The cumulative list would be
0.9 1.6 2.1 2.3 2.4
I can generate a random number on [0, 2.4] and then loop through the list until I find the upper limit. For example, if the random number is 1.937562, that would fall between 1.6 and 2.1, so the random number would be 2.

Both of these approaches have limitations. It seems to me that some time ago I read about another method that uses a 2-dimensional array that removed most of these limitations. Can anyone provide a link to that method? Or is there another method that will work with large numbers of non-integer weights?

Thanks
 
Let me help you out with that answer ...


If the weights are not integers, I can generate a cumulative list. For example, if the weights are
0.9 0.7 0.5 0.2 0.1
The cumulative list would be
0.9 1.6 2.1 2.3 2.4
I can generate a random number on [0, 2.4] and then loop through the list until I find the upper limit. For example, if the random number is 1.937562, that would fall between 1.6 and 2.1, so the random number would be 2.


1.937562*10 = 19.37562, that would fall between 16 and 21, so the random number would be 2.
 
Last edited:
Upvote 0

Excel Facts

How can you turn a range sideways?
Copy the range. Select a blank cell. Right-click, Paste Special, then choose Transpose.
Hi,

I'm sorry that I'm still bothering with the same thing, but I checked out
your link "The Alias Method for Fast Weighted Random" and as far as I understand, the functions I posted
do "same" what the site presents.

Since your own solution is already being completed, I understand that there is no longer a direct need for this.
However, if you are interested testing the function:
I changed the structure of the function enough that it can be tested in the new workbook without any other preparations than adding VBA to the module.

Usage:
1. Create a new workbook
2. add new VBA module
3. Paste the VBA code into a new workbook
4. execute Use_Fu_Fill_RNG

' Example data set is just for fast testing and it's generate 100 weighted random number.
1 with weight 2/17
2 with weight 5/17
3 with weight 3/17
4 with weight 1/17
5 with weight 6/17


FillRows = how many draw (Example is 100, but at least 1 million rows are completed reasonably quickly.)
SourceRNG = where from found data (Not needed if only testing)
DestinationRNG = Where to write results (Not needed if only testing)

VBA Code:
Option Explicit
Public DataARR As Variant

Function Fu_SortARR(ByVal DataVAR As Variant, Optional CoINT As Integer) As Variant
Dim RowsARR As Variant, ReturnARR As Variant
Dim i As Long, j As Long, DataLNG As Long

    DataARR = DataVAR
    DataLNG = UBound(DataARR, 1)

ReDim RowsARR(1 To DataLNG)
For i = DataLNG To 1 Step -1
    RowsARR(i) = i
Next i

ReDim ReturnARR(1 To DataLNG, 1 To UBound(DataARR, 2))
For i = 1 To DataLNG
    For j = 1 To UBound(ReturnARR, 2)
        ReturnARR(i, j) = DataARR(RowsARR(i), j)
    Next j
Next i

Fu_SortARR = ReturnARR

End Function

Function Fu_Fill_RNG(SrcRNG As Range, FillRows As Long, Optional DstRNG As Range, Optional NoRepetion As Boolean = False, Optional There_is_Headers As Boolean = False) As Variant
On Error GoTo ErrHand: Application.Calculation = xlManual: Application.ScreenUpdating = False: Application.DisplayAlerts = False: Dim coT As Single: coT = Timer() ' Timer start
'Dim wb As Workbook, wsR As Worksheet, wsW As Worksheet
Dim i As Long, j As Long, iR As Long, iRt As Long, SrcARRSum As Long, SeaSNG As Single, TrgSNG As Single, Fill_Values As Long
Dim SrcARR As Variant, DstARR() As Variant, FillARR As Variant
'Dim TrueCounter As Long ' Only for Debuging
If There_is_Headers Then Set SrcRNG = SrcRNG.Offset(1, 0).Resize(SrcRNG.Rows.Count - 1, SrcRNG.Columns.Count)

If NoRepetion And FillRows > Application.Sum(SrcRNG.Columns(2)) Then FillRows = Application.Sum(SrcRNG.Columns(2))
Fill_Values = SrcRNG.Rows.Count

ReDim FillARR(1 To FillRows, 1 To 1) ' Fit FillARR for return values
'Dim TestARR As Variant: ReDim TestARR(1 To FillRows, 1 To 1) ' Fit TestARR for randomed values, only for test/debuging

Randomize
If SrcRNG.Columns.Count = 1 Then ' No Probability calculated, ALL VALUES HAVE EQUAL PROBABILITY!
    SrcARR = SrcRNG.value
    For i = LBound(FillARR) To UBound(FillARR)
        TrgSNG = Application.RoundDown(Rnd() * Fill_Values + 1, 0)
        FillARR(i, 1) = SrcARR(TrgSNG, 1)
        ' TestARR(i, 1) = TrgSNG ' Only for Debuging
    Next i

ElseIf SrcRNG.Columns.Count = 2 Then ' Probability numbers found from column(2) calculating probabilities, ALL VALUES HAVE OWN PROBABILITY!
        ReDim DstARR(1 To SrcRNG.Rows.Count, 1 To 4) ' Values and frequensy
        SrcRNG.Sort Key1:=SrcRNG.Cells(1, 2), Order1:=xlAscending ' SORTING SOURCE RANGE!
        SrcARR = SrcRNG.value ' Source Range to Array
        
        ' Summing all occurrences from Columns(2) for counting probabilities.
        With Application.WorksheetFunction
            SrcARRSum = .Sum(.Index(SrcARR, 0, 2))
        End With

    '***** Filling DstARR with SrcARR values AND adding probabilities.
        For i = LBound(SrcARR) To UBound(SrcARR)
            For j = LBound(SrcARR, 2) To UBound(SrcARR, 2)
                DstARR(i, j) = SrcARR(i, j)
                DstARR(i, j + 1) = Round(CSng(DstARR(i, 2) / SrcARRSum), 5)
                DstARR(i, j + 2) = Round(CSng(Application.Sum(Application.Index(DstARR, 0, 3))), 4)
            Next j
        Next i
        
    If Not NoRepetion Then
    '***** RANDOMING FILL VALUES! ***********************************************************************
        For i = LBound(FillARR) To UBound(FillARR)
            TrgSNG = Rnd()
            ' TestARR(i, 1) = TrgSNG ' Only for Debuging
            For j = LBound(DstARR) To UBound(DstARR)
                SeaSNG = DstARR(j, 4)
                If SeaSNG > TrgSNG Then
                    FillARR(i, 1) = DstARR(j, 1)
                    Exit For
                End If
            Next j
        Next i
    Else
    
        For i = LBound(FillARR) To UBound(FillARR)
            TrgSNG = Rnd()
            'TestARR(i, 1) = TrgSNG ' Only for Debuging
            
            For j = LBound(DstARR) To UBound(DstARR)
                SeaSNG = DstARR(j, 4)
                    If DstARR(j, 4) > TrgSNG And 0 < DstARR(j, 2) Then
                        'TrueCounter = TrueCounter + 1 ' Only for Debuging
                        FillARR(i, 1) = DstARR(j, 1)
                        DstARR(j, 2) = DstARR(j, 2) - 1
                        SrcARRSum = Application.Sum(Application.Index(DstARR, 0, 2))
                        
                            For iR = LBound(DstARR) To UBound(DstARR)
                                If DstARR(iR, 2) = 0 Then
                                    DstARR(iR, 3) = 0
                                Else
                                    DstARR(iR, 3) = Round(CSng(DstARR(iR, 2) / SrcARRSum), 5)
                                End If
                            Next iR
                            
                            DstARR = Fu_SortARR(DstARR, 3)
                            
                            For iR = LBound(DstARR) To UBound(DstARR)
                                DstARR(iR, 4) = 0
                                For iRt = LBound(DstARR) To iR
                                    DstARR(iR, 4) = DstARR(iR, 4) + DstARR(iRt, 3)
                                Next iRt
                                DstARR(iR, 4) = Round(DstARR(iR, 4), 4)
                            Next iR
                            
                        Exit For
                    End If
            Next j
        Next i
    End If
    
Else
    End
End If

'***** Selecting if return value are written directly to range or given as Function Return value ***********
If Not DstRNG Is Nothing Then
    DstRNG.Resize(FillRows, 1).value = FillARR
    Fu_Fill_RNG = True
Else
    Fu_Fill_RNG = FillARR
End If

' Test and Debug value
'Taul3.Range("L2").Resize(UBound(TestARR) - 1, 1).value = TestARR

ErrHand:
Application.Calculation = xlAutomatic: Application.ScreenUpdating = True: Application.DisplayAlerts = True
If Err.Number <> 0 Then MsgBox "Something went badly wrong!" & vbCrLf & "Function Fu_Fill_RNG FAILED!!!" & vbCrLf & "VBA-code is ended!" & vbCrLf & vbCrLf & "Error number: " & Err.Number & " " & Err.Description: End
End Function

Sub Use_Fu_Fill_RNG()
Dim x As Variant, x2 As Variant, FillRows As Long, SourceRNG As Range, DestinationRNG As Range
Dim coT As Single: coT = Timer() ' Timer start
Dim TEST As Variant
'********** INSTRUCTIONS *************************************************************************************
'*  If DestinationRNG is not defined, function "Fu_Fill_RNG" return ARRAY of values to x !
'*  If DestinationRNG is defined, function "Fu_Fill_RNG" write values DIRECTLY to DestinationRNG.
'*        Return value to x is then Type: Boolean True/False (True=No Errors, False = Errors function FAILED!
'* If SourceRNG is 1 Column then all VALUES ARE SELECTED RANDOMLY with EQUAL POSIBILITIES _
'*      if SourceRNG is 2 Column, Second Column must contain NUMBER OF POSIBILITIES (how many of each value there are) _
'*      Function count posibilities by NUMBER OF POSIBILITIES (The dice have 6 "Value" and each have 1 "NUMBER OF POSSIBILITIES")
'*  Dice "Col A" "Col B"
'*          1       1
'*          2       1
'*          3       1
'*          4       1
'*          5       1
'*          6       1
'**************************************************************************************************************

' Example data set start
Dim tmpARR(1 To 5, 1 To 2) As Long
tmpARR(1, 1) = 1: tmpARR(1, 2) = 2 ' 2/17
tmpARR(2, 1) = 2: tmpARR(2, 2) = 5 ' 5/17
tmpARR(3, 1) = 3: tmpARR(3, 2) = 3 ' 3/17
tmpARR(4, 1) = 4: tmpARR(4, 2) = 1 ' 1/17
tmpARR(5, 1) = 5: tmpARR(5, 2) = 6 ' 6/17
If Worksheets(1).Range("A2") = "" Then
    Worksheets(1).Range("A2:B6").Value2 = tmpARR
End If
' Example data set end

'********** PARAMETERS FOR "Fu_Fill_RNG" FILL WANTED VALUES HERE! *********************************************
FillRows = 100                                          ' How many Cells(rows) to Fill? (How many draws)
Set SourceRNG = Worksheets(1).Range("A2:B6")            ' Where Found Values to Draw? (Col A= random numbers, Col B = weights (How many times each number is there?) (count of each number/count of all numbers = weights like 27/147)
Set DestinationRNG = Worksheets(1).Range("F1")          ' What Column will be Filled? Only First Cell address
'********** PARAMETERS END! ***********************************************************************************


x = Fu_Fill_RNG(SourceRNG, FillRows, DestinationRNG, False)
Debug.Print Format(Timer() - coT, "#0.00", 2)
End Sub
 
Upvote 0
Hi,

I'm sorry that I'm still bothering with the same thing, but I checked out
your link "The Alias Method for Fast Weighted Random" and as far as I understand, the functions I posted
do "same" what the site presents.

You are not bothering me.

My implementation is now 99% operational. I just have to fix a couple of details to make it work with both integer and floating point weights.

I will post my code and some test results when it's ready.

Now that I have working code, I don't think I will have time to go through your version to understand it. Thanks, anyway.
 
Upvote 0
My apologies to listeners of this thread for previous uncalled for posts I made in this thread. Admin has pointed them out to me & I totally agree.
 
Upvote 0
Jennifer,

There is a method called the Alias Method that works with non-integer weights. The Alias Method works by dividing the weights into two categories; the "good" weights and the "bad" weights. The "good" weights are those that are over the average weight, and the "bad" weights are those that are under the average weight. The Alias Method then creates a two-dimensional array. The first row of the array contains the "good" weights and the second row contains the "bad" weights. The array is then filled such that each row adds up to the same total weight. To generate a weighted random number using the Alias Method, you pick a random row and then randomly select a number from that row. This ensures that each weight is equally likely to be selected. This method works well with large numbers of non-integer weights, as it requires only two rows of weights and can be used to generate a random number in constant time.

Let’s say you have the following weights: 0.9 0.7 0.5 0.2 0.1 First, you need to calculate the average weight, which is 0.5 in this case. Then, you create a two-dimensional array with two rows. The first row contains the "good" weights, which are those that are above the average, and the second row contains the "bad" weights, which are those that are below the average. Good Weights: 0.9 0.7 Bad Weights: 0.5 0.2 0.1 Then, you fill each row such that the sum of the weights in the row equals the total weight. In this case, the total weight is 2.5, so the first row would be filled with 0.9 and 0.7, and the second row would be filled with 0.5, 0.2, and 0.1. Now, to generate a weighted random number, you randomly select a row and then randomly select an element from that row. For example, if you randomly select the second row, you would randomly select either 0.5, 0.2, or 0.

I hope this is helpful!
Hi,

I'm marking this as the solution. I'm still working on the exact implementation, but it is working beautifully.

I didn't quite understand your implementation, so I used the one on Peter Stefek's website.

In my implementation, there are two steps: a setup step and a selection step. The setup step creates the three tables each the same size as the number of weights (N): Low, High, & Threshold. Once created, any number of random elements can be selected without another setup step.

The selection step is super fast. It's just 2 random numbers. The first one is a random integer on [1,N] to select an element in the Threshold table. The second is a random real number on [0,1) to select the Low or High element.

I will post my code when it is fully operational.

Thank you.
 
Upvote 0
@JenniferMurphy, the following is what I have come up with based on the minimal display you have offered previously in this thread.

JenniferMurphyWeightedDrawings.xlsm
ABCDEFGHIJK
1
2
340Number of results
4
512345
6Weights0.10.30.50.70.92.5
7Multiplier10
8Weights as Integers1357925
9% Expected4.0%12.0%20.0%28.0%36.0%100.0%
10StringsTally
11Actual % Received3 5 5 2 2 5 5 5 2 5 4 5 1 3 5 4 4 3 5 5 2 4 2 5 5 5 5 1 5 4 2 5 4 1 4 5 4 4 4 53 6 3 10 18
12Actual % Received3 1 5 5 5 3 2 4 1 4 4 5 2 4 5 4 3 3 1 4 5 5 5 5 2 4 1 5 4 4 4 4 4 5 5 4 5 5 4 34 3 5 14 14
13Actual % Received3 5 4 4 5 5 2 4 5 3 5 2 5 4 3 4 4 5 4 5 2 5 4 5 2 5 4 4 4 5 5 5 5 5 5 5 3 2 5 40 5 4 12 19
14Actual % Received3 5 2 5 5 4 5 3 4 4 1 3 4 5 3 3 2 4 4 4 5 5 3 5 2 3 2 4 4 4 4 4 3 2 4 5 4 3 5 31 5 10 14 10
15Actual % Received5 4 1 4 5 5 2 5 5 2 5 3 5 4 2 4 4 4 5 5 5 5 5 1 5 5 4 5 5 5 3 5 5 3 2 4 5 5 4 52 4 3 9 22
16Actual % Received4 2 2 5 5 5 5 4 4 4 5 5 3 4 5 5 4 5 3 5 4 3 1 5 5 3 5 5 3 4 5 5 5 5 5 5 5 3 4 51 2 6 9 22
17Actual % Received5 4 2 3 4 5 5 3 3 5 3 2 5 4 4 5 5 5 3 3 5 1 2 2 2 4 5 3 3 5 5 5 5 5 2 2 4 2 5 51 8 8 6 17
18Actual % Received5 5 1 4 2 3 4 2 5 3 4 5 4 5 5 5 5 5 3 4 4 5 3 3 2 5 5 2 4 4 5 4 5 4 4 5 2 5 5 12 5 5 11 17
19Actual % Received4 3 2 4 1 5 2 5 3 3 5 2 5 2 5 5 2 4 4 5 3 5 5 4 4 4 5 3 4 4 3 3 5 3 4 5 5 5 4 51 5 8 11 15
20Actual % Received3 2 5 4 3 3 5 2 3 5 5 4 5 4 3 4 2 5 3 5 4 2 5 5 5 5 5 4 5 1 4 4 3 5 2 5 4 5 1 22 6 7 9 16
21
Sheet2
Cell Formulas
RangeFormula
H6,H8:H9H6=SUM(C6:G6)
C8C8=C6*C7
D8D8=D6*C7
E8E8=E6*C7
F8F8=F6*C7
G8G8=G6*C7
C9C9=C6/SUM(C6:G6)
D9D9=D6/SUM(C6:G6)
E9E9=E6/SUM(C6:G6)
F9F9=F6/SUM(C6:G6)
G9G9=G6/SUM(C6:G6)


If you, or anyone else, are interested in pursuing that approach, let me know. I will gladly post the macro code that goes with it.
 
Upvote 0
@JenniferMurphy, the following is what I have come up with based on the minimal display you have offered previously in this thread.

If you, or anyone else, are interested in pursuing that approach, let me know. I will gladly post the macro code that goes with it.
As I have said a few times, I have working code. That's why I marked this thread solved. I am just looking at a few tweaks to make it simpler and more general.

I am not sure why you are converting the values to integers. That might work with these sample values, but the actual data could involve upwards of 100 values and they could be long decimal values, such as 77/83 = 0.927710843...

But if you want to post your code, I'll take a look.
 
Upvote 0
It just seems weird that you would offer up some data in post #1, but now you say that data should not be used to help you? Sorry, I am confused.
 
Upvote 0
I am all ears for a sample data, I have said that from the beginning.
OK. Here's some sample data. It is similar to what my code will have to deal with. I've used letters (A-T) to identify the values. This is only 20 values. The actual data could be several times larger.

The objective is to generate a list of 40-50 instances of these "letters" such that they occur in proportion to the weights. For example, A (weight = 0.0755) should occur about 4.2264 times as often as B (weight = 0.0179). And all letters should occur roughly in proportion to their probability as shows in row 10.

Sorting.xlsx
BCDEFGHIJKLMNOPQRSTUV
8ABCDEFGHIJKLMNOPQRSTSum
90.07550.01790.22220.01540.15380.15380.02220.01540.20000.01110.02220.02940.04940.00720.01180.01530.00830.00600.03900.00521.0811
106.981%1.652%20.555%1.423%14.230%14.230%2.055%1.423%18.499%1.028%2.055%2.721%4.568%0.665%1.088%1.412%0.771%0.557%3.604%0.482%100.00%
Sheet3
Cell Formulas
RangeFormula
V9:V10V9=SUM(B9:U9)
B10:U10B10=B9/$V$9
 
Upvote 0

Forum statistics

Threads
1,215,061
Messages
6,122,921
Members
449,094
Latest member
teemeren

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top