Selecting some rows, on value and randomly

TheRedCardinal

Board Regular
Joined
Jul 11, 2019
Messages
103
Hi all,

I have a table that consists of varying columns, and rows. Its exact format varies with each run, but it will have 2 columns at the end of the table, called "Adjusted Value" and Comments. The number of rows varies.

There are lines in the workbook beyond the bottom of the table that are not part of the data.

I tried to write a sub that does the following:

  1. Finds the 5 highest values in the Adjusted Value Tab
  2. Picks other lines randomly from the table until the % of values "selected" exceeds 20% of the total value in that column
  3. In the cell next to the value, enter the text "This row has been selected for checking"

I designed this code (only meant for finding the highest 5 values):

Code:
Sub RandomChecks()


Dim CheckText As String
Dim ValueRange As Range, ValueColumn As Range
Set WS1 = Sheets("2. Final Data")


With WS1
    
    Set ValueColumn = .Range("A1:Z1").Find("Adjusted Value")
    CheckText = "This line has been selected for checking"
    LRow = Range("A1").End(xlDown).Row
    
    Set ValueRange = WS1.Range(Cells(2, ValueColumn.Column), Cells(LRow, ValueColumn.Column))
    
    If LRow > 5 Then
        Counter = 5
    Else
        Counter = LRow
    End If
        
    For MyTemp = 1 To Counter
        
        For Each CellA In ValueRange
        
            If CellA.Value = Application.WorksheetFunction.Large(ValueRange, MyTemp) Then
            CellA.Offset(0, 1).Value = CheckText
            End If
                
        Next CellA
        
    Next MyTemp
    
    End With
    
End Sub
So as I'm here, it obviously isn't working quite right!

When I ran the macro, I got values 1 and 2 fine but it turns out there are about 11 different lines that have value 3 in them, and they were ALL populated with the text string.
Further, items 4 and 5 did not then get posted, because I suspect that the sheet classes the 4th highest discrete value as actually being the 14th highest value.

So my questions are:

  1. Is it possible to code this so that once a value is found, it is ignore in the next loop
  2. How do I get the random selection element of this to work, including an instruction that if it has been selected in the top 5 process, it should be excluded
  3. Any comments on how to code this better?

I was thinking of sorting the column by Adjusted Value and using top 5 rows but that gives me the same issue as (1) above.

Thanks in advance!
 

shknbk2

Board Regular
Joined
Mar 5, 2016
Messages
235
I was thinking of sorting the column by Adjusted Value and using top 5 rows
That gave me an idea to come up with the code below. I used great information for ArrayLists including the Arr2DToArrayList function from HERE.

I create the ArrayList from your ValueRange and then Sort and Reverse the sorting. The first For loop then removes duplicates, and the code then marks the adjacent cells with the CheckText.

The code doesn't complete your request, though. I wasn't sure exactly what you meant by:
Picks other lines randomly from the table until the % of values "selected" exceeds 20% of the total value in that column
What total value are you talking about? Does this mean that you want the top 5 items plus any random items so that the total selected > 20% of the total number of all items? of the remaining items after removing duplicates? of the sum of the numbers in the column? Please give some examples.

The code right now leaves 'coll' with all of the unique items, so it needs to be paired down based on your answer above. However, for now, you can check its current functionality to select the top 5 items and to mark them.

By the way, I assumed that if there are multiple cells having one of the top 5 items, that you only wanted one of them marked. If this is not the case, please also say so.

Code:
Sub RandomChecks()
    Dim CheckText As String, Counter As Integer
    Dim ValueRange As Range, ValueColumn As Range, findCell As Range
    Dim coll As Object, i As Long
    Set WS1 = Sheets("2. Final Data")

    With WS1
        Set ValueColumn = .Range("A1:Z1").Find("Adjusted Value")
        CheckText = "This line has been selected for checking"
        LRow = Range("A1").End(xlDown).Row
        Set ValueRange = WS1.Range(Cells(2, ValueColumn.Column), Cells(LRow, ValueColumn.Column))
        Set coll = Arr2DToArrayList(ValueRange.Value)
        
        coll.Sort
        coll.Reverse
        For i = coll.Count - 1 To 1 Step -1
            If coll(i) = coll(i - 1) Then
                coll.RemoveAt i
            End If
        Next i
        If coll.Count > 4 Then
            Counter = 5
        Else
            Counter = coll.Count
        End If
        For i = 0 To Counter - 1
            Set findCell = ValueRange.Find(coll(i))
            findCell.Offset(0, 1).Value = CheckText
        Next i
    End With
End Sub

Function Arr2DToArrayList(arr As Variant) As Object

    ' Check for 2D array
    If UBound(arr, 2) > 1 Then
        Err.Raise vbObjectError + 513, "RangeToArrayList" _
            , "The range/array can only have one column"
    End If
    
    ' Create the array list
    Dim coll As Object
    Set coll = CreateObject("System.Collections.ArrayList")
    
    ' Add items from array to ArrayList
    Dim i As Long
    For i = LBound(arr, 1) To UBound(arr, 1)
        coll.Add arr(i, 1)
    Next i
    
    ' Return new ArrayList
    Set Arr2DToArrayList = coll
    
End Function
 

TheRedCardinal

Board Regular
Joined
Jul 11, 2019
Messages
103
Ah, the scary array! That's a really interesting solution but I'll need to work through it so I can understand it and test it.

For the remainder of the query I'm on my phone so can't give data but yes you've got it pretty much summarised.

If the total value (Sum) of the Adjusted Value Column is X, then I need to check enough lines to cover 20% of X. If that 20% is covered by the top 5 value invoices then I need to pick 3 random lines (this is a new requirement based on my analysis of the data). If it doesn't cover that 20% I need to keep picking lines until the number selected comes to 20% or above, of at least 3 new lines.

On the multiple values thing, yes I'd like the highest 5 unique values.

Thanks for your help!
 

TheRedCardinal

Board Regular
Joined
Jul 11, 2019
Messages
103
So the selection of the top 5 works perfectly. Thank you for that.

I was wondering how, once it has been sorted, you would be able to find the original cells again. But from my understanding, this is fixed by removing duplicates and then using a Find search to locate the cell in the original range. Very clever!

So now the next steps would be:

- total up the value of lines "checked" - I think I could add this into the existing routine by Dimming CheckedValue as Long and then using
Code:
CheckedValue = CheckedValue + Findcell.Value
- if CheckedValue is more than 20% of the total value of the range, stop there

- if not, randomly select lines from the remaining rows, adding the same value to the (0,1) offset, and adding values to CheckedValue until it exceeds 20%.

I guess this would be a new function using the same Object list that was defined before?
 

shknbk2

Board Regular
Joined
Mar 5, 2016
Messages
235
Your suggestions were right along where I was thinking.

To randomize the remaining items in 'coll', i create 2 arrays: aRnd is filled with random numbers, and aItems is filled from 5 to the number of items in coll. Then, when numerically sorting aRnd together with making the same changes to aItems, aItems ends up randomized as well. From there, starting at the beginning of aItems and running through the array until 20% has been found, random selections are made.

At the end, I sort aSelectedValues in case it needs it, but I'm not sure if you will use those values for more processing or not. I don't know if you need to keep the used values in the code for further processing or if you will only rely on the indicated cells in the Worksheet. If you need them still in code, aSelectedValues contains all used values at the end of the code.

Code:
Sub RandomChecks()
    Dim CheckText As String, Counter As Integer
    Dim ValueRange As Range, ValueColumn As Range, findCell As Range
    Dim coll As Object, i As Long
    Dim totalSum As Double, runningSum As Double, d As Double
    Dim aRnd() As Double, aItems() As Integer, aSelectedValues() As Double
    Dim sorted As Boolean
    
    Set WS1 = Sheets("2. Final Data")

    With WS1
        Set ValueColumn = .Range("A1:Z1").Find("Adjusted Value")
        CheckText = "This line has been selected for checking"
        LRow = Range("A1").End(xlDown).Row
        Set ValueRange = WS1.Range(Cells(2, ValueColumn.Column), Cells(LRow, ValueColumn.Column))
        ValueRange.Offset(0, 1).ClearContents
        Set coll = Arr2DToArrayList(ValueRange.Value)
        totalSum = Application.WorksheetFunction.Sum(ValueRange.Value)
        
        coll.Sort
        coll.Reverse
        For i = coll.Count - 1 To 1 Step -1
            If coll(i) = coll(i - 1) Then
                coll.RemoveAt i
            End If
        Next i
        If coll.Count > 4 Then
            Counter = 5
        Else
            Counter = coll.Count
        End If
        ReDim aSelectedValues(Counter - 1)
        For i = 0 To Counter - 1
            Set findCell = ValueRange.Find(coll(i))
            findCell.Offset(0, 1).Value = CheckText
            runningSum = runningSum + coll(i)
            aSelectedValues(i) = coll(i)
        Next i
        
        'If the running sum is less than 20% of the sum of all the values, start adding random
        If runningSum < totalSum * 0.2 And Counter = 5 Then
            'aRnd is filled with random numbers and sorted
            'aItems is filled starting with 5
            ReDim aRnd(coll.Count - Counter - 1)
            ReDim aItems(coll.Count - Counter - 1)
            Randomize
            For i = 0 To UBound(aRnd)
                aRnd(i) = Rnd()
                aItems(i) = i + Counter
            Next i
            
            'both are sorted at the same time so that when aRnd is sorted
            'numerically, aItems is also sorted in the same way
            sorted = False
            Do While Not sorted
                sorted = True
                For i = 0 To UBound(aRnd) - 1
                    If aRnd(i) < aRnd(i + 1) Then
                        d = aRnd(i)
                        aRnd(i) = aRnd(i + 1)
                        aRnd(i + 1) = d
                        d = aItems(i)
                        aItems(i) = aItems(i + 1)
                        aItems(i + 1) = d
                        sorted = False
                    End If
                Next i
            Loop
            
            'add the randomized aItem number from coll until runningSum is 20% of the total
            i = -1
            Do While runningSum < totalSum * 0.2 And i <= UBound(aItems)
                i = i + 1
                runningSum = runningSum + coll(aItems(i))
                Set findCell = ValueRange.Find(coll(aItems(i)), , , xlWhole)
                findCell.Offset(0, 1).Value = CheckText
                ReDim Preserve aSelectedValues(Counter + i)
                aSelectedValues(Counter + i) = coll(aItems(i))
            Loop
        End If
    End With
    
    're-sort aSelectedValues
    sorted = False
    Do While Not sorted
        sorted = True
        For i = 0 To UBound(aSelectedValues) - 1
            If aSelectedValues(i) < aSelectedValues(i + 1) Then
                d = aSelectedValues(i)
                aSelectedValues(i) = aSelectedValues(i + 1)
                aSelectedValues(i + 1) = d
                sorted = False
            End If
        Next i
    Loop
End Sub

Function Arr2DToArrayList(arr As Variant) As Object
    ' Check for 2D array
    If UBound(arr, 2) > 1 Then
        Err.Raise vbObjectError + 513, "RangeToArrayList" _
            , "The range/array can only have one column"
    End If
    
    ' Create the array list
    Dim coll As Object
    Set coll = CreateObject("System.Collections.ArrayList")
    
    ' Add items from array to ArrayList
    Dim i As Long
    For i = LBound(arr, 1) To UBound(arr, 1)
        coll.Add arr(i, 1)
    Next i
    
    ' Return new ArrayList
    Set Arr2DToArrayList = coll
    
End Function
 

Forum statistics

Threads
1,086,065
Messages
5,387,585
Members
402,070
Latest member
hyperf0

Some videos you may like

This Week's Hot Topics

Top