Optimise VBA to search for whole words

Deutz

Board Regular
Joined
Nov 30, 2009
Messages
191
Office Version
  1. 365
Platform
  1. Windows
Hi and thanks in advance,
I am using Excel 2010.

I have a data sheet with about 100,000 rows and 30 columns. There may be up to 200,000 rows in the future. I have written some VBA that searches 12 of those 30 columns for a whole word match with one of 350 separate words which I have listed on another sheet. If a match is found in one of those 12 columns then the whole row of data is copied to a results sheet and the code moves to the next row and repeats. I am using regex to replace non alpha numeric characters with a space and then searching for a whole word delimited by spaces. This seems to work ok but takes a very long time to run. For 100,000 rows it is taking around 16 minutes. Wondering if there is a much quicker way to do the same thing?

Code:
Sub TestIt()
    Dim arrData As Variant
    Dim arrWords As Variant
    
    arrData = Sheets("Data").Range("A2:AD100000")
    arrWords = Sheets("WordList").Range("A2:A350")
    GetMatchingRows arrData, arrWords
End Sub

Sub GetMatchingRows(ByVal arrData As Variant, ByVal arrWords As Variant)
    Dim n As Long
    Dim i As Long
    Dim j As Long
    Dim x As Long
    Dim arr() As String
    Dim arrCols() As String
    Dim wsSource As Worksheet
    Dim wsDest As Worksheet
    Dim RegEx As Object
    'Dim StartTime As Double
    'Dim MinutesElapsed As String
    
    Set RegEx = CreateObject("VBScript.RegExp")
   
    'StartTime = Timer
    
    With RegEx
        .Pattern = "[^a-zA-Z0-9]"
        .Global = True
        .MultiLine = True
    End With
    
    Set wsSource = Worksheets("Data")
    Set wsDest = Worksheets("Result")
    
    arr = ("2,4,5,8,9,11,12,14,16,20")
    ' Assign column numbers to array
    arrCols = Split(arr, ",")
    x = 10
    ' Loop data array for each column number
    For i = LBound(arrData) To UBound(arrData) ' Row
        For j = LBound(arrCols) To UBound(arrCols) ' Col
            ' Replace all non alpha numeric characters in the array element with a space
            If RegEx.test(arrData(i, arrCols(j))) Then
                arrData(i, arrCols(j)) = RegEx.Replace(arrData(i, arrCols(j)), " ")
                arrData(i, arrCols(j)) = " " & arrData(i, arrCols(j)) & " "
            End If
            ' Loop words to match on array
            For n = LBound(arrWords) To UBound(arrWords)
                ' Test if whole word found in data array element (space & word & space)
                If InStr(1, UCase(arrData(i, arrCols(j))), " " & UCase(arrWords(n, 1)) & " ") > 0 Then
                    x = x + 1
                    ' If whole word found then add the whole data row to the destination sheet and exit the column
                    ' For loop and continue with the next row For loop
                    wsDest.Range("A" & x & ":AD" & x).Value = wsSource.Range("A" & i + 1 & ":" & "AD" & i + 1).Value
                    GoTo ExitColFor
                End If
            Next n
        Next j
ExitColFor:
    Next i
    wsDest.Range("A7").Value = x - 10
    MsgBox "Done"
    'MinutesElapsed = Format((Timer - StartTime) / 86400, "hh:mm:ss")
    'MsgBox "This code ran successfully in " & MinutesElapsed & " minutes", vbInformation
    Set RegEx = Nothing
End Sub

Thanks
Deutz
 

Excel Facts

Quick Sum
Select a range of cells. The total appears in bottom right of Excel screen. Right-click total to add Max, Min, Count, Average.
Re: How to optimise VBA to search for whole words

Hi Deutz

There are smarter people here than me, but your code to get the data and then apply the regex test seems pretty efficient (you're allocating a range to an array and then interrogating that array, rather than referencing the worksheet each time).

I would suggest if your second If statement is TRUE you write that result to an (expanding) array and then once the loop is finished, you then write the entire array to the worksheet, rather than writing the result to the worksheet every time. ie write the result to the worksheet once, rather than many (thousands of?) times.

[Also get the thing you need to write to this 'Result' array from the array you have created (arrData) and not wsSource (which is a range on the worksheet)]

Cheers

pvr928
 
Last edited:
Upvote 0
Re: How to optimise VBA to search for whole words

Hello,

well, just some ideas:

use the "match-function" in the array with the search word "*Search-Word*". Then there are 350 word * 12 Columns.

regards
 
Upvote 0
Hello,

here a short example:

Code:
Sub Test()
Dim WSF As WorksheetFunction: Set WSF = Application.WorksheetFunction
On Error Resume Next
Su = Range("H1").CurrentRegion ' Range of search-words
Li = Range("A1").CurrentRegion ' List to search
For c = 1 To 3
Li1 = Application.Transpose(Application.Index(Li, 0, c))
    For s = 1 To UBound(Su)
        r = WSF.Match("*" & Su(s, 1) & "*", Li1, 0) 'row found
        If Err.Number = 0 Then Debug.Print r 'better: write row number as key in dictionary, content to row as item
        Err.Clear
    Next s
Next c
End Sub

Feedback would be nice.

regards
 
Upvote 0
Re: How to optimise VBA to search for whole words

Hi pvr928 and Fennek,

Thanks for your suggestions. I'll try implementing them when I get a moment and see if they speed up the process.

Regards
Deutz
 
Upvote 0
Re: How to optimise VBA to search for whole words

Hello,

here is a test-code to simulate according to your question (as I understood), which takes 1 second runtime for 1 mio rows:

Code:
Sub iStart()
Tx = Array("dog", "cat")
Range("H1").Resize(UBound(Tx) + 1) = Application.Transpose(Tx)
For i = 2 To 1000000 Step 50
    Cells(i, 1).Resize(, 2) = Tx
Next i
End Sub
Sub Test()
'ca 1 second
Start = Timer
Dim WSF As WorksheetFunction: Set WSF = Application.WorksheetFunction
Su = Range("H1").CurrentRegion ' Range of search-words
Li = Range("A1:B" & Cells(Rows.Count, 1).End(xlUp).Row) '.CurrentRegion ' List to search
        On Error Resume Next
With CreateObject("scripting.dictionary")
For c = 1 To 3
Li1 = Application.Transpose(Application.Index(Li, 0, c))
    For s = 1 To UBound(Su)
        Do
        r = WSF.Match(Su(s, 1) & Chr(42), Li1, 0) 'row found
        If Err.Number <> 0 Then GoTo iErr
            .Item(s & "|" & r) = Li(r, 1) & "|" & Li(r, 2)
            Li1(r) = ""
            Li(r, 1) = ""
            Li(r, 2) = ""
        Loop
iErr:
        Err.Clear
    Next s
Next c
Cells(1, 4).Resize(.Count, 2) = Application.Transpose(Array(.keys, .items))
End With
Debug.Print Timer - Start
End Sub

regards
 
Upvote 0
Re: How to optimise VBA to search for whole words

Thanks again. Will get to this before too long. Just managing a bit of a crisis on the home front so sorry for the lack of feedback.

Regards
Deutz
 
Upvote 0

Forum statistics

Threads
1,214,919
Messages
6,122,260
Members
449,075
Latest member
staticfluids

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top