Need to tweak working algorithm to run faster

drmingle

Board Regular
Joined
Oct 5, 2009
Messages
229
Looking for a faster performance...of Damon Ostrander's code.

str1 is limited to 24 characters max since the algorithm takes a bit of time (it's exhaustive in it's comparison against ordered combinations).

I need help with augmenting the code to reflect if a match is found that it does not need to carry out the process of checking other characters on the character of str1, it can instead move on to the next character).Thanks for your help.
Code:
Dim TopMatch         As Integer
Dim strCompare       As String
Function Fuzzy(strIn1 As String, strIn2 As String) As Single

   Dim L1               As Integer
   Dim In1Mask(1 To 24) As Long     'strIn1 is 24 characters max
   Dim iCh              As Integer
   Dim N                As Long
   Dim strTry           As String
   Dim strTest          As String
   
   TopMatch = 0
   L1 = Len(strIn1)
   strTest = UCase(strIn1)
   strCompare = UCase(strIn2)
   
   For iCh = 1 To L1
      In1Mask(iCh) = 2 ^ iCh
   Next iCh
   
   'Loop thru all ordered combinations of characters in strIn1
   For N = 2 ^ (L1 + 1) - 1 To 1 Step -1
      strTry = ""
      For iCh = 1 To L1
         If In1Mask(iCh) And N Then
            strTry = strTry & Mid(strTest, iCh, 1)
         End If
      Next iCh
      If Len(strTry) > TopMatch Then TestString strTry
   Next N
   
   Fuzzy = TopMatch / CSng(L1)
   
End Function

Sub TestString(strIn As String)

   Dim L          As Integer
   Dim strTry   As String
   Dim iCh        As Integer
   
   L = Len(strIn)
   If L <= TopMatch Then Exit Sub
   
   strTry = "*"
   
   For iCh = 1 To L
      strTry = strTry & Mid(strIn, iCh, 1) & "*"
   Next iCh
   
   If strCompare Like strTry Then
      If L > TopMatch Then TopMatch = L
   End If
   
End Sub
 

Excel Facts

Get help while writing formula
Click the italics "fx" icon to the left of the formula bar to open the Functions Arguments dialog. Help is displayed for each argument.
I don't know so much about the algorithm,
but I can offer some advices to run faster from the point of view of Visual Basic's technical knowledge.

1) Avoid string-concatenation with '&'.
'&' causes memory reallocation which takes so much time.
Instead, calculate the total length of the concatenated string, and
allocate the string with calculated length and replace characters
one by one.
2) Use String-Functions with '$'.
Mid returns variant variable.
This causes internal variant-string conversion.
Mid$ returns string variable.

I tried to modify the function.
Here's my attempt and it's result.
Code:
Private Declare Function GetTickCount Lib "kernel32" () As Long
Dim TopMatch         As Integer
Dim strCompare       As String
Sub Main()
 
    Dim s1 As String
    Dim s2 As String
    Dim L1 As Long
 
    s1 = String(20, "a")
    s2 = String(10, "a")
 
    Debug.Print "The original function returns ";
    L1 = GetTickCount
    Debug.Print Fuzzy(s1, s2);
    L1 = GetTickCount - L1
    Debug.Print " in " & L1 & " milliseconds."
 
    Debug.Print "The modified function returns ";
    L1 = GetTickCount
    Debug.Print NewFuzzy(s1, s2);
    L1 = GetTickCount - L1
    Debug.Print " in " & L1 & " milliseconds."
 
 
End Sub
Function NewFuzzy(strIn1 As String, strIn2 As String) As Single
   Dim L1               As Integer
   Dim In1Mask(1 To 24) As Long     'strIn1 is 24 characters max
   Dim iCh              As Integer
   Dim N                As Long
   Dim strTry           As String
   Dim strTest          As String
   Dim lRealLen_strTry As Long
 
   TopMatch = 0
   L1 = Len(strIn1)
   strTest = UCase$(strIn1)
   strCompare = UCase$(strIn2)
 
   For iCh = 1 To L1
      In1Mask(iCh) = 2 ^ iCh
   Next iCh
 
   'Loop thru all ordered combinations of characters in strIn1
   For N = 2 ^ (L1 + 1) - 1 To 1 Step -1
      lRealLen_strTry = 0
      For iCh = 1 To L1
         If In1Mask(iCh) And N Then
            lRealLen_strTry = lRealLen_strTry + 1
         End If
      Next iCh
      If lRealLen_strTry > TopMatch Then
        strTry = Space$(lRealLen_strTry)
        lRealLen_strTry = 0
        For iCh = 1 To L1
           If In1Mask(iCh) And N Then
              'strTry = strTry & Mid(strTest, iCh, 1)
              lRealLen_strTry = lRealLen_strTry + 1
              Mid$(strTry, lRealLen_strTry, 1) = Mid$(strTest, iCh, 1)
           End If
        Next iCh
      End If
      'If Len(strTry) > TopMatch Then TestString strTry
      If lRealLen_strTry > TopMatch Then NewTestString strTry
   Next N
 
   NewFuzzy = TopMatch / CSng(L1)
 
End Function
Sub NewTestString(strIn As String)
   Dim L          As Integer
   Dim strTry   As String
   Dim iCh        As Integer
 
   L = Len(strIn)
   If L <= TopMatch Then Exit Sub
 
   'strTry = "*"
   strTry = "*" & String$(L * 2, "*")
   For iCh = 1 To L
      'strTry = strTry & Mid(strIn, iCh, 1) & "*"
      Mid$(strTry, iCh * 2, 1) = Mid$(strIn, iCh, 1)
   Next iCh
 
   If strCompare Like strTry Then
      If L > TopMatch Then TopMatch = L
   End If
 
End Sub
Function Fuzzy(strIn1 As String, strIn2 As String) As Single
   Dim L1               As Integer
   Dim In1Mask(1 To 24) As Long     'strIn1 is 24 characters max
   Dim iCh              As Integer
   Dim N                As Long
   Dim strTry           As String
   Dim strTest          As String
 
   TopMatch = 0
   L1 = Len(strIn1)
   strTest = UCase(strIn1)
   strCompare = UCase(strIn2)
 
   For iCh = 1 To L1
      In1Mask(iCh) = 2 ^ iCh
   Next iCh
 
   'Loop thru all ordered combinations of characters in strIn1
   For N = 2 ^ (L1 + 1) - 1 To 1 Step -1
      strTry = ""
      For iCh = 1 To L1
         If In1Mask(iCh) And N Then
            strTry = strTry & Mid(strTest, iCh, 1)
         End If
      Next iCh
      If Len(strTry) > TopMatch Then TestString strTry
   Next N
 
   Fuzzy = TopMatch / CSng(L1)
 
End Function
 
Sub TestString(strIn As String)
   Dim L          As Integer
   Dim strTry   As String
   Dim iCh        As Integer
 
   L = Len(strIn)
   If L <= TopMatch Then Exit Sub
 
   strTry = "*"
 
   For iCh = 1 To L
      strTry = strTry & Mid(strIn, iCh, 1) & "*"
   Next iCh
 
   If strCompare Like strTry Then
      If L > TopMatch Then TopMatch = L
   End If
 
End Sub
In immediate window:
The original function returns 0.5 in 34718 milliseconds.
The modified function returns 0.5 in 16407 milliseconds.
 
Upvote 0
Wow, you advanced coders just blow my mind.
If I put my understanding of that algorithm in purple, will you explain where I am wrong?

Rich (BB code):
Where "In1Mask" holds 24 numbers by powers of 2, from 2 to 16777216, and, "TopMatch" = 0
 
'Loop thru all ordered combinations of characters in strIn1
For N = 2 ^ (L1 + 1) - 1 To 1 Step -1
Where "N" is all numbers from 1 to 16777215
   strTry = ""
   For iCh = 1 To L1
      If In1Mask(iCh) And N Then
      If positive number = -1 and positive numer = -1. This one truly stumps me.
         strTry = strTry & Mid(strTest, iCh, 1)
         strTry will be one of 24 characters
      End If
   Next iCh
If Len(strTry) > TopMatch Then TestString strTry
If 1 > 0 then...
Next N
 
Fuzzy = TopMatch / CSng(L1)
Fuzzy = 0 / (1 to 24)
 
Last edited:
Upvote 0
I have thought about the algorithm.
When str1 is 'abbbbbbbbc' and str2 is 'acdefg',
the charater 'b' is not contained by str2.
So , you don't have to check abc,abbc,abbbc,abbbbc and so on.
Check which digit of str1 is worth testing before the For Loop,
And avoid the process about unused characters.
Because the code is a little bit long,
I post the code dividing into 4 parts as the following.
Code:
Private Declare Function GetTickCount Lib "kernel32" () As Long
Dim TopMatch         As Integer
Dim strCompare       As String
Sub Main()
 
    Dim s1 As String
    Dim s2 As String
    Dim L1 As Long
 
    s1 = "a" & String(15, "b") & "c"
    s2 = String(17, "a")
 
    Debug.Print "The original function returns ";
    L1 = GetTickCount
    Debug.Print Fuzzy(s1, s2);
    L1 = GetTickCount - L1
    Debug.Print " in " & L1 & " milliseconds."
 
    Debug.Print "The modified function returns ";
    L1 = GetTickCount
    Debug.Print NewFuzzy(s1, s2);
    L1 = GetTickCount - L1
    Debug.Print " in " & L1 & " milliseconds." 
End Sub
Code:
Function NewFuzzy(strIn1 As String, strIn2 As String) As Single
   Dim L1               As Integer
   Dim In1Mask(1 To 24) As Long     'strIn1 is 24 characters max
 
   '''''''''''' Check which digit is not worth evaluating.
   Dim DonotTestDigitMask As Long
 
   Dim iCh              As Integer
   Dim N                As Long
   Dim strTry           As String
   Dim strTest          As String
   Dim lRealLen_strTry As Long
 
   TopMatch = 0
   L1 = Len(strIn1)
   strTest = UCase$(strIn1)
   strCompare = UCase$(strIn2)
 
   For iCh = 1 To L1
      In1Mask(iCh) = 2 ^ iCh
   Next iCh
 
   'Check which digit is not contained.
   DonotTestDigitMask = 0
   For iCh = 1 To L1
      If strCompare Like "*" & Mid$(strTest, iCh, 1) & "*" Then
      Else
          DonotTestDigitMask = DonotTestDigitMask Or 2 ^ iCh
      End If
   Next iCh
 
 
   'Loop thru all ordered combinations of characters in strIn1
   For N = 2 ^ (L1 + 1) - 1 To 1 Step -1
      lRealLen_strTry = 0
      For iCh = 1 To L1
         If In1Mask(iCh) And N Then
            lRealLen_strTry = lRealLen_strTry + 1
            'Avoid the process about unused characters.
            If In1Mask(iCh) And DonotTestDigitMask Then GoTo NextTest
         End If
 
 
      Next iCh
      If lRealLen_strTry > TopMatch Then
        strTry = Space$(lRealLen_strTry)
        lRealLen_strTry = 0
        For iCh = 1 To L1
           If In1Mask(iCh) And N Then
              'strTry = strTry & Mid(strTest, iCh, 1)
              lRealLen_strTry = lRealLen_strTry + 1
              Mid$(strTry, lRealLen_strTry, 1) = Mid$(strTest, iCh, 1)
           End If
        Next iCh
      End If
      'If Len(strTry) > TopMatch Then TestString strTry
      If lRealLen_strTry > TopMatch Then NewTestString strTry
NextTest:
   Next N
 
   NewFuzzy = TopMatch / CSng(L1)
 
End Function
Code:
Sub NewTestString(strIn As String)
   Dim L          As Integer
   Dim strTry   As String
   Dim iCh        As Integer
 
   L = Len(strIn)
   If L <= TopMatch Then Exit Sub
 
   'strTry = "*"
   strTry = "*" & String$(L * 2, "*")
   For iCh = 1 To L
      'strTry = strTry & Mid(strIn, iCh, 1) & "*"
      Mid$(strTry, iCh * 2, 1) = Mid$(strIn, iCh, 1)
   Next iCh
 
   If strCompare Like strTry Then
      If L > TopMatch Then TopMatch = L
   End If
 
End Sub
Code:
Function Fuzzy(strIn1 As String, strIn2 As String) As Single
   Dim L1               As Integer
   Dim In1Mask(1 To 24) As Long     'strIn1 is 24 characters max
   Dim iCh              As Integer
   Dim N                As Long
   Dim strTry           As String
   Dim strTest          As String
 
   TopMatch = 0
   L1 = Len(strIn1)
   strTest = UCase(strIn1)
   strCompare = UCase(strIn2)
 
   For iCh = 1 To L1
      In1Mask(iCh) = 2 ^ iCh
   Next iCh
 
   'Loop thru all ordered combinations of characters in strIn1
   For N = 2 ^ (L1 + 1) - 1 To 1 Step -1
      strTry = ""
      For iCh = 1 To L1
         If In1Mask(iCh) And N Then
            strTry = strTry & Mid(strTest, iCh, 1)
         End If
      Next iCh
      If Len(strTry) > TopMatch Then TestString strTry
   Next N
 
   Fuzzy = TopMatch / CSng(L1)
 
End Function
 
Sub TestString(strIn As String)
   Dim L          As Integer
   Dim strTry   As String
   Dim iCh        As Integer
 
   L = Len(strIn)
   If L <= TopMatch Then Exit Sub
 
   strTry = "*"
 
   For iCh = 1 To L
      strTry = strTry & Mid(strIn, iCh, 1) & "*"
   Next iCh
 
   If strCompare Like strTry Then
      If L > TopMatch Then TopMatch = L
   End If
 
End Sub
The original function returns 5.882353E-02 in 11953 milliseconds.
The modified function returns 5.882353E-02 in 328 milliseconds.
 
Upvote 0
Dr Mingle

Perhaps it would help us help you if you gave us a clue as to what the code is actually meant to do.

A link to wherever you got it might help too.:)
 
Upvote 0

Forum statistics

Threads
1,214,649
Messages
6,120,732
Members
448,987
Latest member
marion_davis

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top