![]() |
![]() |
|
|||||||
| Excel Questions All Excel/VBA questions - formulas, macros, pivot tables, general help, etc. Please post to this forum in English only. |
![]() |
|
|
Thread Tools | Display Modes |
|
|
#1 |
|
Join Date: Aug 2002
Posts: 69
|
Hi,
I need a function or macro that helps me in matching manufacturer names from incoming data files with all the values in an internal standard list with valid manufacturer names. After the matching it gives me the internal standard value that comes closest to the incoming value. As an example all the incoming values: Abbott Labs, ABBOTT and Abbott Laboratories A/S must be translated to the internal standard value ABBBOTT LABS. I'm envisaging a function similar to this: =CLOSESTMATCHES(IncomingValue, standardlist) Ex. I write the following function in cell C3: =CLOSESTMATCHES(A2;B2:B8) When the function finds a value in the standardlist that looks like the incoming value it writes the internal standard value in cell C3. If it doesn't find anything it writes f.ex NEW and if there are several matches it writes all the values, either separated by semicolon or in the following cells in the same row. I don't know if there is a native function that will meet my requirements and it's not that I don't know how to write a custom function. My problem is that I don't know how approach the problem of comparing and matching the incoming values with standard values. I've been considdering the function: SEARCH(find_text,within_text,start_num) but I don't feel that it meets my requirements. Thanks. Jon |
|
|
|
|
|
#2 |
|
MrExcel MVP
Join Date: Jul 2002
Posts: 32,030
|
|
|
|
|
|
|
#3 |
|
Join Date: Aug 2002
Posts: 69
|
Thanks! Seems you understood my problem
Jon |
|
|
|
|
|
#4 |
|
Join Date: Jul 2003
Location: Manchester (UK)
Posts: 3,497
|
Hi,
Following on from that, how about my FuzzyVLookup UDF: Code:
Option Explicit
Type RankInfo
Offset As Integer
Percentage As Single
End Type
Function FuzzyPercent(ByVal String1 As String, _
ByVal String2 As String, _
Optional Algorithm As Integer = 3) As Single
'*************************************
'** Return a % match on two strings **
'*************************************
Dim intLen1 As Integer
Dim intCurLen As Integer
Dim intTo As Integer
Dim intPos As Integer
Dim intPtr As Integer
Dim intScore As Integer
Dim intTotScore As Integer
Dim intStartPos As Integer
Dim StrWork As String
Dim Str1 As String
Dim Str2 As String
'---------------------------------------------------
'-- Remove surrounding spaces, ensure lower case --
'-- and remove multiple internal spaces --
'---------------------------------------------------
Str1 = LCase$(Trim$(String1))
Do
StrWork = Str1
Str1 = Application.WorksheetFunction.Substitute _
(StrWork, " ", " ")
Loop Until StrWork = Str1
Str2 = LCase$(Trim$(String2))
Do
StrWork = Str2
Str2 = Application.WorksheetFunction.Substitute _
(StrWork, " ", " ")
Loop Until StrWork = Str2
'----------------------------------------------
'-- Give 100% match if strings exactly equal --
'----------------------------------------------
If Str1 = Str2 Then
FuzzyPercent = 1
Exit Function
End If
intLen1 = Len(Str1)
'----------------------------------------
'-- Give 0% match if string length < 2 --
'----------------------------------------
If intLen1 < 2 Then
FuzzyPercent = 0
Exit Function
End If
intTotScore = 0 'initialise total possible score
intScore = 0 'initialise current score
'--------------------------------------------------------
'-- If Algorithm = 1 or 3, Search for single characters --
'--------------------------------------------------------
If (Algorithm And 1) <> 0 Then
intTotScore = intLen1 'initialise total possible score
intPos = 0
For intPtr = 1 To intLen1
intStartPos = intPos + 1
intPos = InStr(intStartPos, Str2, Mid$(Str1, intPtr, 1))
If intPos > 0 Then
If intPos > intStartPos + 3 Then 'No match if char is > 3 bytes away
intPos = intStartPos
Else
intScore = intScore + 1 'Update current score
End If
Else
intPos = intStartPos
End If
Next intPtr
End If
'-----------------------------------------------------------
'-- If Algorithm = 2 or 3, Search for pairs, triplets etc. --
'-----------------------------------------------------------
If (Algorithm And 2) <> 0 Then
For intCurLen = 2 To intLen1
StrWork = Str2 'Get a copy of String2
intTo = intLen1 - intCurLen + 1
intTotScore = intTotScore + Int(intLen1 / intCurLen) 'Update total possible score
For intPtr = 1 To intTo Step intCurLen
intPos = InStr(StrWork, Mid$(Str1, intPtr, intCurLen))
If intPos > 0 Then
Mid$(StrWork, intPos, intCurLen) = String$(intCurLen, &H0) 'corrupt found string
intScore = intScore + 1 'Update current score
End If
Next intPtr
Next intCurLen
End If
FuzzyPercent = intScore / intTotScore
End Function
Function FuzzyVLookup(ByVal LookupValue As String, _
ByVal TableArray As Range, _
ByVal IndexNum As Integer, _
Optional NFPercent As Single = 0.05, _
Optional Rank As Integer = 1, _
Optional Algorithm As Integer = 3) As Variant
'********************************************************************************
'** Function to Fuzzy match LookupValue with entries in **
'** column 1 of table specified by TableArray. **
'** TableArray must specify the top left cell of the range to be searched **
'** The function stops scanning the table when an empty cell in column 1 **
'** is found. **
'** For each entry in column 1 of the table, FuzzyPercent is called to match **
'** LookupValue with the Table entry. **
'** 'Rank' is an optional parameter which may take any value > 0 **
'** (default 1) and causes the function to return the 'nth' best **
'** match (where 'n' is defined by 'Rank' parameter) **
'** If the 'Rank' match percentage < NFPercent (Default 5%), #N/A is returned. **
'** IndexNum is the column number of the entry in TableArray required to be **
'** returned, as follows: **
'** If IndexNum > 0 and the 'Rank' percentage match is >= NFPercent **
'** (Default 5%) the column entry indicated by IndexNum is **
'** returned. **
'** if IndexNum = 0 and the 'Rank' percentage match is >= NFPercent **
'** (Default 5%) the offset row (starting at 1) is returned. **
'** This value can be used directly in the 'Index' function. **
'** **
'** Algorithm can take one of the following values: **
'** Algorithm = 1: **
'** For each character in 'String1', a search is performed on 'String2'. **
'** The search is deemed successful if a character is found in 'String2' **
'** within 3 characters of the current position. **
'** A score is kept of matching characters which is returned as a **
'** percentage of the total possible score. **
'** Algorithm = 2: **
'** A count of matching pairs, triplets, quadruplets etc. in 'String1' and **
'** 'String2' is returned as a percentage of the total possible. **
'** Algorithm = 3: Both Algorithms 1 and 2 are performed. **
'********************************************************************************
Dim strLookupValue As String
Dim strListString As String
Dim StrWork As String
Dim sngMinPercent As Single
Dim sngWork As Single
Dim sngCurPercent As Single
Dim intBestMatchPtr As Integer
Dim intPtr As Integer
Dim intRankPtr As Integer
Dim intRankPtr1 As Integer
Dim udRankData() As RankInfo
'--------------------------------------------------------------
'-- Validation --
'--------------------------------------------------------------
strLookupValue = Trim$(LCase$(LookupValue))
If IsMissing(NFPercent) Then
sngMinPercent = 0.05
Else
If (NFPercent <= 0) Or (NFPercent > 1) Then
FuzzyVLookup = "*** 'NFPercent' must be a percentage > zero ***"
Exit Function
End If
sngMinPercent = NFPercent
End If
If Rank < 1 Then
FuzzyVLookup = "*** 'Rank' must be an integer > 0 ***"
Exit Function
End If
ReDim udRankData(1 To Rank)
'---------------
'-- Main loop --
'---------------
intPtr = 1
Do While VarType(TableArray.Cells(intPtr, 1)) <> vbEmpty
If VarType(TableArray.Cells(intPtr, 1)) = vbString Then
strListString = Trim$(LCase$(TableArray.Cells(intPtr, 1)))
'------------------------------------------------
'-- Fuzzy match strings & get percentage match --
'------------------------------------------------
sngCurPercent = FuzzyPercent(String1:=strLookupValue, _
String2:=strListString, _
Algorithm:=Algorithm)
If sngCurPercent >= sngMinPercent Then
'---------------------------
'-- Store in ranked array --
'---------------------------
For intRankPtr = 1 To Rank
If sngCurPercent > udRankData(intRankPtr).Percentage Then
For intRankPtr1 = Rank To intRankPtr + 1 Step -1
With udRankData(intRankPtr1)
.Offset = udRankData(intRankPtr1 - 1).Offset
.Percentage = udRankData(intRankPtr1 - 1).Percentage
End With
Next intRankPtr1
With udRankData(intRankPtr)
.Offset = intPtr
.Percentage = sngCurPercent
End With
Exit For
End If
Next intRankPtr
End If
End If
intPtr = intPtr + 1
Loop
If udRankData(Rank).Percentage < sngMinPercent Then
'--------------------------------------
'-- Return '#N/A' if below NFPercent --
'--------------------------------------
FuzzyVLookup = CVErr(xlErrNA)
Else
intBestMatchPtr = udRankData(Rank).Offset
If IndexNum > 0 Then
'-----------------------------------
'-- Return column entry specified --
'-----------------------------------
FuzzyVLookup = TableArray.Cells(intBestMatchPtr, IndexNum)
Else
'-----------------------
'-- Return offset row --
'-----------------------
FuzzyVLookup = intBestMatchPtr
End If
End If
End Function
Function FuzzyHLookup(ByVal LookupValue As String, _
ByVal TableArray As Range, _
ByVal IndexNum As Integer, _
Optional NFPercent As Single = 0.05, _
Optional Rank As Integer = 1, _
Optional Algorithm As Integer = 3) As Variant
'********************************************************************************
'** Function to Fuzzy match LookupValue with entries in **
'** row 1 of table specified by TableArray. **
'** TableArray must specify the top left cell of the range to be searched **
'** The function stops scanning the table when an empty cell in row 1 **
'** is found. **
'** For each entry in row 1 of the table, FuzzyPercent is called to match **
'** LookupValue with the Table entry. **
'** 'Rank' is an optional parameter which may take any value > 0 **
'** (default 1) and causes the function to return the 'nth' best **
'** match (where 'n' is defined by 'Rank' parameter) **
'** If the 'Rank' match percentage < NFPercent (Default 5%), #N/A is returned. **
'** IndexNum is the row number of the entry in TableArray required to be **
'** returned, as follows: **
'** If IndexNum > 0 and the 'Rank' percentage match is >= NFPercent **
'** (Default 5%) the row entry indicated by IndexNum is **
'** returned. **
'** if IndexNum = 0 and the 'Rank' percentage match is >= NFPercent **
'** (Default 5%) the offset col (starting at 0) is returned. **
'** This value can be used directly in the 'OffSet' function. **
'** **
'** Algorithm can take one of the following values: **
'** Algorithm = 1: **
'** For each character in 'String1', a search is performed on 'String2'. **
'** The search is deemed successful if a character is found in 'String2' **
'** within 3 characters of the current position. **
'** A score is kept of matching characters which is returned as a **
'** percentage of the total possible score. **
'** Algorithm = 2: **
'** A count of matching pairs, triplets, quadruplets etc. in 'String1' and **
'** 'String2' is returned as a percentage of the total possible. **
'** Algorithm = 3: Both Algorithms 1 and 2 are performed. **
'********************************************************************************
Dim strLookupValue As String
Dim strListString As String
Dim StrWork As String
Dim sngMinPercent As Single
Dim sngWork As Single
Dim sngCurPercent As Single
Dim intBestMatchPtr As Integer
Dim intPtr As Integer
Dim intRankPtr As Integer
Dim intRankPtr1 As Integer
Dim udRankData() As RankInfo
'--------------------------------------------------------------
'-- Validation --
'--------------------------------------------------------------
strLookupValue = Trim$(LCase$(LookupValue))
If IsMissing(NFPercent) Then
sngMinPercent = 0.05
Else
If (NFPercent <= 0) Or (NFPercent > 1) Then
FuzzyHLookup = "*** 'NFPercent' must be a percentage > zero ***"
Exit Function
End If
sngMinPercent = NFPercent
End If
If Rank < 1 Then
FuzzyHLookup = "*** 'Rank' must be an integer > 0 ***"
Exit Function
End If
ReDim udRankData(1 To Rank)
'---------------
'-- Main loop --
'---------------
intPtr = 1
Do While VarType(TableArray.Cells(1, intPtr)) <> vbEmpty
If VarType(TableArray.Cells(1, intPtr)) = vbString Then
strListString = Trim$(LCase$(TableArray.Cells(1, intPtr)))
'------------------------------------------------
'-- Fuzzy match strings & get percentage match --
'------------------------------------------------
sngCurPercent = FuzzyPercent(String1:=strLookupValue, _
String2:=strListString, _
Algorithm:=Algorithm)
If sngCurPercent >= sngMinPercent Then
'---------------------------
'-- Store in ranked array --
'---------------------------
For intRankPtr = 1 To Rank
If sngCurPercent > udRankData(intRankPtr).Percentage Then
For intRankPtr1 = Rank To intRankPtr + 1 Step -1
With udRankData(intRankPtr1)
.Offset = udRankData(intRankPtr1 - 1).Offset
.Percentage = udRankData(intRankPtr1 - 1).Percentage
End With
Next intRankPtr1
With udRankData(intRankPtr)
.Offset = intPtr
.Percentage = sngCurPercent
End With
Exit For
End If
Next intRankPtr
End If
End If
intPtr = intPtr + 1
Loop
If udRankData(Rank).Percentage < sngMinPercent Then
'--------------------------------------
'-- Return '#N/A' if below NFPercent --
'--------------------------------------
FuzzyHLookup = CVErr(xlErrNA)
Else
intBestMatchPtr = udRankData(Rank).Offset
If IndexNum > 0 Then
'-----------------------------------
'-- Return row entry specified --
'-----------------------------------
FuzzyHLookup = TableArray.Cells(IndexNum, intBestMatchPtr)
Else
'-----------------------
'-- Return offset col --
'-----------------------
FuzzyHLookup = intBestMatchPtr
End If
End If
End Function
LookupValue := Entry to be looked up TableArray := Lookup Range IndexNum := Zero to return row number, >0 to return specified column (as in VLookup) NFPercent := If the %age match is less than this, return #N/A Rank := Return the 'n'th highest match. Default 1 Algorithm := Must be 1, 2 or 3. 1 = 'Forward scan only', '2' = Match on pairs then triplets then quads etc, '3' = match on both For your purposes, suggest a NFPercent of about 50% and Algorithm =1, but play! HTH Alan |
|
|
|
|
|
#5 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Join Date: Oct 2003
Posts: 1,818
|
hi!
I come up here with something like character checking. but at the end i come up with another question on how sensitve you want it to be? Please see below ******** ******************** ************************************************************************>
[HtmlMaker 2.42] To see the formula in the cells just click on the cells hyperlink or click the Name box PLEASE DO NOT QUOTE THIS TABLE IMAGE ON SAME PAGE! OTHEWISE, ERROR OF JavaScript OCCUR. the code: Private Function PatternMatching(s1, s2 As String) As Integer Dim i2 As Integer ****Max = 0 ****Count = 0 start: ****Count = 0 ****For i = 1 To Len(s1) ********i2 = i ********For j = 1 To Len(s2) ************If UCase(Mid(s2, j, 1)) = UCase(Mid(s1, i2, 1)) Then ****************i2 = i2 + 1 ****************Count = Count + 1 ****************If Max < Count Then ********************Max = Count ****************End If ************Else ****************Count = 0 ****************Exit For ************End If ********Next j ****Next i If Max < Len(s2) Then ****s2 = Mid(s2, 2, Len(s2)) GoTo start End If ****PatternMatching = Max End Function Private Sub CommandButton1_Click() For i = 2 To Range("A65536").End(xlUp).Row ****Range("b" & i).Value = PatternMatching(Range("e1"), Range("a" & i)) / Len(Range("e1")) Next i End Sub
__________________
There is always a better way!! |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
#6 |
|
Join Date: Aug 2002
Posts: 69
|
I just tried out Alan's impressive function but got a compile error in the following line:
Code:
Dim udRankData() As RankInfo Am I missing some lines? Jon |
|
|
|
|
|
#7 | |
|
MrExcel MVP
Join Date: Jul 2002
Posts: 32,030
|
Quote:
|
|
|
|
|
|
|
#8 |
|
MrExcel MVP
Join Date: Sep 2002
Location: Alabama/State of Disarray
Posts: 10,473
|
Did you copy over these lines as well as the functions --
Code:
Option Explicit
Type RankInfo
Offset As Integer
Percentage As Single
End Type
|
|
|
|
|
|
#9 |
|
Join Date: Aug 2002
Posts: 69
|
I did actually. It was lying together with other functions in the same module. It worked when I placed all of the code in a seperate module. The strange thing is, that I only get acces to the FuzzyHLookup and not the FuzzyVLookup in the Functions library even though it is there in the code.
To answer the question of SIXTH SENSE on how sensitive I want it to be I think it would be great to be able to change the sensitivity according to the different needs. More or less like in Alans function. Jon |
|
|
|
|
|
#10 |
|
Join Date: Jul 2003
Location: Manchester (UK)
Posts: 3,497
|
Note that you need FuzzyPercent too - it's called buy FuzzyVLookup.
Regarding SIXTH SENSE's comment 'how sensitive do you want it to be' - this is controlled by the NFPercent parameter. HTH Alan |
|
|
|
![]() |
| Bookmarks |
| Thread Tools | |
| Display Modes | |
|
|