Fuzzy Matching - new version plus explanation

al_b_cnu

Well-known Member
Joined
Jul 18, 2003
Messages
4,545
It has been a while since I originally posted my Fuzzy matching UDF’s on the board, and several variants have appeared subsequently.

I thought it time to ‘put the record straight’ & post a definitive version which contains slightly more efficient code, and better matching algorithms, so here it is.

Firstly, I must state that the Fuzzy matching algorithms are very CPU hungry, and should be used sparingly. If for instance you require to lookup a match for a string which starts with, contains or ends with a specified value, this can be performed far more efficiently using the MATCH function:
Fuzzy Examples.xls
ABCDE
1Starts WithEndsContains
2BilljelenBill
3Mr Bill Jelen433
4Bill Jelen
5Joe Bloggs
6Fred Smith
MATCH Example


... Continued ...
 
Hi Ankush, welcome to the board!

The usual reason for #Name? is that the code has not been pasted into a code module

From the VBA Editor menu, insert a module, this will be named 'Module1' Paste the code into that module (link to latest code in my signature).
 
Upvote 0

Excel Facts

Why does 9 mean SUM in SUBTOTAL?
It is because Sum is the 9th alphabetically in Average, Count, CountA, Max, Min, Product, StDev.S, StDev.P, Sum, VAR.S, VAR.P.
Hi

This has been working fine for me until recently when ever i do a fuzzyvlookup it just returns the values from the look up array in order rather than the values id expect it find. eg. if i try fuzzyvlookup Astin martin (holdings) it finds summ which is just the first value in the look up array it does this for all values. Any one able to help? Im using this function to try and match a list of companies with a much larger data base of companies to see if i already have details for the companies im searching so for example id want astin martin (holdings) to match with astin martin ltd.
 
Upvote 0
Hi,

I suspect it's an anchoring problem, but can't say for sure without further info.

Can you post a sample of your FuzzyVLookup call and of your table, preferably via something like MrExcel HTMLMaker utility?
 
Upvote 0
Hi al_b_cnu,

First of all, tremendous help you've provided so far in sharing your code to everybody.

i am looking to modify algorithm 2, but do not have much experience with excel vba or excel help forums. Instead of pairs, triplets etc, it recognises each word in a data entry and compares it with the number of correct matches in each description in the database. the highest number of word matches get selected. if within the data-base, all the database entries have the same number of word matches, choose the highest percentage match like in your fuzzypercent algorithm.


e.g. currently with your algorithm 2.

user entry:
AIR CLEANER FILTER

database contains:
AIR CLEANER, PARTICULATE
AIR CLEANER, PARTICULATE PRECIPITATION
AIR CLEANER PARTICULATE ELECTROSTATIC PRECIPITATION, MOBILE
AIR CLEANER PARTICULATE HIGH-EFFICIENCY FILTER

<colgroup><col></colgroup><tbody>
</tbody>


Result:
AIR CLEANER , PARTICULATE

Desired:
AIR CLEANER PARTICULATE HIGH-EFFICIENCY FILTER

Thanks alot,
Jason
 
Upvote 0
Hey all, this isn't an answer to any particular question but how does this cool VBA compare, e.g., against the 'Fuzzy Lookup'
 
Upvote 0
Oops, ran out of editing time, there. Here's how the comment above should have looked:
Hey all, how does this cool, fuzzy VBA compare, e.g., against the 'Fuzzy Lookup' add-in from the Microsoft Research unit (or any other commercial product, for that matter)? I've used MS's applet to compare competitor product titles and it definitely helps, but there's not flexibility built in. I, for instance, can't require that the titles have the same brand. Anyone have experience with commercial fuzzy lookup products?

Download: Fuzzy Lookup Add-In for Excel - Microsoft Download Center - Download Details

Thanks! --patrick
 
Upvote 0
I've been meaning to try these fuzzy functions for awhile and just now started......I'm doing something wrong. Before seeing the new version I pasted the old version into a module...seemed to get somewhat desired results but was taking waaaaay too long (but then I'm looking at a list of some 30,000 names)...so I thought I might just have to play with the last number a bit (started out with .5). Then as I was looking to see what others were using, I discovered the updated version. But...I'm getting a NAME error...and yes, the functions are in a module. What am I missing?
 
Upvote 0
Hi sitewolf,

Silly question, but are you enabling macros? I found that in Excel 2007, it was all too easy to miss the little [Options] button, start entering data and then find that you can't run any macros.

If still having a problem, by all means pm me, I'll give you my email for you to send a sample file.

30,000 FuzzyVLookups may take some time (as "Deep Thought" said once) :)
 
Upvote 0
I got sidetracked in other directions, but got back to checking all this out today....kinda got it working, but also not quite sure of the options available. I saw someone post their formula in this thread back in 2010 that had more involved than I'd tried, and was curious....

quote from that post:
For example, in my project, I used this formula:

FuzzyVLookup(B3,LIST!$P$2:$P$1680,1,40%,1,2,0)

which translates by: "search (B3) in range (LIST!$P$2:$P$1680), return 1st column, use a 40% percentage confidence index, use default Rank(1), use algorithm 2, no additional column"

---
I had been using something like .9 rather than 90%..same diff I guess...but closing my formula there.
Could someone explain the 3 additional items further, please? More detail on what they are, what options are available, and when you might use each?
 
Upvote 0

Forum statistics

Threads
1,215,500
Messages
6,125,168
Members
449,211
Latest member
ykrcory

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top