Code to return closest text match

Zakkala

Active Member
Joined
Nov 12, 2004
Messages
254
Hiya,

I need a bit of help with comparing two sets of data please - I've tried the fuzzyvlookup function but it doesn't seem to be quite what I need (and to be honest, the size of the workbook means a lot of formulae slow it down too much) so I'm trying to do this via macro code.

So, I'm comparing data on two sheets which share one common field in order to pull through additional fields that are required to complete one data set - although the contents of the shared field may not be spelt quite the same on both sheets. The aim is to check if an entry on sheet 1 is present on sheet 2 and then pull some additional information from sheet 2 back to blank columns on the original sheet.

I have code that will do this for some close matches but not others so not sure how to improve it (can't post the exact code right now I'm afraid as I'm at home and it's at work - but it's bugging me so much I had to see if anyone has any advice before I get back to work!).

Example:

Data would be:

Sheet 1 - Columns A, B & C are blank for the returned data to go into. Column D has irrelevant data. Column D has the company names (the common field). Columns E onwards are irrelevant. Amount of data can be hundreds or thousands of rows - not every company listed will be found on Sheet 2 - these just remain with blank fields.

Sheet 2 - Column A contains irrelevant data. The common field is either Column B (contains the main company name) or Column C (contains a possible different spelling of the name). At present, my code checks both these columns. Data to return to sheet 1 when the company is found is the company name it picks up from B or C into Column C on sheet 1 and, say, Columns D & E. Data in this sheet will be around 40 thousand rows.


The code I have at present will pick up on a name on Sheet 1 that is similar to the entry on Sheet 2. If a name on Sheet 1 is, for example, "Woodstock Trust" and on Sheet 2 it is "Woodstock Trust Limited", my current code finds it fine. If on Sheet 1 it is "Wood Stock Trust" and on Sheet 2 it is "Wood-stock Trust" or "Woodstock Fund" it does not. I would want it to find these - as it pulls through the name it checks against, a final visual check by the user will allow them to disregard anything that isn't needed, but I would rather find similar names than miss any, if you know what I mean.

Can anyone give any guidance please?
 

Excel Facts

Formula for Yesterday
Name Manager, New Name. Yesterday =TODAY()-1. OK. Then, use =YESTERDAY in any cell. Tomorrow could be =TODAY()+1.
I might recommend creating a third dataset as the result.
In your example, you could have 4 possible results to matching, which may open up risk of missing something.

If there is anything else in the irrelevant data, such as address, postal code, phone number or even area code to use as validation, make every attempt to use those as a second/third validation.

Using Replace() to omit the non-alphabet characters and providing some tolerance %
to match has also worked for me.
ie Run through a series of Replace() on the string being saught, run through a series of Replace() on the data source, and comparing the Left tolerance% will generally provide more matches.
It's process heavy, but when its critical to discover every possibility...

There is also a series of Hall of Fame posts on Mr Excel related to Fuzzy Matching:
http://www.mrexcel.com/forum/showthread.php?t=69649
 
Upvote 0

Forum statistics

Threads
1,224,590
Messages
6,179,749
Members
452,940
Latest member
rootytrip

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top