Search for similar records

seriousdamage

Board Regular
Joined
Aug 14, 2005
Messages
58
Hi All,
I have an Excel file with about 30,000 company names, sorted in alphabetical order. My goal is to find relationships between these accounts just by looking at the names and checking whether they NEARLY match, so that I can treat them as one company and reduce the list.

Is there a formula that I can run to help me out?

Here is an example of what I would consider the same company.

UNION SDA FINANCE,
UNION DES SUCRERIES ET DISTILLERIES SDA FINANCE AGRICOLES,
RECTORAT DES SDA FINANCE SUCRERIES ET DISTILLERIES.

As you can see, the words SDA FINANCE appear in all 3 records, which leads me to think that they could all be connected.
If it turns out they are not, that will be a separate issue :)

Any help?
Thanks
NIC.
 


fairwinds

MrExcel MVP
Joined
May 15, 2003
Messages
8,638
Hi,

Try:

=MIN(MATCH("*"&MID(A2,ROW(INDIRECT("1:"&LEN(A2)-10)),10)&"*",$A$2:$A$10,0))

Enter it in B2 as an array formula (confirm with Ctrl+Shift+Enter) and drag it down.

This formula extracts each 10-character substring of the name (the length can be changed) and returns the position within the list of the first entry that contains any of those substrings.

Sorting or filtering by these numbers might give you what you want.

Of course, there is a good chance that you will get matches that are not valid in reality.
Book1 (Sheet4)

Row  A                                                           B
1
2    UNION SDA FINANCE,                                          1
3    xxxxxxxxxxxxxxxx                                            2
4    yyyyyyyyyyyyyyyyyyyy                                        3
5    Some other company                                          4
6    UNION DES SUCRERIES ET DISTILLERIES SDA FINANCE AGRICOLES,  1
7    Some other company                                          4
8    Yet another                                                 7
9    UNION DES SUCRERIES ET DISTILLERIES SDA FINANCE AGRICOLES,  1
10   Yet another                                                 7
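
If it helps to see the same idea outside of Excel, here is a rough Python sketch of what the formula does. It is not the formula itself: the function name `first_match_positions` and the `window` parameter are made up for illustration, it also checks the last 10-character window of each name, and it falls back to the whole name when a name is 10 characters or shorter (both are my assumptions, not something the formula does). Matching is case-insensitive, as with MATCH.

```python
def first_match_positions(names, window=10):
    """For each name, return the 1-based position of the first entry in
    `names` that contains any `window`-character substring of that name."""
    lowered = [n.lower() for n in names]  # MATCH with wildcards ignores case
    result = []
    for name in lowered:
        if len(name) <= window:
            # Assumption: names too short for a full window are used whole.
            chunks = {name}
        else:
            chunks = {name[i:i + window]
                      for i in range(len(name) - window + 1)}
        # A name always contains its own chunks, so next() always finds one.
        position = next(
            idx for idx, candidate in enumerate(lowered, start=1)
            if any(chunk in candidate for chunk in chunks)
        )
        result.append(position)
    return result


# Sample data taken from column A of the sheet above (rows 2 to 10).
names = [
    "UNION SDA FINANCE,",
    "xxxxxxxxxxxxxxxx",
    "yyyyyyyyyyyyyyyyyyyy",
    "Some other company",
    "UNION DES SUCRERIES ET DISTILLERIES SDA FINANCE AGRICOLES,",
    "Some other company",
    "Yet another",
    "UNION DES SUCRERIES ET DISTILLERIES SDA FINANCE AGRICOLES,",
    "Yet another",
]
print(first_match_positions(names))  # [1, 2, 3, 4, 1, 4, 7, 1, 7]
```

The result lines up with column B above. Note that this compares every name against every other name, so on a list of 30,000 names it may be slow, and like the formula it will group names that merely share a common 10-character phrase.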
 