Similar texts with minor differences - How to get just ONE (a standard one)

heytoluca

Hello,


I have a very large database that contains names with minor differences in column A, such as:
A
EXCEL AMERICAN COMPANY S.A.
EXCEL AMERICAN COMPANY SA.
EXCEL AMERICAN COMPANY, S.A.
ASSOCIATION, INDUSTRIAL
ASSOCIATION INDUSTRIAL
L.A. COMPANY CO
LA COMPANY CO


As you can see, this makes a mess when dealing with numbers associated with them and makes analysis more difficult.
What I've been trying to figure out is how to quickly get in column B a standard name for each one, for example:
B
EXCEL AMERICAN COMPANY S.A.
ASSOCIATION INDUSTRIAL
L.A. COMPANY CO


I was thinking of using a combination of the FIND and EXTRACT formula or maybe LEFT as well. (Maybe VB?)
Please use your Excel wizardry to help a fellow comrade!


David
 

Hello David

You can eliminate the comma and the period to get a "standard" name.
    A                               B
1   EXCEL AMERICAN COMPANY S.A.     EXCEL AMERICAN COMPANY SA
2   EXCEL AMERICAN COMPANY SA.      EXCEL AMERICAN COMPANY SA
3   EXCEL AMERICAN COMPANY, S.A.    EXCEL AMERICAN COMPANY SA
4   ASSOCIATION, INDUSTRIAL         ASSOCIATION INDUSTRIAL
5   ASSOCIATION INDUSTRIAL          ASSOCIATION INDUSTRIAL
6   L.A. COMPANY CO                 LA COMPANY CO
7   LA COMPANY CO                   LA COMPANY CO

Cell   Formula
B1     =SUBSTITUTE(SUBSTITUTE(A1,".",""),",","")
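
For anyone who wants to check the idea outside the worksheet, here is a rough Python sketch of what the nested SUBSTITUTE does: drop every period, then every comma. The function name strip_punctuation is only an illustration and not part of Excel; the sample names are the ones from the question.

```python
def strip_punctuation(name: str) -> str:
    """Mirror SUBSTITUTE(SUBSTITUTE(A1,".",""),",",""): drop periods, then commas."""
    return name.replace(".", "").replace(",", "")


# Sample names from column A in the question.
names = [
    "EXCEL AMERICAN COMPANY S.A.",
    "EXCEL AMERICAN COMPANY SA.",
    "EXCEL AMERICAN COMPANY, S.A.",
    "ASSOCIATION, INDUSTRIAL",
    "ASSOCIATION INDUSTRIAL",
    "L.A. COMPANY CO",
    "LA COMPANY CO",
]

# Column B: one cleaned name per row.
cleaned = [strip_punctuation(n) for n in names]

# The distinct "standard" names that remain after cleaning.
unique_cleaned = sorted(set(cleaned))

for original, standard in zip(names, cleaned):
    print(f"{original!r} -> {standard!r}")
```

The seven sample rows collapse to three distinct names, which matches column B in the table above.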

Thanks a lot! This effectively helped standardize the names.

However, I still have a name or two that are not standardized. For example:

DURACORP S.A.
DURACORP S. A.

Results in:

DURACORP SA
DURACORP S A

Is there anything I can add to the formula to get:

DURACORP SA

On both?



Thanks a LOT
David
 
Hi David

Code:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,".",""),",",""),"S A","SA")
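
One thing to watch: a plain replacement of "S A" acts on every occurrence in the text, not only on the company suffix. The Python sketch below shows the same three steps, plus one possible way to limit the last step to a trailing "S A". The name BOSS ALLIANCE is made up for illustration, and the whitespace collapsing and the end-of-name rule are assumptions that go beyond the formula.

```python
import re


def clean_name(name: str) -> str:
    # Same first two steps as the formula: drop periods and commas.
    text = name.replace(".", "").replace(",", "")
    # Collapse runs of whitespace so "S  A" and "S A" look the same.
    text = " ".join(text.split())
    # Join "S A" into "SA" only when it stands as separate words at the
    # very end of the name (an assumption, stricter than the formula).
    return re.sub(r"\bS A$", "SA", text)


results = [clean_name(n) for n in ["DURACORP S.A.", "DURACORP S. A."]]

# Hypothetical name showing why an unanchored replacement can misfire.
unanchored = "BOSS ALLIANCE".replace("S A", "SA")
anchored = clean_name("BOSS ALLIANCE")

print(results, unanchored, anchored)
```

Both DURACORP variants come out identical, while the anchored version leaves the made-up name alone.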
 
Many thanks!

You made me realize I can add as many "filters" as I want to that formula.

However (sorry if I'm becoming annoying), I just noticed there are a few names that don't fit the catch-all formula we're developing, for example:

EXCEL CORPORATION LUMBER DBE
EXCEL CORPORATION LU DBE
QUICK CO AND FIXTURES
QUICK COAND FIXTURES

Any help with these? Maybe a LEFT formula?

Thanks a LOT!
David
 
Hi

I would suggest eliminating the spaces and then taking the n leftmost characters.
In my example n is 10.
Code:
=LEFT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,".",""),",","")," ",""),10)
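
The result of that formula is a short matching key, not a readable name. As a rough Python sketch of how such a key could map every variant back to ONE standard name: the helper names match_key and standard_names are invented here, and keeping the first name seen for each key is only an assumed tie-break.

```python
def match_key(name: str, n: int = 10) -> str:
    """Mirror the LEFT(SUBSTITUTE(...),n) formula: drop periods, commas
    and spaces, then keep the first n characters."""
    stripped = name.replace(".", "").replace(",", "").replace(" ", "")
    return stripped[:n]


def standard_names(names, n=10):
    """Map each name to the first name seen with the same key
    (an assumed tie-break, not something the formula does)."""
    first_seen = {}
    result = []
    for name in names:
        key = match_key(name, n)
        first_seen.setdefault(key, name)
        result.append(first_seen[key])
    return result


# The problem names from the post above.
names = [
    "EXCEL CORPORATION LUMBER DBE",
    "EXCEL CORPORATION LU DBE",
    "QUICK CO AND FIXTURES",
    "QUICK COAND FIXTURES",
]

keys = [match_key(n) for n in names]
standardized = standard_names(names)

for original, key, standard in zip(names, keys, standardized):
    print(f"{original!r} -> key {key!r} -> {standard!r}")
```

A small n merges more variants but can also lump different companies that share a prefix, so it is worth reviewing the groups before trusting totals built on them.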
 
Hey! Thanks a lot for your answer and your time. This worked great!

Thanks again!
David