MrExcel Message Board

Go Back   MrExcel Message Board > Question Forums > Excel Questions

Excel Questions All Excel/VBA questions - formulas, macros, pivot tables, general help, etc. Please post to this forum in English only.

Reply
 
Thread Tools Display Modes
Old Apr 22nd, 2002, 05:37 PM   #1
Shumande
New Member
 
Join Date: Apr 2002
Posts: 7
Default

Hi,

I have a list of five hundred companies on a spreadsheet called "Companies" and another spreadsheet called "Public" with all 7,000+ public companies and their associated ticker symbols. I'm trying to match on the company names of both lists (column A of both spreadsheets) and return the associated ticker symbol. However, the names don't match precisely. Some names have ", Inc." (with a comma and space) or ", Corp." at the end, while some don't have any suffix or have “Inc.” without a comma. To make matters more complicated, some have the word "Corporation" spelled out completely instead of the ", Corp." abbreviation. For example, on one list Microsoft is simply "Microsoft", while on the other list it's called "Microsoft Corporation".

Using “true” or 1 in the fourth parameter of VLOOKUP doesn’t help. The matches I get are close, but no cigar. Excel spits out a match but not the right one.

Any ideas on how to solve this without resorting to brute force, manual lookups?

Thanks
Shumande is offline   Reply With Quote
Old Apr 22nd, 2002, 07:16 PM   #2
Jay Petrulis
MrExcel MVP
 
Jay Petrulis's Avatar
 
Join Date: Mar 2002
Location: Chicago, IL USA
Posts: 2,042
Default

Hi,

Clean your lists. Copy and then combine them, sort by ticker symbol and choose one asset description per ticker.

Then, do a vlookup by ticker on the original lists. When you are satisfied, copy and pastespecial (values) on the original columns and you should be OK.

Possibly a pivot table can be used.

If not, you will be using Vlookup or Index/Match with something like "LEFT(cell,9)="Microsoft" which will become a big mess very quickly.

Take the time to clean the data and you will be much happier in the long run, I believe.

Bye,
Jay

Jay Petrulis is offline   Reply With Quote
Old Apr 22nd, 2002, 07:16 PM   #3
Yogi Anand
MrExcel MVP
 
Join Date: Mar 2002
Location: Michigan USA
Posts: 11,452
Default

I recall use of soundex feature in such cases ... but I am fuzzy on this at the moment. Perhaps some Excel Gurus will shed some light on this!
Yogi Anand is offline   Reply With Quote
Old Apr 22nd, 2002, 07:28 PM   #4
Jay Petrulis
MrExcel MVP
 
Jay Petrulis's Avatar
 
Join Date: Mar 2002
Location: Chicago, IL USA
Posts: 2,042
Default

Quote:
On 2002-04-22 18:16, Yogi Anand wrote:
I recall use of soundex feature in such cases ... but I am fuzzy on this at the moment. Perhaps some Excel Gurus will shed some light on this!
Hi Yogi,

Nice suggestion. I just searched google.com and found the following:

http://www.myatt.demon.co.uk/soundex.htm

Has the soundex application in Access VBA

Also, Harlan Grove posted the following

-----------------------------------------
Subject: Re: 'fuzzy' string comparison?
Date: Wed, 21 Jun 2000 12:36:05 -0700
From: Harlan Grove
Newsgroups: comp.apps.spreadsheets

I had to try. This gives a score. Maybe too deterministic, and definitely too dumb with respect to common abbreviations.

Function StrSimilar(s1 As String, s2 As String) As Double
Dim i As Long, j As Long, k As Long, n(2) As Long
Dim c1 As String, c2 As String
Const alphanum As String = "123456789abcdefghijklmnopqrstuvwxyz"

s1 = LCase(s1)
For i = 1 To Len(s1)
If Not InStr(alphanum, Mid(s1, i, 1)) > 0 Then Mid(s1, i, 1) = " "
Next i
s1 = Application.WorksheetFunction.Trim(s1)

s2 = LCase(s2)
For j = 1 To Len(s2)
If Not InStr(alphanum, Mid(s2, j, 1)) > 0 Then Mid(s2, j, 1) = " "
Next j
s2 = Application.WorksheetFunction.Trim(s2)

j = 1
n(1) = 0
For i = 1 To Len(s1)
c1 = LCase(Mid(s1, i, 1))

k = 0
Do
c2 = LCase(Mid(s2, j + k, 1))
k = k + 1
Loop Until j + k > Len(s2) Or c1 = c2

If c1 = c2 Then
n(1) = n(1) + 1
If j < Len(s2) Then j = j + 1 Else Exit For
End If
Next i

i = 1
n(2) = 0
For j = 1 To Len(s2)
c2 = LCase(Mid(s2, j, 1))

k = 0
Do
c1 = LCase(Mid(s1, i + k, 1))
k = k + 1
Loop Until i + k > Len(s1) Or c1 = c2

If c1 = c2 Then
n(2) = n(2) + 1
If i < Len(s1) Then i = i + 1 Else Exit For
End If
Next j

StrSimilar = CDbl(Application.WorksheetFunction.Min(n(1), n(2))) / _
CDbl(Application.WorksheetFunction.Max(Len(s1), Len(s2)))
End Function
-------------------------------------

Don't think this will particularly work for the OP, but it may be worth a shot.

Bye,
Jay

[ This Message was edited by: Jay Petrulis on 2002-04-22 18:29 ]
Jay Petrulis is offline   Reply With Quote
Old Apr 22nd, 2002, 07:32 PM   #5
Yogi Anand
MrExcel MVP
 
Join Date: Mar 2002
Location: Michigan USA
Posts: 11,452
Default

Oh my goodness Jay:
You are on top of things buddy! ... looks like you have got things on your finger tips.

Regards!

Yogi Anand
Yogi Anand is offline   Reply With Quote
Old Apr 22nd, 2002, 07:43 PM   #6
Jay Petrulis
MrExcel MVP
 
Jay Petrulis's Avatar
 
Join Date: Mar 2002
Location: Chicago, IL USA
Posts: 2,042
Default

Hi Yogi,

No credit to me. I have looked at a lot of Harlan's posts in great detail. He is awesome.

He is the only one I know that can hang with the elite of the computer scientists, the engineers, the finance gurus and the statisticians. I have no idea about his background but he knows his stuff. (He also fights with everybody, but nobody's perfect!)

Bye,
Jay
Jay Petrulis is offline   Reply With Quote
Old Apr 22nd, 2002, 09:37 PM   #7
Shumande
New Member
 
Join Date: Apr 2002
Posts: 7
Default

Jay and Yogi, thanks so much for the ideas. I genuinely appreciate it.
Shumande is offline   Reply With Quote
Old Apr 22nd, 2002, 10:49 PM   #8
Aladin Akyurek
MrExcel MVP
 
Aladin Akyurek's Avatar
 
Join Date: Feb 2002
Location: The Hague
Posts: 50,317
Default

Quote:
On 2002-04-22 16:37, Shumande wrote:
Hi,

I have a list of five hundred companies on a spreadsheet called "Companies" and another spreadsheet called "Public" with all 7,000+ public companies and their associated ticker symbols. I'm trying to match on the company names of both lists (column A of both spreadsheets) and return the associated ticker symbol. However, the names don't match precisely. Some names have ", Inc." (with a comma and space) or ", Corp." at the end, while some don't have any suffix or have “Inc.” without a comma. To make matters more complicated, some have the word "Corporation" spelled out completely instead of the ", Corp." abbreviation. For example, on one list Microsoft is simply "Microsoft", while on the other list it's called "Microsoft Corporation".

Using “true” or 1 in the fourth parameter of VLOOKUP doesn’t help. The matches I get are close, but no cigar. Excel spits out a match but not the right one.

Any ideas on how to solve this without resorting to brute force, manual lookups?

Thanks
Also have a look at:

http://www.mrexcel.com/wwwboard/messages/8546.html

for a possible approach.
Aladin Akyurek is offline   Reply With Quote
Reply

Bookmarks

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is On

Forum Jump


All times are GMT -4. The time now is 06:36 AM.


Powered by vBulletin® Version 3.8.7
Copyright ©2000 - 2012, vBulletin Solutions, Inc.
All contents Copyright 1998-2012 by MrExcel Consulting.
diabetic desserts recipes recipes Diabetic Soups Holiday Pizza Recipes Popcorn Recipes Recipes For Microwave Pasta Recipes Casserole Recipes Chili Recipes Curry Recipes Crockpot Recipes Apples Recipes Bread Recipes Vegetarian Recipes Vegetable recipes Desserts Recipes Appetizers Ethnic Recipes Meat Dishes Barbecue Recipes Sauces Recipes Marinade Recipes Low Fat Recipes Frugal Gourmet Kitchen Classics Recipes On The Grill Cook Books Seafood Recipes Cajun Recipes Breads Low Fat Low Fat Breads Bread Machine Recipes Yeast Breads Quick Breads Fat Free Vegetarian Salad Recipes Eggplant Recipes Radish Recipes Tomato Recipes Jalapeno Recipes Potato Recipes Lettuce Recipes Cabbage Recipes Beans Ambrosia Recipes Biscotti Recipes Desserts Low Fat Cookie Recipes Cheesecake Recipes Cake Recipes Pie Recipes Muffin Recipes Custard Recipes Best Appetizers Appetizers Low Fat Salsa Recipes Dip Recipes International Recipes Afghan Recipes Alaska Recipes French Recipes German Recipes Greek Recipes Italian Recipes Spanish Recipes Thai Recipes Korean Recipes Chinese Recipes Mexican Recipes Indian Recipes Beef Recipes Pork Pork & Ham Pork Butts Pork Chop Recipes Pork Ribs Rulled Pork Poultry Recipes Stews Recipes Ground Beef Barbecue Grill Barbecue Smoker All Purpose Sauce BBQ Sauce Barbecue Sauce Carolina BBQ Sauce Pickle Recipes Marinades Smoking Low Fat Appetizers & Dips Low Fat Breakfast Low Fat Cakes Low Fat Cheesecakes Low Fat Cookies Low Fat Desserts Low Fat Fish & Seafood Low Fat Meats Low Fat Pasta Low Fat Pies Low Fat Salads Low Fat Sandwiches Low Fat Sauces & Condiments Low Fat Sides Low Fat Soups Low Fat Vegetarian Baker's Dozen Taste of Home Recipe Book Bon Appetit Cookbook Blacktie Cookbook Buster Cook Book Cookbook USA Cook Book Cook Book Sara's Cookbook Sara's Cookbook Appetizers and Dips Poultry recipes Diabetic recipes Holiday recipes Miscellaneous recipes 110 recipes 1986 Usenet cookbook 2900 recipes Cyberrealm recipes Great sysops of world Specialty recipes Ceideburg recipes Cheese recipes Chili recipes Fruits recipes Garlic recipes Great chefs of NY Londontowne recipes Raisins recipes Recipes for kids US Food Vegetarian recipes Bread recipes Drinks Meat Dishes Brisket recipes Caribou recipes Chicken recipes Filet mignons recipes Pork recipes Swordfish recipes Turkey recipes Pasta recipes Uncategorized recipes Ethnic recipes Canada recipes English recipes Ethiopia recipes Germany recipes Greece recipes Mexican recipes Philippines recipes Welsh recipes Microwave recipes Soups recipes Vegetable recipes Asparagus recipes Barley recipes Brown rice recipes Lentil recipes Mushrooms recipes Salads recipes Wild rice Desserts recipes Cakes recipes Chocolate recipes Cookies recipes Ice cream recipes