![]() |
![]() |
|
|||||||
| Excel Questions All Excel/VBA questions - formulas, macros, pivot tables, general help, etc. Please post to this forum in English only. |
![]() |
|
|
Thread Tools | Display Modes |
|
|
#1 |
|
New Member
Join Date: Apr 2002
Posts: 7
|
Hi,
I have a list of five hundred companies on a spreadsheet called "Companies" and another spreadsheet called "Public" with all 7,000+ public companies and their associated ticker symbols. I'm trying to match on the company names of both lists (column A of both spreadsheets) and return the associated ticker symbol. However, the names don't match precisely. Some names have ", Inc." (with a comma and space) or ", Corp." at the end, while some don't have any suffix or have “Inc.” without a comma. To make matters more complicated, some have the word "Corporation" spelled out completely instead of the ", Corp." abbreviation. For example, on one list Microsoft is simply "Microsoft", while on the other list it's called "Microsoft Corporation". Using “true” or 1 in the fourth parameter of VLOOKUP doesn’t help. The matches I get are close, but no cigar. Excel spits out a match but not the right one. Any ideas on how to solve this without resorting to brute force, manual lookups? Thanks |
|
|
|
|
|
#2 |
|
MrExcel MVP
Join Date: Mar 2002
Location: Chicago, IL USA
Posts: 2,042
|
Hi,
Clean your lists. Copy and then combine them, sort by ticker symbol and choose one asset description per ticker. Then, do a vlookup by ticker on the original lists. When you are satisfied, copy and pastespecial (values) on the original columns and you should be OK. Possibly a pivot table can be used. If not, you will be using Vlookup or Index/Match with something like "LEFT(cell,9)="Microsoft" which will become a big mess very quickly. Take the time to clean the data and you will be much happier in the long run, I believe. Bye, Jay |
|
|
|
|
|
#3 |
|
MrExcel MVP
Join Date: Mar 2002
Location: Michigan USA
Posts: 11,452
|
I recall use of soundex feature in such cases ... but I am fuzzy on this at the moment. Perhaps some Excel Gurus will shed some light on this!
|
|
|
|
|
|
#4 | |
|
MrExcel MVP
Join Date: Mar 2002
Location: Chicago, IL USA
Posts: 2,042
|
Quote:
Nice suggestion. I just searched google.com and found the following: http://www.myatt.demon.co.uk/soundex.htm Has the soundex application in Access VBA Also, Harlan Grove posted the following ----------------------------------------- Subject: Re: 'fuzzy' string comparison? Date: Wed, 21 Jun 2000 12:36:05 -0700 From: Harlan Grove Newsgroups: comp.apps.spreadsheets I had to try. This gives a score. Maybe too deterministic, and definitely too dumb with respect to common abbreviations. Function StrSimilar(s1 As String, s2 As String) As Double Dim i As Long, j As Long, k As Long, n(2) As Long Dim c1 As String, c2 As String Const alphanum As String = "123456789abcdefghijklmnopqrstuvwxyz" s1 = LCase(s1) For i = 1 To Len(s1) If Not InStr(alphanum, Mid(s1, i, 1)) > 0 Then Mid(s1, i, 1) = " " Next i s1 = Application.WorksheetFunction.Trim(s1) s2 = LCase(s2) For j = 1 To Len(s2) If Not InStr(alphanum, Mid(s2, j, 1)) > 0 Then Mid(s2, j, 1) = " " Next j s2 = Application.WorksheetFunction.Trim(s2) j = 1 n(1) = 0 For i = 1 To Len(s1) c1 = LCase(Mid(s1, i, 1)) k = 0 Do c2 = LCase(Mid(s2, j + k, 1)) k = k + 1 Loop Until j + k > Len(s2) Or c1 = c2 If c1 = c2 Then n(1) = n(1) + 1 If j < Len(s2) Then j = j + 1 Else Exit For End If Next i i = 1 n(2) = 0 For j = 1 To Len(s2) c2 = LCase(Mid(s2, j, 1)) k = 0 Do c1 = LCase(Mid(s1, i + k, 1)) k = k + 1 Loop Until i + k > Len(s1) Or c1 = c2 If c1 = c2 Then n(2) = n(2) + 1 If i < Len(s1) Then i = i + 1 Else Exit For End If Next j StrSimilar = CDbl(Application.WorksheetFunction.Min(n(1), n(2))) / _ CDbl(Application.WorksheetFunction.Max(Len(s1), Len(s2))) End Function ------------------------------------- Don't think this will particularly work for the OP, but it may be worth a shot. Bye, Jay [ This Message was edited by: Jay Petrulis on 2002-04-22 18:29 ] |
|
|
|
|
|
|
#5 |
|
MrExcel MVP
Join Date: Mar 2002
Location: Michigan USA
Posts: 11,452
|
Oh my goodness Jay:
You are on top of things buddy! ... looks like you have got things on your finger tips. Regards! Yogi Anand |
|
|
|
|
|
#6 |
|
MrExcel MVP
Join Date: Mar 2002
Location: Chicago, IL USA
Posts: 2,042
|
Hi Yogi,
No credit to me. I have looked at a lot of Harlan's posts in great detail. He is awesome. He is the only one I know that can hang with the elite of the computer scientists, the engineers, the finance gurus and the statisticians. I have no idea about his background but he knows his stuff. (He also fights with everybody, but nobody's perfect!) Bye, Jay |
|
|
|
|
|
#7 |
|
New Member
Join Date: Apr 2002
Posts: 7
|
Jay and Yogi, thanks so much for the ideas. I genuinely appreciate it.
|
|
|
|
|
|
#8 | |
|
MrExcel MVP
Join Date: Feb 2002
Location: The Hague
Posts: 50,317
|
Quote:
http://www.mrexcel.com/wwwboard/messages/8546.html for a possible approach. |
|
|
|
|
![]() |
| Bookmarks |
| Thread Tools | |
| Display Modes | |
|
|