Too hard basket (comparing text strings)

Posted by Carl on June 27, 2001 6:33 PM

Just wondering whether the problem in the 'comparing text strings' and follow up postings below is in the too hard basket or whether anyone has any ideas? Thanks for your help thus far Mark W.

Thanks,
Carl.

Posted by Damon Ostrander on June 27, 2001 11:38 PM

Hi Carl,

Are you interested in entertaining VBA-based solutions? It appears to me that it would be simple to write a VBA function that given two strings would yield the number of "orthographic pairs."

Damon

Posted by Aladin Akyurek on June 27, 2001 11:51 PM

Hi Damon,

Indeed, it's hard to find a formula-based solution for Carl's problem (because of repeats of letters in strings that you have to compare). And, if there is such a solution, it's probably not worth the effort.
I was considering to propose Carl to look for a VBA approach.

Posted by Damon Ostrander on June 28, 2001 9:25 AM

Re:comparing text string: Here's a VBA function that does it

Here is a little VBA function that does it. From Excel you would just use it in a cell:

=NUMOPAIRS(C5,D5)

to compare the strings in cells C5 and D5.

This fuction must be put in a macro module (NOT into the worksheet's event code area).

'______________________________________________

Function NumOPairs(A As String, B As String) As Integer

' Returns the number of "Orthographic pairs" between strings A and B
' Author: Damon Ostrander
' Date: 6/28/02

Dim chA As String * 1
Dim chB As String * 1

NumOPairs = 0

For i = 1 To Len(A)
chA = Mid(A, i, 1)
For j = 1 To Len(B)
chB = Mid(B, j, 1)
If chA = chB Then
NumOPairs = NumOPairs + 1
'Remove this character from B string
B = Left(B, j - 1) & Right(B, Len(B) - j)
Exit For
End If
Next j
Next i

End Function

Posted by Aladin Akyurek on June 28, 2001 9:35 AM

Re:comparing text string: Here's a VBA function that does it

Damon,

Does it count the letters common to both strings?

For example, the count must be 3 if ROOF and ROTOR are the strings of interest. The count for PIT and TIT is 2.

If not, would you expand the code to return the count of the common letters?

NumOPairs = 0 chA = Mid(A, i, 1) For j = 1 To Len(B) chB = Mid(B, j, 1) If chA = chB Then NumOPairs = NumOPairs + 1 'Remove this character from B string B = Left(B, j - 1) & Right(B, Len(B) - j) Exit For End If Next j Next i

Posted by Aladin Akyurek on June 28, 2001 9:55 AM

Perfect

Damon,

Just tried out your code. It produces a count of letters common to both strings. So forgot the question of the previous post. :)

I for one feel relieved. Thanks Damon.

Aladin NumOPairs = 0 chA = Mid(A, i, 1) For j = 1 To Len(B) chB = Mid(B, j, 1) If chA = chB Then NumOPairs = NumOPairs + 1 'Remove this character from B string B = Left(B, j - 1) & Right(B, Len(B) - j) Exit For End If Next j Next i

Posted by Damon Ostrander on June 28, 2001 2:25 PM

Re:comparing text string: Here's a VBA function that does it

No, not really the common letters. I think Carl (the original requestor) described it as matching up pairs of letters. When a pair is found, those letters are "crossed off" and cannot be counted again. Thus PIT and TIT should be 2 because after an I and T in each word is matched and crossed off, there are only a P and T left, which do not match. I believe this accomplished what Carl had in mind. I believe that one of the earlier posted answers to Carl's original posting provided a method for determining the count of common letters that doesn't resort to VBA code. It would be easy to re-write this to find all the common letters. In your example, though, the common letters in ROOF and ROTOR are only 2 as the only common letters are R and O, unless it is defined as all combinations, permutations, etc., of the common letters.

Damon

Posted by Aladin Akyurek on June 28, 2001 3:00 PM

Re:comparing text string: Here's a VBA function that does it

Yeah. You're right. Common should be understood as types and not as tokens. Mark's formula determines perfectly the number of types of letters the two strings share. I've been also looking at a single formula solution (for tokens) to no avail. Given the fact that (array) formulas cannot memorize by assignment, to produce a count of shared tokens is a hard case. It was instructive in that I now know something more about countif which does not seem to accept certain arrays as its 2nd arg.

I've got another case that has a certain similarity to Carl's, where I might need to call you in.

==================

Posted by Carl W on June 28, 2001 6:52 PM

Re:comparing text strings: Thanks a million Damon, Aladin and Mark W

Thanks a million. She's up and running and she's beautiful. Hope to see it as a function in the next version of excel!! Thanks to all (esp. Damon for cracking it) for all their efforts. The previous type formulas will also prove very helpful in our work. Thanks Mark.

Cheers,
Carl.

Yeah. You're right. Common should be understood as types and not as tokens. Mark's formula determines perfectly the number of types of letters the two strings share. I've been also looking at a single formula solution (for tokens) to no avail. Given the fact that (array) formulas cannot memorize by assignment, to produce a count of shared tokens is a hard case. It was instructive in that I now know something more about countif which does not seem to accept certain arrays as its 2nd arg. :