Need formula(s) to identify match between select words within text strings in 2 seperate cells

freeb1893

Board Regular
Joined
Jul 30, 2012
Messages
233
Office Version
  1. 365
Platform
  1. Windows
I'm trying to develop a formula where I can identify a match between text strings in separate cells, where the words inside the text string between the cells are not necessarily in the same order, and there may be other words in the text string that don't match.

There might be cases where the cells have the same name, but there are additional words in one of the two cells.

Here are some examples:

JOHN SMITH -----------------SMITH,JOHN A.
ARNOLD PALMER-------------PALMER JESSE ARNOLD
ALLEN STEVENS--------------STEVENS, JANE & ALLEN

Ideally, I would like a formula which identifies the 2 largest words in Cell A1, and then matches both those 2 largest words in Cell A1 to any of the words in Cell B1.

The words in Cell B1 may be separated by spaces or commas. (If we can only do a formula which looks at words separated only by spaces, I can remove the commas to make the formula work).

Any idea how to put this into a formula?
 
Last edited:

Excel Facts

How to create a cell-sized chart?
Tiny charts, called Sparklines, were added to Excel 2010. Look for Sparklines on the Insert tab.
This will return a match in the second column, but it assumes only two words in A1. Can you give some examples of what else might be in column A?

=INDEX($B$1:$B$3,MATCH(2,MMULT(ISNUMBER(SEARCH(CHOOSE({1,2},LEFT(A1,SEARCH(" ",A1)-1),RIGHT(A1,LEN(A1)-SEARCH(" ",A1))),$B$1:$B$3))*1,{1;1}),0))
 
Upvote 0
It could be more than 2 words, really as many as several. But it will always be the two largest words in Column A that I need to find/match in Column B. Column A might also have periods or commas at times within the string. It won't always be separated by spaces (but I can remove the commas/periods if its necessary to make the formula work).
This will return a match in the second column, but it assumes only two words in A1. Can you give some examples of what else might be in column A?

=INDEX($B$1:$B$3,MATCH(2,MMULT(ISNUMBER(SEARCH(CHOOSE({1,2},LEFT(A1,SEARCH(" ",A1)-1),RIGHT(A1,LEN(A1)-SEARCH(" ",A1))),$B$1:$B$3))*1,{1;1}),0))
 
Upvote 0
Hi.

Could you include perhaps half a dozen varied examples together with your expected results?

Regards
 
Upvote 0
Sure! Here are some examples of where a match should be identified by the formula:

KIEUNGAN DAO-------------------DAO,TUAN & KIEU N
YA QUIN ZHANG-------------------ZHANG,YA Q
VICKI MONTGOMERY-------------------MONTGOMERY,VICKI L
JOHN R MARSHALL-------------------MARSHALL JOHN R TRUST
LAURIE T RODERIQUES-------------------RODERIQUES,LAURIE T
STEVE BOLLAR-------------------BOLLAR,STEVE B & DEBRA A

Here are some examples of where a match should not be identified by the formula:

DEBORAH J GOEDEL-------------------ROBERTS,RAYMOND R & MAJORIE J
HUGH BENSON-------------------RAY,LISA
LUZ ELENA MONTES ALVARADO-------------------ALVAREZ LUZ M E
STANLEY COTTA-------------------DICKEY,BRUCE
ORANGE LLC-------------------CITIMORTGAGE INC




Hi.

Could you include perhaps half a dozen varied examples together with your expected results?

Regards
 
Upvote 0
Thanks, but I was hoping you could provide some examples in which your condition that the formula take into consideration the two largest words only is a factor.

Could you post a few such cases, e.g. in which taking two arbitrary words (as opposed to the two largest) might lead to undesired results?

Also, I thought we were matching against any of the words in column B? In all of your examples it seems rather fortunate that the matching word is in the exact same row.

Regards
 
Last edited:
Upvote 0
No I'm going to be matching to only the words in the adjacent cell in Column B, such as B1 to A1, B2 to A2, etc. The matching value will be in the exact same row.

Below would be some examples where if we matched just 2 arbitrary words as opposed to the 2 largest words, where the match would be undesired:

BRIAN T WOLTER-------------------WOLTER LOUIS T
SHANNON KAY WARREN-------------------WARREN,DENISE KAY
J KAREN OLIVIER-------------------OLIVIER,AUDREY J



Thanks, but I was hoping you could provide some examples in which your condition that the formula take into consideration the two largest words only is a factor.

Could you post a few such cases, e.g. in which taking two arbitrary words (as opposed to the two largest) might lead to undesired results?

Also, I thought we were matching against any of the words in column B? In all of your examples it seems rather fortunate that the matching word is in the exact same row.

Regards
 
Upvote 0
Sure! Here are some examples of where a match should be identified by the formula:

KIEUNGAN DAO-------------------DAO,TUAN & KIEU N

Thanks a lot. Much appreciated.

Can you just clarify the above example? This one is significantly different from the others in that, if words contained within others ("KIEU" within "KIEUNGAN") are to be considered, things are necessarily a lot more complicated.

Regards
 
Upvote 0
Ahh yes... that one is a tough one... I forgot about that one. I would love to be able to catch that example by identifying words within words if its possible. But if a formula with that kind of ability proves too difficult to develop, having a formula that would catch the rest of those except that scenario would be an alternative that is better than nothing for me and my purposes.

Even a formula where instead of matching the full 2 largest words in the first text string, I match the first 4 characters of the 2 largest words in the first text string (Cell A1), to the first 4 characters to any words in the second text string (Cell B1), that might work for my purposes as well.
Thanks a lot. Much appreciated.

Can you just clarify the above example? This one is significantly different from the others in that, if words contained within others ("KIEU" within "KIEUNGAN") are to be considered, things are necessarily a lot more complicated.

Regards
 
Upvote 0
I would love to be able to catch that example by identifying words within words if its possible.

I'm not sure how this is compatible with your other condition that only the two largest "words" be considered. After all, "KIEUNGAN" is not even a "word" within "KIEU".

That aside, try this array formula**:

=IF(COUNT(SEARCH(" "&INDEX(TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))),1+LEN(A1)*(ROW(INDEX(A:A,1):INDEX(A:A,1+LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))-1),LEN(A1))),N(IF(1,MATCH(LARGE(LEN(TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))),1+LEN(A1)*(ROW(INDEX(A:A,1):INDEX(A:A,1+LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))-1),LEN(A1)))),{1,2}),LEN(TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))),1+LEN(A1)*(ROW(INDEX(A:A,1):INDEX(A:A,1+LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))-1),LEN(A1)))),0))))&" "," "&SUBSTITUTE(B1,","," ")&" "))=2,"Match","No Match")

though I don't agree with your expected result for "YA QUIN ZHANG" and "ZHANG,YA Q" (the two largest words in the former do not appear in the latter).

Regards


**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
 
Last edited:
Upvote 0

Forum statistics

Threads
1,215,745
Messages
6,126,634
Members
449,324
Latest member
AmirMalik

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top