Finding words within words in a very long list!

NeilMcCauley

New Member
Joined
Aug 5, 2012
Messages
8
Imagine A1:A300000 contains a long list of words, like this:

Column A
1Applesauce22
2Bananas
3Antiapple
4Applesandpears

<tbody>
</tbody>

And I have a list of words I'd like to find within the above list, like this:

Column B
1apple
2pear

<tbody>
</tbody>

The list in column B might be a couple of thousand words long. In other words, I need to see if a couple of thousand words (B1:B2000) appear within about 300,000 cells (A1:A300000).

Right now, the closest I have come is using this formula: =LOOKUP(2^15,SEARCH(B$1:B$2000,A1),B$1:B$2000). However, it only returns one match (it would find 'Apple' in A4, but not notice that A4 also contained 'Pear'). The other key facts about this challenge are that words in column A aren't delimited in any way, they are hidden within single strings of text. That seems to be fine with the formula I just described, but maybe it would affect alternative solutions.

I'm guessing that Excel is not actually the best way to do this kind of task, because of the number of computations needed. However, I'm not technical enough to use Python or R or something. I was hoping somebody might be able to suggest a formula or module that could handle this tricky task. Perhaps the end result for the above example would be something like:

Column A (Long list)Column B (First word found)Column C (second word found)Lookup words (Words I'm looking for)
1Applesauce22AppleApple
2BananasPear
3AntiappleApple
4ApplesandpearsApplePear

<tbody>
</tbody>

Many thanks!

P.S. I'm using Excel 2013, on Windows 64-bit. :)
 

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
They could be anywhere. Imagine the words I'm looking through are passwords and I want to find any that contain the word 'apple'. I need to find and mark ones that are like 'Iloveapples32' or 'Apple' or 'Apple123'. So yeah, there is no set format. And if the password/phrase contains more than one word from my lookup table, it should be marked with each word found (if possible!). :)
 
Upvote 0
One thing I noticed with this formula =LOOKUP(2^15,SEARCH(B$2:B$10000,A4),B$2:B$10000) [which does the job but only finds the first match], is that when you expand the range to include 10,000 cells as opposed to just a few, it outputs ?????? instead of the usual #N/A or the matched word. Perhaps this type of formula is not equipped to do this kind of search across such a massive range?
 
Upvote 0

Forum statistics

Threads
1,215,343
Messages
6,124,402
Members
449,156
Latest member
LSchleppi

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top