Extract multiple keywords from text string

HughT

Board Regular
Joined
Jan 6, 2012
Messages
113
Office Version
  1. 365
Platform
  1. Windows
I have searched the forum but I can't find anything that quite matches this.

This is for a healthy eating project. Students record their meals in a daily diary. This is random unformatted text in a single cell (B1, B2 etc). There is a keyword list of healthy foods, each item being in a separate cell. What I would like to do is search the random text for occurrences of the keywords and return the keywords in another cell adjacent to the text cell. I would then like to be able to search the returned cells by the keyword list.

So:

Keywords (each in a separate cell, but doesn't have to be in Column A):

A1 Apple
A2 Fries
A3 Salad
A4 Burger
etc

Text (in B1)
Today I ate a burger with fries, and had an apple afterwards.

Result (in C1)
Apple Fries Burger [order is not important]

C1 to C20 (etc) will be the searchable data. I want to be able to search this by each keyword in the range A1:A4, ie 'Apple', 'Fries', 'Salad' etc so I can see who has been eating Apples, Fries, etc. Using column filters will display the contents of every cell, so if some comedian enters the whole range A1:A4 (which will actually be much larger) the filter will also return the whole range, so I need an alternative method.

I possible I would like to do this by a formula rather than VBA as I have to hand this over to someone who will not understand VBA, and can add to or alter the contents of the lookup range (A1:A4) simply by adding to it or overtyping the existing contents.

Thank you for your help.
 

Excel Facts

Bring active cell back into view
Start at A1 and select to A9999 while writing a formula, you can't see A1 anymore. Press Ctrl+Backspace to bring active cell into view.
Aladin
That no longer returns "fries" from the sample string.

Do we have an exhaustive list of keywords and punctuation marks that we can come across in the target string? Case by case evaluations are not very helpful regarding this class of problems.
 
Upvote 0
Gentlemen, many thanks for your help with this. I am in awe!

If it helps, the keywords will populate column A (in this case A2:A21, because row 1 is the column headers) and will simply be text strings without punctuation, eg apple burger etc. It is possible that the text string could be two words eg red grapes etc but this is not a deal breaker if it presents problems. I have made the column headers equal the keyword cells in A2:A21 so that D1=A2, E1=A3 etc, ie if the keyword is changed the column header for the lookup will change to the identical text.

Many thanks again
 
Upvote 0
Gentlemen, many thanks for your help with this. I am in awe!

If it helps, the keywords will populate column A (in this case A2:A21, because row 1 is the column headers) and will simply be text strings without punctuation, eg apple burger etc. It is possible that the text string could be two words eg red grapes etc but this is not a deal breaker if it presents problems. I have made the column headers equal the keyword cells in A2:A21 so that D1=A2, E1=A3 etc, ie if the keyword is changed the column header for the lookup will change to the identical text.

Many thanks again

Care to post these keywords and some of the difficult target strings?
 
Upvote 0
If it helps, the keywords will populate column A ... and will simply be text strings without punctuation,..
But your users are entering text in column B and they may use punctuation, which mostly causes problems with this sort of word-matching.

I would feel more comfortable if you responded about the apple/pineapple type problem too. So far you seem to have ignored it but it is quite possible to a a healthy word turn up in a very unhealthy one (eg nuts / doughnuts) and the formula solutions that you seem to be accepting as doing what you want will return the row containing the unhealthy food. :eek:


If the above can be solved then a list of rows of interest can be generated by formula.
In the example below (still showing the apple/pineapple issue) I have a smaller data set so have just hidden 4 columns. In cell J1 you could set up a Data Validation to choose a value from A1:A?? (or from D1:??1).
Then either of the K2 or L2 formulas copied down as far as you might ever need.
The K2 formula requires Excel 2010 or later but is a standard-entry formula.
The L2 formula will work in all versions but is an array formula. It should be entered without the {} but confirmed with Ctrl+Shift+Enter, not just Enter. If confirmed correctly, Excel will insert the {}. The formula can then be copied down.
For either K2 or L2 formula, the $1000 needs to be larger than the number of rows that will ever be used in columns B, C, D, ....

Excel Workbook
ABCHIJKL
1Salad
2AppleToday I ate a burger with fries, and had an apple afterwards.Apple Fries BurgerSalad, pineapple & hamSalad, pineapple & ham
3FriesToday I had eggsTuna salad & friesTuna salad & fries
4SaladapplesApple
5BurgerSalad, pineapple & hamApple Salad
6Tuna salad & friesFries Salad
7Junk food
8I ate 4 applesApple
9
10
Keywords
 
Last edited:
Upvote 0
Seems like this thread has died but I was just dealing with a similar issue in another thread & I think this is a way to deal with both the apple/pineapple and the punctuation issues. It does not address plurals.

Excel Workbook
ABCDEF
1AppleFriesSaladBurger
2Today I ate a burger with fries, and had an apple afterwards.AppleFriesBurger
3Today I had eggs
4Salad, pineapple & hamburgerSalad
5I ate 4 apples
6I ate an apple!Apple
Keywords
 
Upvote 0
Seems like this thread has died but I was just dealing with a similar issue in another thread & I think this is a way to deal with both the apple/pineapple and the punctuation issues. It does not address plurals.

Keywords

ABCDEF
1 AppleFriesSaladBurger
2Today I ate a burger with fries, and had an apple afterwards. AppleFries Burger
3Today I had eggs
4Salad, pineapple & hamburger Salad
5I ate 4 apples
6I ate an apple! Apple

<colgroup><col style="font-weight:bold; width:30px; "><col style="width:363px;"><col style="width:56px;"><col style="width:61px;"><col style="width:56px;"><col style="width:56px;"><col style="width:56px;"></colgroup><tbody>
</tbody>

Spreadsheet Formulas
CellFormula
C2=IF(IFERROR(AND(ABS(77.5-CODE(MID(UPPER(1&$A2),SEARCH(C$1,$A2),1)))>13,ABS(77.5-CODE(MID(UPPER($A2&1),SEARCH(C$1,$A2)+LEN(C$1),1)))>13),0),C$1,"")

<tbody>
</tbody>

<tbody>
</tbody>


Excel tables to the web >> Excel Jeanie HTML 4

It is a very good idea to identify punctuation marks at the end of the search word by converting the whole searched string to Uppercase and see if the end of the potential match changes or not.

To further develop this method, consider the sentences below where apple and egg is not extracted:

I like pineapple, but not apple.
I like applesauce, but not apple itself.
I gave the beggar an egg.
 
Upvote 0
To further develop this method, consider the sentences below where apple and egg is not extracted:

I like pineapple, but not apple.
I like applesauce, but not apple itself.
I gave the beggar an egg.
István

Thanks for pointing out the (now obvious) flaw in my latest suggestion. For a moment I thought that I was onto something with that. :(

Oh well, back to the drawing board - or the vba.
 
Upvote 0
Dear all

Very many thanks for your continued help, and apologies for my absence. I have been out of the office for most of the past two weeks and unable to catch up. I will look at your kind responses in the next few days and try some suggestions out.

Hugh
 
Upvote 0

Forum statistics

Threads
1,213,527
Messages
6,114,142
Members
448,551
Latest member
Sienna de Souza

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top