Page 1 of 10 123 ... LastLast
Results 1 to 10 of 95

Thread: Extract multiple keywords from text string
Thanks Thanks: 0 Likes Likes: 0

  1. #1
    Board Regular
    Join Date
    Jan 2012
    Posts
    76
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Extract multiple keywords from text string

    I have searched the forum but I can't find anything that quite matches this.

    This is for a healthy eating project. Students record their meals in a daily diary. This is random unformatted text in a single cell (B1, B2 etc). There is a keyword list of healthy foods, each item being in a separate cell. What I would like to do is search the random text for occurrences of the keywords and return the keywords in another cell adjacent to the text cell. I would then like to be able to search the returned cells by the keyword list.

    So:

    Keywords (each in a separate cell, but doesn't have to be in Column A):

    A1 Apple
    A2 Fries
    A3 Salad
    A4 Burger
    etc

    Text (in B1)
    Today I ate a burger with fries, and had an apple afterwards.

    Result (in C1)
    Apple Fries Burger [order is not important]

    C1 to C20 (etc) will be the searchable data. I want to be able to search this by each keyword in the range A1:A4, ie 'Apple', 'Fries', 'Salad' etc so I can see who has been eating Apples, Fries, etc. Using column filters will display the contents of every cell, so if some comedian enters the whole range A1:A4 (which will actually be much larger) the filter will also return the whole range, so I need an alternative method.

    I possible I would like to do this by a formula rather than VBA as I have to hand this over to someone who will not understand VBA, and can add to or alter the contents of the lookup range (A1:A4) simply by adding to it or overtyping the existing contents.

    Thank you for your help.

  2. #2
    Board Regular
    Join Date
    Dec 2002
    Posts
    1,931
    Post Thanks / Like
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extract multiple keywords from text string

    Try,

    In C1 enter:

    =TRIM(IF(ISNUMBER(SEARCH(A1,B1)),A1,"")&" "&IF(ISNUMBER(SEARCH(A2,B1)),A2,"")&" "&IF(ISNUMBER(SEARCH(A3,B1)),A3,"")&" "&IF(ISNUMBER(SEARCH(A4,B1)),A4,""))

    Regards

  3. #3
    Board Regular
    Join Date
    Jan 2012
    Posts
    76
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extract multiple keywords from text string

    Thank you very much for the quick reply!

    I tried copying this down column C (ie to C2, C3, C4 etc) and it didn't work, so I fixed the references after the double brackets with $ signs eg (A1,B1)),$A$1,"") etc, and this didn't work either.

    The keyword column could have 20 or more entries, how do I alter the formula to accommodate the range? When I tried to do this myself, I was wondering if a range statement eg '$A$1:$A$20' should be in there somewhere to capture the keyword criteria.

  4. #4
    Board Regular
    Join Date
    Dec 2002
    Posts
    1,931
    Post Thanks / Like
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extract multiple keywords from text string

    Try this to copy down,

    =TRIM(IF(ISNUMBER(SEARCH($A$1,B1)),$A$1,"")&" "&IF(ISNUMBER(SEARCH($A$2,B1)),$A$2,"")&" "&IF(ISNUMBER(SEARCH($A$3,B1)),$A$3,"")&" "&IF(ISNUMBER(SEARCH($A$4,B1)),$A$4,""))

    You can nest the formula by 20s in the way of : .....&" "&IF(ISNUMBER(SEARCH($A$5,B1)),$A$5,"").....until 20.....&" "&IF(ISNUMBER(SEARCH($A$20,B1)),$A$20,""))

    Regards

  5. #5
    MrExcel MVP Aladin Akyurek's Avatar
    Join Date
    Feb 2002
    Location
    The Hague, NL
    Posts
    85,117
    Post Thanks / Like
    Mentioned
    44 Post(s)
    Tagged
    9 Thread(s)

    Default Re: Extract multiple keywords from text string

    Row\Col
    A
    B
    C
    1
    Apple Today I ate a burger with fries, and had an apple afterwards. Apple, Fries, Burger
    2
    Fries
    3
    Salad
    4
    Burger
    5


    In C1 control+shift+enter, not just enter:

    =REPLACE(aconcat(IF(ISNUMBER(SEARCH($A$1:$A$4,B1)),", "&$A$1:$A$4,""),""),1,2,"")

    For this formula to work, you need to add the VBA code for ACONCAT to your workbook as a module, using Alt+F11...

    Function aconcat(a As Variant, Optional sep As String = "") As String
    ' Harlan Grove, Mar 2002
    Dim y As Variant
    If TypeOf a Is Range Then
    For Each y In a.Cells
    aconcat = aconcat & y.Value & sep
    Next y
    ElseIf IsArray(a) Then
    For Each y In a
    aconcat = aconcat & y & sep
    Next y
    Else
    aconcat = aconcat & a & sep
    End If
    aconcat = Left(aconcat, Len(aconcat) - Len(sep))
    End Function
    Assuming too much and qualifying too much are two faces of the same problem.

  6. #6
    Board Regular
    Join Date
    Jan 2012
    Posts
    76
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extract multiple keywords from text string

    Thank you both, but I think I may have overcomplicated the question. The basic criteria is that this must be capable of being maintained by someone who has little or not Excel knowledge, other than entering text in a cell (incredibly, such people do exist!)

    So thinking around the problem, what I want to do is:

    1. Create a list of keywords in A1:A20.
    2. Enter unformatted text in B1, B2, to B100 (for the sake of this example)
    3. Return results of match of keywords in A1:A20 in C1, C2, etc to C100.

    So, spreading this out:

    1. Check the text string in B1 for the ONE text string in A1, and return this in D1.
    2. Check the text string in B1 for the ONE text string in A2, and return this in E1, and continue so that D1, E1, F1 etc are the matches against A1:A20 in B1 (ie column D is headed with contents of A1, E with contents of A2, F = A3 etc).
    3. Repeat for entire range A1:A20 against B2 to B100.
    4. Concatenate results of D1, E1, F1, etc in C1 (ie next to the unformatted text) to show all the matches of A1:A20 in B1. This solves the 'how do I extract the keywords in A1:A20 into C1' qustion. The next bit is harder:
    5. Enable search / match / filter function of some sort across range D1:W100 (ie 20 columns corresponding to A1:A20) by the number of rows of data entered (B1 to B100) so that if the word 'Apple' is entered as search criteria, it filters the rows to show all the instances of Apple in B1 to B100. (This is the tricky bit).


    Sorry if this confuses things, but I have got lots of empty columns to play with!

    Many thanks

    Hugh

  7. #7
    MrExcel MVP
    Moderator
    Peter_SSs's Avatar
    Join Date
    May 2005
    Location
    Macksville, Australia
    Posts
    40,976
    Post Thanks / Like
    Mentioned
    90 Post(s)
    Tagged
    21 Thread(s)

    Default Re: Extract multiple keywords from text string

    Edit: I hadn't seen your most recent post before submitting mine. I'll review mine once I've had a chance to study your last post.

    I think you would be starting to see from bosco_yip's suggestion that dealing with, say, 20 keywords a standard formula would also be pretty horrendous to leave to somebody else later. In fact that formula would have to get even longer to work properly. With the keyword sample list given & text "Today I ate a burger with fries, and had a pineapple afterwards." the formula returns the keyword "Apple", even though none was eaten.

    In fact that issue is the same with Aladin's formula at the moment so it also needs a tweak.

    My suggestion below, which also uses vba, has something of the opposite problem - refer row 5 of my screen shot where mine did not return Apple because that 'word' was not found in the text.

    There will also be the problem of other plurals where the initial ending changes. for example you may have "Cherry" in your list, so you would probably also need "Cherries"

    All a minefield really.

    Anyway, this is my attempt, also using a user-defined function. To implement ..

    1. Right click the sheet name tab and choose "View Code".

    2. In the Visual Basic window use the menu to Insert|Module

    3. Copy and Paste the code below into the main right hand pane that opens at step 2.

    4. Close the Visual Basic window.

    5. Enter the formula as shown in cell C1 in the screen shot below and copy down.

    6. If using Excel 2007 or later your workbook will need to be saved as a macro-enabled workbook (*.xlsm)

    Code:
    Function Keywords(s As String, rWords As Range) As String
      Dim kw As Object
      Dim i As Long
      
      With CreateObject("VBScript.RegExp")
        .IgnoreCase = True
        .Global = True
        .Pattern = "\b(" & Join(Application.Transpose(rWords), "|") & ")\b"
        If .test(s) Then
          Set kw = .Execute(s)
          For i = 1 To kw.Count
            Keywords = Keywords & ", " & kw(i - 1)
          Next i
        End If
      End With
      Keywords = Mid(Keywords, 3)
    End Function
    For comparison of where all the suggestions may have issues, I've included Aladin's and bosco's suggestions in columns D & E respectively.

    Sheet1

     ABCDE
    1AppleToday I ate a burger with fries, and had an apple afterwards.burger, fries, appleApple, Fries, BurgerApple Fries Burger
    2FriesToday I had eggs   
    3Saladapples AppleApple
    4BurgerSalad, pineapple & hamSaladApple, SaladApple Salad
    5 I ate 4 apples AppleApple
    6     

    Spreadsheet Formulas
    CellFormula
    C1=Keywords(B1,$A$1:$A$4)
    D1{=REPLACE(aconcat(IF(ISNUMBER(SEARCH($A$1:$A$4,B1)),", "&$A$1:$A$4,""),""),1,2,"")}
    E1=TRIM(IF(ISNUMBER(SEARCH($A$1,B1)),$A$1,"")&" "&IF(ISNUMBER(SEARCH($A$2,B1)),$A$2,"")&" "&IF(ISNUMBER(SEARCH($A$3,B1)),$A$3,"")&" "&IF(ISNUMBER(SEARCH($A$4,B1)),$A$4,""))
    Formula Array:
    Produce enclosing
    { } by entering
    formula with CTRL+SHIFT+ENTER!


    Excel tables to the web >> Excel Jeanie HTML 4
    Hope this helps, good luck.
    Peter
    Excel 365 - Windows 10
    - Want to help your helpers by posting a small, copyable, screen shot directly in your post? Look here
    - If posting vba code, please use Code Tags - like this [code]Paste code here[/code] - or use the # key in the Reply window
    - Read: Forum Rules, Forum Use Guidelines, & FAQ

  8. #8
    Board Regular
    Join Date
    Dec 2002
    Posts
    1,931
    Post Thanks / Like
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extract multiple keywords from text string

    Quote Originally Posted by HughT View Post
    Thank you both, but I think I may have overcomplicated the question. The basic criteria is that this must be capable of being maintained by someone who has little or not Excel knowledge, other than entering text in a cell (incredibly, such people do exist!)

    So thinking around the problem, what I want to do is:

    1. Create a list of keywords in A1:A20.
    2. Enter unformatted text in B1, B2, to B100 (for the sake of this example)
    3. Return results of match of keywords in A1:A20 in C1, C2, etc to C100.

    So, spreading this out:

    1. Check the text string in B1 for the ONE text string in A1, and return this in D1.
    2. Check the text string in B1 for the ONE text string in A2, and return this in E1, and continue so that D1, E1, F1 etc are the matches against A1:A20 in B1 (ie column D is headed with contents of A1, E with contents of A2, F = A3 etc).
    3. Repeat for entire range A1:A20 against B2 to B100.
    4. Concatenate results of D1, E1, F1, etc in C1 (ie next to the unformatted text) to show all the matches of A1:A20 in B1. This solves the 'how do I extract the keywords in A1:A20 into C1' qustion. The next bit is harder:
    5. Enable search / match / filter function of some sort across range D1:W100 (ie 20 columns corresponding to A1:A20) by the number of rows of data entered (B1 to B100) so that if the word 'Apple' is entered as search criteria, it filters the rows to show all the instances of Apple in B1 to B100. (This is the tricky bit).


    Sorry if this confuses things, but I have got lots of empty columns to play with!

    Many thanks

    Hugh
    In D1, enter formula, copy across to W1 and all copy down to 100 rows :

    =IF(ISNUMBER(SEARCH(INDEX($A$1:$A$20,COLUMNS($A1:A1)),$B1)),INDEX($A$1:$A$20,COLUMNS($A1:A1)),"")

    In C1, enter formula and copy down :

    =TRIM(CONCATENATE(D1," ",E1," ",F1," ",G1," ",H1," ",I1," ",J1," ",K1," ",L1," ",M1," ",N1," ",O1," ",P1," ",Q1," ",R1," ",S1," ",T1," ",U1," ",V1," ",W1))

    Regards

  9. #9
    Board Regular
    Join Date
    Jan 2012
    Posts
    76
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extract multiple keywords from text string

    Brilliant!

    Thank you very much. It all works beautifully. I now have a 20 column x 100 row table each column headed (for convenience) with the keyword from the index range, and beneath that a cell containing the same word wherever that word is also contained in the freetext box, all concatenated into one cell. But how do I perform a search across this data, for example to find all occurrences of the word 'Burger' when a burger always has another item concatenated with it such as fries, salad etc?

    Very many thanks.

  10. #10
    MrExcel MVP
    Moderator
    Peter_SSs's Avatar
    Join Date
    May 2005
    Location
    Macksville, Australia
    Posts
    40,976
    Post Thanks / Like
    Mentioned
    90 Post(s)
    Tagged
    21 Thread(s)

    Default Re: Extract multiple keywords from text string

    Quote Originally Posted by HughT View Post
    I now have a 20 column x 100 row table each column headed (for convenience) with the keyword from the index range, ...
    In that case, you can use them (headers) for a much simpler formula in your columns.

    Note that you still have the issue of the apple showing up when pineapple is in the text & apple is not, but you seem happy to accept that. (same issue could occur with "ham" and "hamburger" etc)

    Formula in D2 is copied across and down.

    Keywords

     BCDEFG
    1  AppleFriesSaladBurger
    2Today I ate a burger with fries, and had an apple afterwards. AppleFries Burger
    3Today I had eggs     
    4apples Apple   
    5Salad, pineapple & ham Apple Salad 
    6I ate 4 apples Apple   

    Spreadsheet Formulas
    CellFormula
    D2=IF(SEARCH(D$1,$B2&D$1)<LEN($B2),D$1,"")


    Excel tables to the web >> Excel Jeanie HTML 4
    Hope this helps, good luck.
    Peter
    Excel 365 - Windows 10
    - Want to help your helpers by posting a small, copyable, screen shot directly in your post? Look here
    - If posting vba code, please use Code Tags - like this [code]Paste code here[/code] - or use the # key in the Reply window
    - Read: Forum Rules, Forum Use Guidelines, & FAQ

Some videos you may like

User Tag List

Tags for this Thread

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •