How to extract a particular word(s) from a cell in excel that matches a list of values

ankitagarwal88

New Member
Joined
Dec 9, 2016
Messages
4
Hello everyone,
I have different text strings in each cell of coloumn A and wish to extract only certain word(s) from it that are present in a list of items I have noted down in coloumn B. Here is a brief example of a sample text string and the list of items:
Column A:
"Agricultural activities in Vietnam generate about 62milliontonnes of biomass (rice straw, rice husk, bagasse, corn cob, corn stover, etc.) annually. In this work, four different types of biomass from Vietnam, namely rice straw, rice husk, factory bagasse, and corn cob, have been studied as potential raw materials to produce bio-oil by fast pyrolysis technology. Test runs were conducted in a fluidized-bed reactor at a temperature of 500°C and residence time less than 2s. Size and moisture content of the feed were less than 2mm and 2%, respectively. It was found that yields of bio-oil as a liquid product obtained from pyrolysis of these feedstocks were more than 50% and that obtained from the bagasse was the highest. Bio-oil quality from Vietnamese biomass resources satisfies ASTM D7544-12 standard for pyrolysis liquid biofuels. These results showed the potential of using biomass in Vietnam to produce bio-oil which could be directly used as a combustion fuel or upgraded into transportation fuels and chemicals"

Column B:
List of items:
straw
stover
leaves
shell
peel
cob
piassava
stalk
root
tuber

I have used the following formula that works, but it extracts only the first matching in the text string namely 'straw'. But I wish to also obtain any other names that are present in the list of items and display them comma seperated.
=IFERROR(INDEX($B$1:$B$86,MATCH(1,COUNTIF($A1,"*"&$B$1:$B$86&"*"),0)),"")
Looking forward to a response.
Thanks in advance,
Ankit
 

Excel Facts

Difference between two dates
Secret function! Use =DATEDIF(A2,B2,"Y")&" years"&=DATEDIF(A2,B2,"YM")&" months"&=DATEDIF(A2,B2,"MD")&" days"
Try this UDF for your requirement:

Code:
Function Ankita(ByVal SearchText As String, ByVal WordstoFind As Range) As String
Dim word    As Range
Dim text    As Variant
Dim words   As Variant
Dim txt     As Variant
Dim output  As String
Dim dict    As Object

Set dict = CreateObject("scripting.dictionary")
text = Split(Trim(SearchText))

For Each word In WordstoFind
    For Each txt In text
        If Right(txt, 1) Like "[,.?!;]" Then txt = Left(txt, Len(txt) - 1)
            If StrComp(txt, word, vbTextCompare) = 0 Then
                dict.Item(word) = vbEmpty
            End If
    Next
Next
Ankita = Join(dict.keys(), ", ")
End Function

You can use it as shown below:

afbad3f3ae.png
 
Upvote 0
Dear Ombir,
Thank you very much for your response. At the same time as you posted this reply, I also got another solution for this question which I tested and worked very well:

"Public Function Extract_word(Str As String, List As Range) As String
Dim Cll As Range, Tmp As String
For Each Cll In List
If InStr(LCase(Str), LCase(Cll.Value)) > 0 Then
Tmp = Tmp & ", " & Cll.Value
End If
Next Cll
If Len(Tmp) > 0 Then Extract_word = Mid(Tmp, 3)


End Function "

It seems there are more than one way to skin the cat!! Thank you once again for this solution!

Ankit





Try this UDF for your requirement:

Code:
Function Ankita(ByVal SearchText As String, ByVal WordstoFind As Range) As String
Dim word    As Range
Dim text    As Variant
Dim words   As Variant
Dim txt     As Variant
Dim output  As String
Dim dict    As Object

Set dict = CreateObject("scripting.dictionary")
text = Split(Trim(SearchText))

For Each word In WordstoFind
    For Each txt In text
        If Right(txt, 1) Like "[,.?!;]" Then txt = Left(txt, Len(txt) - 1)
            If StrComp(txt, word, vbTextCompare) = 0 Then
                dict.Item(word) = vbEmpty
            End If
    Next
Next
Ankita = Join(dict.keys(), ", ")
End Function

You can use it as shown below:

afbad3f3ae.png
 
Upvote 0
The solution you've got has a drawback.

This function uses instr to find a string inside another string. It doesn't find exact match. For ex: If you add Stove to your list, it will return stove as found because instr found stove inside stover. Same goes for mass and biomass.
 
Upvote 0
The solution you've got has a drawback.

This function uses instr to find a string inside another string. It doesn't find exact match. For ex: If you add Stove to your list, it will return stove as found because instr found stove inside stover. Same goes for mass and biomass.

Hi Ombir,
That is true and I have found it to be a problem in the case of other categories to which I applied the code I mentioned.
I have used your code and it clears this problem, but it doesnt recognise a search term with two or more words in it : for example 'rice husk' or 'corn cob' .
Any modifications to the code you provided that can help clear this issue?
 
Upvote 0
Try this. I've used Instrexact function developed by Rick Rothstein long time ago for double words searching.

Use it like as you were using before =Ankita(SearchString,WordstoFind)

Code:
Function Ankita(ByVal SearchText As String, ByVal WordstoFind As Range) As String
Dim word    As Range
Dim text    As String
Dim output  As String

text = Trim(SearchText)

For Each word In WordstoFind
    If InStrExact(text, word.Value) > 0 Then output = output & ", " & word
        Next
Ankita = Mid(output, 2)
End Function
Function InStrExact(SourceText As String, WordToFind As String) As Long
  Dim x As Long, Str1 As String, Str2 As String, Pattern As String
    Str1 = UCase(SourceText)
    Str2 = UCase(WordToFind)
    Pattern = "[!A-Z0-9]"
  For x = 1 To Len(Str1) - Len(Str2) + 1
    If Mid(" " & Str1 & " ", x, Len(Str2) + 2) Like Pattern & Str2 & Pattern _
       And Not Mid(Str1, x) Like Str2 & "'[" & Mid(Pattern, 3) & "*" Then
      InStrExact = x
      Exit Function
    End If
  Next
End Function
 
Upvote 0
Thank you Ombir- that extra bit of code worked very well- thank you very much!!
Ankit

Try this. I've used Instrexact function developed by Rick Rothstein long time ago for double words searching.

Use it like as you were using before =Ankita(SearchString,WordstoFind)

Code:
Function Ankita(ByVal SearchText As String, ByVal WordstoFind As Range) As String
Dim word    As Range
Dim text    As String
Dim output  As String

text = Trim(SearchText)

For Each word In WordstoFind
    If InStrExact(text, word.Value) > 0 Then output = output & ", " & word
        Next
Ankita = Mid(output, 2)
End Function
Function InStrExact(SourceText As String, WordToFind As String) As Long
  Dim x As Long, Str1 As String, Str2 As String, Pattern As String
    Str1 = UCase(SourceText)
    Str2 = UCase(WordToFind)
    Pattern = "[!A-Z0-9]"
  For x = 1 To Len(Str1) - Len(Str2) + 1
    If Mid(" " & Str1 & " ", x, Len(Str2) + 2) Like Pattern & Str2 & Pattern _
       And Not Mid(Str1, x) Like Str2 & "'[" & Mid(Pattern, 3) & "*" Then
      InStrExact = x
      Exit Function
    End If
  Next
End Function
 
Upvote 0

Forum statistics

Threads
1,214,819
Messages
6,121,746
Members
449,050
Latest member
excelknuckles

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top