Count number of keywords (range) found in a text string

liiamra

Board Regular
Joined
Mar 31, 2013
Messages
50
Dear All,

I am trying to count the number of specific words contained in a specific cell. If my data were static, and the list of keywords was short, there are many solutions such as using multiple instances of the length function as proposed by @shg here.

@njimack has proposed a very easy and helpful solution here, but it serves for listing the words and not counting them.

Suppose I have the following sheet: (column C is what I am trying to achieve)

ABC
1KeywordsCommentsCount
2happyLife is beautiful, you just need to stay happy and calm. staying happy is important.3
3stayno matter what you do, don't get upset!1
4smilesmile in the morning, smile in the evening, smile always3
5beautifulhello world, how is everyone doing? I am happy2
6upsetdon't be upset, cheer up.1
7hellosmile smile, smile. smile, smile5

<tbody>
</tbody>

That is, C2 will tell that B2 contained 3 of the keywords (1X beautiful, 2X happy) and so on.

Any recommendations are highly appreciated.

Thanks a lot
 

Excel Facts

Copy a format multiple times
Select a formatted range. Double-click the Format Painter (left side of Home tab). You can paste formatting multiple times. Esc to stop
Code:
Function wordcount(ByVal strWord As String, ByVal rngStrings As Range) As Double

Dim strTextString As String
Dim wordcounting() As String

Application.Volatile

For Each icell In rngStrings
    strTextString = strTextString & ";" & icell.Value
Next icell



wordcounting = Split(StrConv(strTextString, vbUpperCase), StrConv(strWord, vbUpperCase))

wordcount = UBound(wordcounting)

If wordcount = -1 Then wordcount = 0

End Function

reference cell would be =wordcount(a2,$B$2:$B$7) and so on
 
Upvote 0
Dear All,

I am trying to count the number of specific words contained in a specific cell. If my data were static, and the list of keywords was short, there are many solutions such as using multiple instances of the length function as proposed by @shg here.

@njimack has proposed a very easy and helpful solution here, but it serves for listing the words and not counting them.

Suppose I have the following sheet: (column C is what I am trying to achieve)

ABC
1KeywordsCommentsCount
2happyLife is beautiful, you just need to stay happy and calm. staying happy is important.3
3stayno matter what you do, don't get upset!1
4smilesmile in the morning, smile in the evening, smile always3
5beautifulhello world, how is everyone doing? I am happy2
6upsetdon't be upset, cheer up.1
7hellosmile smile, smile. smile, smile5

<tbody>
</tbody>

That is, C2 will tell that B2 contained 3 of the keywords (1X beautiful, 2X happy) and so on.

Any recommendations are highly appreciated.

Thanks a lot

Generally speaking, such tasks should be done completely in VBA if at all.

Add the following generic functions in VBA to your workbook, using Alt+F11...

Public Function EVAL(theInput As Variant) As Variant
'
' if UDF evaluate the input string as though it was on this sheet
' else evaluate for activesheet
'
Dim vEval As Variant
Application.Volatile
On Error GoTo funcfail
If Not IsEmpty(theInput) Then
If TypeOf Application.Caller.Parent Is Worksheet Then
vEval = Application.Caller.Parent.Evaluate(CStr(theInput))
Else
vEval = Application.Evaluate(CStr(theInput))
End If
If IsError(vEval) Then
EVAL = CVErr(xlErrValue)
Else
EVAL = vEval
End If
End If
Exit Function
funcfail:
EVAL = CVErr(xlErrNA)
End Function

Function RemovePunctuation(r As String) As String
' Adapted by Scott Huish, 8/23/2013
With CreateObject("vbscript.regexp")
.Pattern = "[^A-Za-z0-9\ ']"
.Global = True
RemovePunctuation = .Replace(r, "")
End With
End Function

Now invoke...

C2, copied down:
Rich (BB code):
=SUMPRODUCT((ISNUMBER(MATCH(eval("{"&CHAR(34)&
   SUBSTITUTE(removepunctuation(B2)," ",CHAR(34)&","&CHAR(34))&CHAR(34)&"}"),
   $A$2:$A$7,0))+0))
 
Upvote 0
Dear Anthony,

Many thanks for your assistance.

Actually, I thought I could use the method proposed by @njimack (link in the first post), and then use the count words function, but his function doesn't return repeated words and has some additional flows.

Your method works very well - many thanks, but what I need is exactly the opposite. That is, I need the "range" to be the keywords. ex: I need to know how many times the keyword(s) were mentioned in a given string.

Thanks again - really appreciate your help.
All Best//

 
Upvote 0
Dear Aladin,

Many many thanks for your help. Your function is really great, however I have run into one issue.

First, the function can handle really large number of keywords, i.e +2000. This is a very big advantage.

However, the function returns "0" when the string contains more than a certain number of characters. I couldn't really determine this number as it depends on the type of the character.

For example, if you try the following text, it wouldn't work. But if you remove the letter "v" from the end, it would. I really need it to work as the strings (text in cell) are large.

heart- smile in the morning, smile in the evenings, always == 5 I always believed in numbers and science, physics and metaphysics. I always believed that life will turn good. I've believed v

I wonder also if you can think of any other limitations that I might run into.

Again, thank you very much Aladin (really appreciate your assistance) and all best//
 
Upvote 0
Dear Aladin,

Many many thanks for your help. Your function is really great, however I have run into one issue.

First, the function can handle really large number of keywords, i.e +2000. This is a very big advantage.

However, the function returns "0" when the string contains more than a certain number of characters. I couldn't really determine this number as it depends on the type of the character.

For example, if you try the following text, it wouldn't work. But if you remove the letter "v" from the end, it would. I really need it to work as the strings (text in cell) are large.




I wonder also if you can think of any other limitations that I might run into.

Again, thank you very much Aladin (really appreciate your assistance) and all best//

I don't seem to have any problem with that string, i.e.:

heart- smile in the morning, smile in the evenings, always == 5 I always believed in numbers and science, physics and metaphysics. I always believed that life will turn good. I've believed v

I get a count of 2, related to "smile".
 
Upvote 0
Thanks Aladin for you reply,

My bad- I must have altered the contents before I posted. Please just add more text and you will see what I mean. I added the word "completely" and it returns "0" now. Please note that it just depends on the number of characters.

heart- smile in the morning, smile in the evenings, always == 5 I always believed in numbers and science, physics and metaphysics. I always believed that life will turn good. I've believed completely

Thanks again & all best//
 
Upvote 0
Thanks Aladin for you reply,

My bad- I must have altered the contents before I posted. Please just add more text and you will see what I mean. I added the word "completely" and it returns "0" now. Please note that it just depends on the number of characters.



Thanks again & all best//

The whoe string is 192 chars long and counts 31 words. The latter is also the number of array elements the formula works with (see the EVAL bit). May I ask you to check your Excel's help for "limits" in oder to assess the stuation.
 
Upvote 0
Here is a UDF (user defined function) that you can try. It takes two arguments, the first argument is the text string that you want look for the words in and the second is the range containing your list of words. So for the example you posted in Message #1, you would put this formula in cell C2 and copy it down to C7...

=WordListCount(B2,A$2:A$7)

Here is the code for the UDF...

Code:
Function WordListCount(TextToSearch As String, WordList As Range) As Long
  Dim WL As Variant, R As Long, C As Long, X As Long
  WL = WordList
  For R = 1 To UBound(WL)
    For C = 1 To UBound(WL, 2)
      For X = 1 To Len(TextToSearch)
        If Mid(" " & Ucase(TextToSearch) & " ", X, Len(WL(R, C)) + 2) _
                Like "[!0-9A-Z]" & Ucase(WL(R, C)) & "[!0-9A-Z]" Then
          WordListCount = WordListCount + 1
        End If
      Next
    Next
  Next
End Function
 
Upvote 0
Hi

Another option, assuming the keywords are in a contiguous vertical range. Ex.:

=KeywordCount(B1,A1:A10)

Code:
Function KeywordCount(sTextToSearch As String, rKeywords As Range) As Long
  
With CreateObject("VBScript.RegExp")
    .Pattern = "\b(" & Join(Application.Transpose(rKeywords), "|") & ")\b"
    .Global = True
    KeywordCount = .Execute(sTextToSearch).Count
End With
End Function
 
Upvote 0

Forum statistics

Threads
1,214,622
Messages
6,120,585
Members
448,972
Latest member
Shantanu2024

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top