VBA Words that Occur more than once in a string

Stephen_IV

Well-known Member
Joined
Mar 17, 2003
Messages
1,171
Office Version
  1. 365
  2. 2019
Platform
  1. Windows
Good morning, I am trying to match all words that occur more than once in a string, for example:

A1: The man ran ran down the the street.
B1: If the boy goes into the cave then the boy may get hurt.

I would like to return in cell A2: ran,ran,The,the,the
I would like to return in cell B2: The,the,the,boy,boy

Any help would be appreciated.
 

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
Try:

Open a copy of your workbook. Press Alt-F11 to open the VBA editor. From the menu select Insert > Module. In the window that opens, paste the following code:'

Code:
Public Function WordDups(ByVal Str1 As String) As String
Dim a As Variant, MyDict As Object, i As Long, x As Variant
    
    a = Split(Replace(Replace(Replace(LCase(Str1), ".", ""), "?", ""), ",", ""))
    Set MyDict = CreateObject("Scripting.Dictionary")
    For i = 0 To UBound(a)
        MyDict(a(i)) = MyDict(a(i)) + 1
    Next i
    For Each x In MyDict
        If MyDict(x) > 1 Then
            For i = 1 To MyDict(x)
                WordDups = WordDups & "," & x
            Next i
        End If
    Next x
    WordDups = Mid(WordDups, 2)
            
End Function
Press Alt-Q to close the editor. In B2, put this formula:

=worddups(A1)

I remove periods, commas, and question marks so that the last word can be compared easily. I also convert the whole string to lower case. If you really want the upper case letters like in your example, let me know, but it will make it more complicated.

Hope this helps.
 
Upvote 0
Eric,

Perfect!!!! Does exactly what I need it to do. Thank you again for your help and support!
 
Upvote 0
Eric,

Perfect!!!! Does exactly what I need it to do. Thank you again for your help and support!
Two points about Eric's code...

1) Your original post showed you retaining the letter casing from the original text (capitalized "T" in "The" for example) when incorporated into the output text... Eric's code does not do this.

2) We do not know the full extent of what your text can look like, but I thought I should point out that Eric's code only allows for a period, comma or question mark to appear next to a word that is repeated elsewhere without the those symbols... if the repeated word appears next to any other non-alphanumeric character (for instance, "!", "-", "(", ")", "/", etc.) and there are only two of those words in the text, Eric's code will not report them as repeated words.

On the off-chance the above is or could be significant to you, here is a UDF (note I changed the name from the one Eric used) that handles both of the above cases...
Code:
[table="width: 500"]
[tr]
	[td]Function Dupes(ByVal S As String) As String
  Dim x As Long, V As Variant, Words() As String
  For x = 1 To Len(S)
    If Mid(S, x, 1) Like "[!0-9A-Za-z]" Then Mid(S, x) = " "
  Next
  Words = Split(Application.Trim(S), , , vbTextCompare)
  With CreateObject("Scripting.Dictionary")
    For x = 0 To UBound(Words)
      .Item(UCase(Words(x))) = Trim(.Item(UCase(Words(x))) & " " & Words(x))
    Next
    For Each V In .Items
      If InStrRev(V, " ") Then Dupes = Dupes & " " & V
    Next
  End With
  Dupes = Mid(Replace(Dupes, " ", ", "), 3)
End Function[/td]
[/tr]
[/table]
 
Upvote 0
Thanks Rick, you are correct that will make a difference and I appreciate your help and guidance!
 
Upvote 0

Forum statistics

Threads
1,215,746
Messages
6,126,650
Members
449,326
Latest member
asp123

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top