Vba to find duplicates in a cell

Chefsohail

Board Regular
Joined
Oct 3, 2020
Messages
90
Office Version
  1. 365
Platform
  1. Windows
Hi Team,

I need your expertise with a code that can highlight all cells in a row if the cell contains any duplicate statement. The duplication has to be within the cell.

Eg:
1606626680776.png


I am not sure if this is practically possible in excel. In case if it isn't at least the below alternative will help.

Alternative: A code that can highlight all cells in a row if the cell contains any duplicate words except for articles, prepositions and conjunctions. The duplication has to be within the cell.

1606626267603.png


I hope the examples above illustrates the requirement. I do not need the remarks column. Only the cell needs to be highlighted.
 

Excel Facts

Round to nearest half hour?
Use =MROUND(A2,"0:30") to round to nearest half hour. Use =CEILING(A2,"0:30") to round to next half hour.
I think the alternative is possible. I have assumed data in column A, starting in row 2.

VBA Code:
Sub Highlight_Dupes()
  Dim RX As Object, d As Object
  Dim i As Long
  Dim a As Variant, itm As Variant
  
  Set d = CreateObject("Scripting.Dictionary")
  d.CompareMode = 1
  Set RX = CreateObject("VBScript.RegExP")
  RX.Global = True
  RX.Pattern = "\b[^ ]+?\b"
  With Range("A2", Range("A" & Rows.Count).End(xlUp))
    a = .Value
    For i = 1 To UBound(a)
      d.RemoveAll
      For Each itm In RX.Execute(a(i, 1))
        If d.exists(CStr(itm)) Then
          .Cells(i).Interior.Color = vbYellow
          Exit For
        Else
          d(CStr(itm)) = 1
        End If
      Next itm
    Next i
  End With
End Sub

Sample data and results:

Chefsohail.xlsm
A
1Data
2I need your expertise with a code that can highlight all cells in a row if the cell contains any duplicate statement.
3Duplication has to be within the cell.
4John is a good boy and a good student
5Ann is a tall girl
Sheet1


An option that you might want to consider is to say the words to be checked must be at least a certain minimum number of characters. For example, cell A2 above is highlighted because the word "a" occurs more than once. You may wish to discount short words?
 
Upvote 0
Hey Peter,

Thanx for looking into it.

I am happy with the alternative and I like your idea to keep a minimum number of characters.

Is there a way to eliminate conjunctions, prepositions and articles in the code you provided?

Or as you suggested we can incorporate a number of characters let's say 25. I can then change as per the requirement.

If you can incorporate both in one code or either that will be great.
 
Upvote 0
Is there a way to eliminate conjunctions, prepositions and articles in the code you provided?
Excel does not automatically 'know' such words, but if you provided a list of words to exclude (say in another column) then they could be ignored when checking.

Or as you suggested we can incorporate a number of characters let's say 25.
My suggestion was to ignore "words" with fewer than a given number of letters, so it would definitely not be 25. :)

Here is an example of that. I have set the minimum letters required as 5 so
- cell A2 is not highlighted because the repeated word "book" is less than 5 letters
- "books" repeated in A3 meets the 5 letter test
- "the" in A4 is ignored
- "elephant" in A5 meets the test
- "red" in A6 is ignored but "train" is not so that means it gets highlighted

VBA Code:
Sub Highlight_Dupes_Min_Letters()
  Dim RX As Object, d As Object
  Dim i As Long
  Dim a As Variant, itm As Variant
  
  Const MinLetters As Long = 5  '<- Edit as required
  
  Set d = CreateObject("Scripting.Dictionary")
  d.CompareMode = 1
  Set RX = CreateObject("VBScript.RegExP")
  RX.Global = True
  RX.Pattern = "\b[^ ]{" & MinLetters & ",}\b"
  With Range("A2", Range("A" & Rows.Count).End(xlUp))
    a = .Value
    For i = 1 To UBound(a)
      d.RemoveAll
      For Each itm In RX.Execute(a(i, 1))
        If d.exists(CStr(itm)) Then
          .Cells(i).Interior.Color = vbYellow
          Exit For
        Else
          d(CStr(itm)) = 1
        End If
      Next itm
    Next i
  End With
End Sub

Chefsohail.xlsm
A
1Data
2book paper book
3books paper books
4the cat sat on the mat
5one elephant chased another elephant
6red train, red car, blue train
Sheet2
 
Upvote 0
Thanks Peter for clarifying. Your suggestion absolutely makes sense.

I plan to incorporate both of your ideas. 1. provided a list of words to exclude and 2. set the minimum letters required as 2

Please find the words below that need to be excluded -
the, an, a, while, wherever, where, whenever, when, though, that, than, so that, since, only, once, lest, if, even, because, as, although, so, yet, or, but, Nor, And, For, up, under, toward, to, till, through, over, outside, of, onto, on, off, next to, near, into, inside, in, from, for, during, down, by, between, beside, beneath, below, behind, before, away from, at, around, among, along, against, after, across, above.

these are 65 words.

I can manage to change this 'Const MinLetters As Long = 5' to 2. However how do we add a command to exclude the above words? Please extend support. Also if i can later add / remove words to this list of 65, that'll be great.. I wish i could write codes as you do..
 
Upvote 0
I plan to incorporate both of your ideas. 1. provided a list of words to exclude and 2. set the minimum letters required as 2
It seems pointless to me to do both if you are going to set the minimum number to 2. As far as I know there are only two words with fewer than 2 letters and you already have one of them ("a") in your list of exclusions. Why not juts add "I" to the list of exclusions and scrap the idea of a minimum number of letters?
 
Upvote 0
It seems pointless to me to do both if you are going to set the minimum number to 2. As far as I know there are only two words with fewer than 2 letters and you already have one of them ("a") in your list of exclusions. Why not juts add "I" to the list of exclusions and scrap the idea of a minimum number of letters?
I need both because I have multiple combinations to be taken care..
For example...

1. M.M Cyprus - this doesn't need to be highlighted
2. Peter is excellent at macro codes. He is a master in Excel and VBA. - this shouldn't be highlighted for Is.

There are many paragraphs in each cell in my data which has repetition of conjunctions, interjections, prepositions and articles. These words may be more than 2 letters but they do not require to be highlighted if repeated. Does this make sense?
 
Upvote 0
I need both because I have multiple combinations to be taken care..
For example...

1. M.M Cyprus - this doesn't need to be highlighted
OK, understood


Peter is excellent at macro codes. He is a master in Excel and VBA. - this shouldn't be highlighted for Is.
Why not? "is" does meet the minimum letter requirement of 2 and does not appear in the exclusion list


I also note that your list of excluded words includes some items that are not words (eg "so that" and "next to"). This may cause some accuracy issues down the line but we can press on for now once the above is resolved.
 
Upvote 0
OK, understood



Why not? "is" does meet the minimum letter requirement of 2 and does not appear in the exclusion list


I also note that your list of excluded words includes some items that are not words (eg "so that" and "next to"). This may cause some accuracy issues down the line but we can press on for now once the above is resolved.
I may have missed on adding 'is' to the list of words. Sorry about that
 
Upvote 0
Try this version

VBA Code:
Sub Highlight_Dupes_Min_Letters_v2()
  Dim RX As Object, d As Object
  Dim i As Long
  Dim a As Variant, itm As Variant
  Dim s As String, Pat1 As String, Pat2 As String
  
  Const MinLetters As Long = 2  '<- Edit as required
  Const IgnoreWords As String = "is|the|an|a|while|wherever|where|whenever|when|though|that|than|so that|since|only|once|lest|if|even|because|as|although" _
                                & "|so|yet|or|but|Nor|And|For|up|under|toward|to|till|through|over|outside|of|onto|on|off|next to|near|into|inside|in|from" _
                                & "|for|during|down|by|between|beside|beneath|below|behind|before|away from|at|around|among|along|against|after|across|above"
  
  Set d = CreateObject("Scripting.Dictionary")
  d.CompareMode = 1
  Set RX = CreateObject("VBScript.RegExP")
  RX.Global = True
  Pat1 = "\b(" & IgnoreWords & ")\b"
  Pat2 = "\b[^ ]{" & MinLetters & ",}\b"
  With Range("A2", Range("A" & Rows.Count).End(xlUp))
    a = .Value
    For i = 1 To UBound(a)
      d.RemoveAll
      s = a(i, 1)
      RX.Pattern = Pat1
      s = RX.Replace(s, "")
      RX.Pattern = Pat2
      For Each itm In RX.Execute(s)
        If d.exists(CStr(itm)) Then
          .Cells(i).Interior.Color = vbYellow
          Exit For
        Else
          d(CStr(itm)) = 1
        End If
      Next itm
    Next i
  End With
End Sub

My small sample data and results:

Chefsohail.xlsm
A
1Data
2elephant next to rhino next to hippo
3book paper book
4books paper books
5the cat sat on the mat
6one elephant chased another elephant
7red train, red car, blue train
8M.M Cyprus
9Peter is excellent at macro codes. He is a master in Excel and VBA
10Peter is hopeless at cooking and hopeless as a debater
Sheet2
 
Upvote 0

Forum statistics

Threads
1,215,219
Messages
6,123,690
Members
449,117
Latest member
Aaagu

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top