Identifying partial duplicate strings in ONE column

JSoerum

New Member
Joined
Sep 22, 2022
Messages
2
Office Version
  1. 365
Platform
  1. Windows
Hi everyone!

I am looking for a solution to a problem that I have, and I hope someone here can help me out!

My issue is that I have a long column containing questions in text, some the same and some different, (at the moment 800 rows in a single column) with strings with more than 255 characters, where I would like to identify whether some of the questions are repeated, and if they are repeated how many times are they repeated. The aim is to identify certain trends within the question, if some questions are asked more often than others etc.

The problem is that they are not completely identical, some strings might vary with a "." or space etc. As I am working in the desktop version of Excel / Power BI, I am not able to use the Fuzzy/cluster functions in BI.
As the strings exceed 255 characters, I am not even able to use conditional formatting with highlight duplicates to find the exact matches.

I have found a VBA solution somewhere on this side in a thread, (sorry for no credits), that can highlight exact duplicates, but I still don't have the number of identical matches.

VBA Code:
Option Explicit
Option Base 1


Sub FindDuplicate()
Dim MyRg()
Dim I As Long, II As Long, J As Long
Dim DupliFlag As Boolean
    MyRg = Selection
    For I = 1 To UBound(MyRg, 1) - 1
        For II = I + 1 To UBound(MyRg, 1)
            For J = 1 To UBound(MyRg, 2)
                If (MyRg(I, J) <> MyRg(II, J)) Then
                    DupliFlag = False
                    Exit For
                Else
                    DupliFlag = True
                End If
            Next J
            If (DupliFlag) Then
                Selection.Cells(I, 1).EntireRow.Cells.Interior.ColorIndex = 19
                Selection.Cells(II, 1).EntireRow.Cells.Interior.ColorIndex = 19
                DupliFlag = False
            End If
        Next II
    Next I
End Sub

Does any of you have a solution to this problem?

Thanks in advance!!
 

Excel Facts

Format cells as time
Select range and press Ctrl+Shift+2 to format cells as time. (Shift 2 is the @ sign).
The sample code works for whole string in cell, not partial string in cell.
Pls post few rows of actual data.
 
Upvote 0
Tbh the code is not what I worry too much about. It works so that I can highlight the exact similarities, but it does not solve the problem of counting the number of similar questions.

Due to confidentiality I am not able to provide the exact data, but imagine in the following screenshot that the random characters is different questions. (I put the bold text displaying Q1, Q2..... just for the purpose of highlighting the similar text bits). For Q3 there are 3 similar questions just differentiated by a number in the beginning of the text. The same goes for Q5. The idea is that I would like to be able to count that for Question 3 there are three similar questions, and for Question 5, there is 2 similar.

Does that make sense?

Best
 

Attachments

  • Screenshot.png
    Screenshot.png
    40.1 KB · Views: 18
Upvote 0

Forum statistics

Threads
1,215,054
Messages
6,122,893
Members
449,097
Latest member
dbomb1414

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top