VBA Question: How to find duplicate rows, and sum values after finding duplicates?

rolamento

New Member
Joined
Apr 20, 2023
Messages
8
Office Version
  1. 2016
Platform
  1. Windows
Hello friends, I'm new to the forum but always try to get references for code here.
I'm currently trying to develop a VBA code to loop through a dataset, comparing the values from some rows, and if I find that all of them are equal, I have to sum values from some last rows to the first match, and delete the duplicate.
I have done it, but it's extremely slow, and considering the size of the dataset, it could that up to 7 hours to loop. I have seem some examples using COUNTIF or some kind of Find value. What would you guys recommend?
I'm posting my sample code below:


VBA Code:
Dim n As Long, j As Long, k As Long, count As Long, LastRow As Long, x As Long, y As Long
    Dim Arryk(), Arrnum() As Variant
    LastRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
    
    
    Arryk = Array(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
    Arrnum = Array(33, 34, 35, 37)
    
    
    For x = 2 To LastRow
        LastRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
        For y = LastRow To x Step -1
            If (Not x = y) And (Not IsEmpty(Cells(y, Arryk(0)))) And (Not IsEmpty(Cells(x, Arryk(0)))) Then
                
                count = 0
                For k = 0 To 27
                    If Cells(x, Arryk(k)) = Cells(y, Arryk(k)) Then
                        count = count + 1
                End If
                Next
                
                If count = 28 Then
                    If IsEmpty(Cells(x, 2)) Then
                        Cells(x, 2).Value = Cells(x, 1).Value & ", " & Cells(y, 1).Value
                    Else
                        Cells(x, 2).Value = Cells(x, 2).Value & ", " & Cells(y, 1).Value
                    End If
                    For j = 0 To 3
                        Cells(x, Arrnum(j)).Value = Round(WorksheetFunction.Sum(Cells(x, Arrnum(j)).Value, Cells(y, Arrnum(j)).Value), 2)
                    Next
                    Rows(y).Delete Shift:=xlUp
                End If
            End If
        Next y
    Next x
 

Excel Facts

Create a Pivot Table on a Map
If your data has zip codes, postal codes, or city names, select the data and use Insert, 3D Map. (Found to right of chart icons).
Give this a try. I collaborated with @kevin9999 on this and it uses some of your original code to test what columns determine a duplicate and what columns need to be summed.

VBA Code:
Sub UniqueSum_v02()

    Dim shtData As Worksheet
    Dim srcRngWhole As Range
    Dim srcArrWhole As Variant, outArr As Variant
    Dim LastRow As Long, LastCol As Long
    Dim i As Long, iOut As Long, j As Long, iCols As Long
    Dim dict As Object, dictKey As String
    Dim Arrnum() As Variant
    Dim Arryk() As Variant
       
    Dim t As Double: t = Timer
    Application.ScreenUpdating = False
    
    Arryk = Array(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
    Arrnum = Array(33, 34, 35, 37)
    
    Set shtData = ActiveSheet
    With shtData
        LastRow = .Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
        LastCol = .Cells.Find("*", SearchOrder:=xlByColumns, SearchDirection:=xlPrevious).Column
        Set srcRngWhole = .Range(.Cells(2, "A"), .Cells(LastRow, LastCol))
        srcArrWhole = srcRngWhole
    End With
    
    ReDim outArr(1 To UBound(srcArrWhole), 1 To UBound(srcArrWhole, 2))
    
    Set dict = CreateObject("Scripting.dictionary")
    
    ' Load details range into Dictionary
    For i = 1 To UBound(srcArrWhole)
        dictKey = ""
        For j = 0 To UBound(Arryk)
            dictKey = dictKey + srcArrWhole(i, Arryk(j))
        Next j
        
        If Not dict.exists(dictKey) Then
            iOut = iOut + 1
            dict(dictKey) = iOut
            For iCols = 1 To UBound(srcArrWhole, 2)
                outArr(dict(dictKey), iCols) = srcArrWhole(i, iCols)
                outArr(dict(dictKey), 2) = srcArrWhole(i, 1)
            Next iCols
            
        Else
            For iCols = 0 To UBound(Arrnum)
                outArr(dict(dictKey), Arrnum(iCols)) = outArr(dict(dictKey), Arrnum(iCols)) + srcArrWhole(i, Arrnum(iCols))
            Next iCols
            outArr(dict(dictKey), 2) = outArr(dict(dictKey), 2) & ", " & srcArrWhole(i, 1)
        End If
        
    Next i
    
    srcRngWhole.ClearContents
    srcRngWhole.Resize(iOut, UBound(outArr, 2)) = outArr
    
    Application.ScreenUpdating = True
    MsgBox "Completed in " & Timer - t & " Seconds"

End Sub
 
Upvote 2
Solution
I'd be surprised if we got this right first time without having access to your actual data, but this is a start. I'm assuming you wanted the data overwritten on the sheet, and that there isn't much point in retaining Column A (somewhat superseded by the concatenated IDs). Just change the sheet name to the actual sheet name & test it on a copy of your workbook.

VBA Code:
Option Explicit
Sub rolamento()
    Application.ScreenUpdating = False
    Dim t As Double: t = Timer
    Dim ws As Worksheet
    Set ws = Worksheets("Sheet1")   '<~~ Change to actual sheet name
    
    'Get the consolidated IDs
    ws.Range("B1").EntireColumn.Insert
    Dim Lrow As Long
    Lrow = Cells.Find("*", , xlFormulas, , xlByRows, xlPrevious).Row
    With ws.Range("B2:B" & Lrow)
        .Formula2R1C1 = "=Concat(RC6:RC33&"" | "")"
        .Value2 = .Value2
    End With
    
    With ws.Range("A1")
        .CurrentRegion.Sort Key1:=ws.Range("B1"), order1:=xlAscending, Header:=xlYes
        With .Range("C2:C" & Lrow)
            .Formula2R1C1 = "=concat(if(R2C2:R" & Lrow & "C2=RC2,R2C1:R" & Lrow & "C1&"","",""""))"
            .Value2 = .Value2
            .Value2 = Evaluate("Left(" & .Address & ",len(" & .Address & ")-1)")
        End With
    End With
    ws.Range("B1").EntireColumn.Delete
    
    'Do the Dictionary part
    Dim rng As Range, R As Range, txt As String
    Dim i As Long, j As Long, n As Long, ar
    Set rng = ws.Range(Range("B1"), Range("B" & Rows.Count).End(xlUp))
    ar = Sheet1.Range("A1").CurrentRegion.Offset(, 1).Resize(Columns.Count - 1)
    
    With CreateObject("scripting.dictionary")
        For Each R In rng
            txt = R.Value
            If Not .exists(txt) Then
                n = n + 1
                .Add txt, n
                For j = 1 To UBound(ar, 2)
                    ar(n, j) = R.Offset(, j - 1)
                Next j
            Else
                For i = 32 To 34
                    ar(.Item(txt), i) = ar(.Item(txt), i) + R.Offset(, i - 1)
                Next i
                ar(.Item(txt), 36) = ar(.Item(txt), 36) + R.Offset(, 35)
            End If
        Next R
    End With
    
    For i = 1 To UBound(ar)
        For j = 2 To UBound(ar, 2) - 1
            ar(i, j) = ar(i, j)
        Next j
    Next i
  
    With ws
        .UsedRange.ClearContents
        .Range("A1").Resize(n, UBound(ar, 2)) = ar
    End With
    Application.ScreenUpdating = True
    MsgBox "Completed in " & Timer - t & " Seconds"
End Sub
 
Upvote 1
I wasn't happy with my first effort - it was taking too long to run. Please ignore post #8 and try this code instead.

VBA Code:
Option Explicit
Sub rolamento_V2()
    Application.ScreenUpdating = False
    Dim t As Double: t = Timer
    Dim ws As Worksheet
    Set ws = Worksheets("Sheet1")   '<~~ Change to actual sheet name
    
    'Get the consolidated IDs
    Dim Lrow As Long
    Lrow = Cells.Find("*", , xlFormulas, , xlByRows, xlPrevious).Row
    With ws.Range("B2:B" & Lrow)
        .Formula2R1C1 = "=Concat(RC5:RC32&"" "")"
        .Value2 = .Value2
    End With
    ws.Range("A1").CurrentRegion.Sort Key1:=ws.Range("B1"), order1:=xlAscending, Header:=xlYes
    
    Dim a, b, temp, i As Long, j As Long, n As Long, s As String, ar
    a = ws.Range("A2:B" & Lrow)
    ReDim b(1 To UBound(a, 1), 1 To 1)
    
    For i = 2 To UBound(a, 1)
        If i = 2 Then b(i - 1, 1) = a(i - 1, 1)
        If a(i, 2) = a(i - 1, 2) And a(i, 1) = a(i - 1, 1) Then
            b(i, 1) = b(i - 1, 1)
        ElseIf a(i, 2) = a(i - 1, 2) And a(i, 1) <> a(i - 1, 1) Then
            b(i, 1) = b(i - 1, 1) & ", " & a(i, 1)
        Else
            b(i, 1) = a(i, 1)
        End If
    Next i
        
    For i = UBound(a, 1) To 2 Step -1
        If a(i, 2) = a(i - 1, 2) Then
            b(i - 1, 1) = b(i, 1)
        Else
            b(i - 1, 1) = b(i - 1, 1)
        End If
    Next i
    ws.Range("B2").Resize(UBound(b, 1)).Value = b
        
    'Do the Dictionary part
    Dim rng As Range, R As Range, txt As String
    Set rng = ws.Range(Range("B1"), Range("B" & Rows.Count).End(xlUp))
    ar = Sheet1.Range("A1").CurrentRegion.Offset(, 1).Resize(Columns.Count - 1)
    
    With CreateObject("scripting.dictionary")
        For Each R In rng
            txt = R.Value
            If Not .exists(txt) Then
                n = n + 1
                .Add txt, n
                For j = 1 To UBound(ar, 2)
                    ar(n, j) = R.Offset(, j - 1)
                Next j
            Else
                For i = 32 To 34
                    ar(.Item(txt), i) = ar(.Item(txt), i) + R.Offset(, i - 1)
                Next i
                ar(.Item(txt), 36) = ar(.Item(txt), 36) + R.Offset(, 35)
            End If
        Next R
    End With
    
    For i = 1 To UBound(ar)
        For j = 2 To UBound(ar, 2) - 1
            ar(i, j) = ar(i, j)
        Next j
    Next i
  
    With ws
        .UsedRange.ClearContents
        .Range("A1").Resize(n, UBound(ar, 2)) = ar
    End With
    Application.ScreenUpdating = True
    MsgBox "Completed in " & Timer - t & " Seconds"
End Sub
 
Upvote 1
Could you provide a sample of your data using the XL2BB - Excel Range to BBCode or share your file via Dropbox, Google Drive or similar file sharing site?
Hi here's a sample dataset:

Book1
ABCDEFGHIJKLMNOPQRSTUVWXYZAAABACADAEAFAGAHAIAJAK
1IdsConsolidate IdsNotNot5678910111213141516171819202122232425262728293031323334353637
21abcabcabcabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcabcabcabcabcabcabcabc128221ignore2
32abcabcabcabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcabcabcabcabcabcabcabc178829ignore4
43xyzxyzxyzxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz229437ignore6
54abcabcabcabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcabcabcabcabcabcabcabc2710045ignore8
65xyzxyzxyzxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz3210653ignore10
76abcabcabcabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcabcabcabcabcabcabcabc3711261ignore12
87abcabcabcabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcabcabcabcabcabcabcabc4211869ignore14
98xyzxyzxyzxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyz4712477ignore16
109abcabcabcabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcabcabcabcabcabcabcabc5213085ignore18
1110abcabcabcabcabcabcabcabcabcxyzxyzxyzxyzxyzxyzabcabcabcabcabcabcabcabcabcabcabcabcabc5713693ignore20
Sheet1


Applying said code should return only 2 rows, with the concatenated duplicates. However, I would need a faster script, maybe using something that doesn't take that long to loop. Any ideas would be appreciated!
 
Upvote 0
So is it columns AG, AH and AK that you want to consolidate the values from? Fairly straightforward with a Dictionary. But what do want done about the values/labels in column A, which appear to be unique?
 
Upvote 0
So is it columns AG, AH and AK that you want to consolidate the values from? Fairly straightforward with a Dictionary. But what do want done about the values/labels in column A, which appear to be unique?
Column A has a number ID. It's basically the row number + 1. If two rows are duplicates, I have to join their IDs at column B.
Example: Rows 2 and 3 are duplicates. I delete row 3 (after adding the numeric values) and write (1, 2) (their corresponding IDs) to column B. If there's more duplicate rows, I would keep adding them to the first row. Like (1, 2, 5, 70) etc.
Could you explain how can I check for duplicates using a dictionary? The method I'm using is kinda slow and inneficient.
 
Upvote 0
Leave it with me, I'm going to be unavailable for a while now but I will get back to you.
 
Upvote 0

Forum statistics

Threads
1,215,136
Messages
6,123,243
Members
449,093
Latest member
Vincent Khandagale

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top