Remove Duplicates in Keeping MAX Numerical/Alphanumeric Value from the Deleted Row's Duplicate

Pinaceous

Well-known Member
Joined
Jun 11, 2014
Messages
1,113
Office Version
  1. 365
Platform
  1. Windows
In using Excell 2013 ->

Here is my Data:

Column 1 | C2 | C3 | C4 | C5 | C6 | C7 |C8| C9 |C10|C11|C12|C13|C14|C15|C16|C17|C18|C19| C20|

XY 1559 | ABC | XY | 1559 | 1559 | ABC | 159 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11| 12 | 13 |

XY 1559 | | XY | 1559 | 1645 | ABC | 250 | | | | | | | | | | | | | |

XY 1559 | | XY | 1559 | 1559 | ABC | 159 | | | | | | | | | | | | | |


________________________________________________

What I would theoretically like; by way of VBA Method is to Delete Duplicates by way of the Common Qualifier in Column 1 in keeping the Maximum Numerical Values of its predecessor’s rows with the same like Qualifier.

So then the Data will look after Macro execution; Control + Shift + Whatever, to then look like this below, in following this example for a large data base:</SPAN>


Column 1| C2 | C3 | C4 | C5 | C6 | C7 |C8|C9|C10|C11|C12|C13|C14|C15|C16|C17|C18|C19|C20|

XY 1559 | ABC | XY | 1559 | 1645 | ABC | 250 | 1 | 2| 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |



It seems like such a simple concept, but I'm getting no success with this one. Please Help!


:confused:
 

Excel Facts

Back into an answer in Excel
Use Data, What-If Analysis, Goal Seek to find the correct input cell value to reach a desired result
Okay M. This is Great. Just, How do I restrict this Macro to only execute between Excel's Column's C to Column V => in area of $C$1:$V$574?


:confused:

Try this (assumes data in Sheet1)

Code:
Sub aTest()
    Dim i As Long, j As Long
    Dim vData As Variant, v As Variant, lin As Long
    Dim dict As Object
    
    Set dict = CreateObject("Scripting.Dictionary")
    dict.comparemode = vbTextCompare
    
    With Sheets("Sheet1").Range("C1:V574")
        vData = .Value
        
        For i = 1 To .Rows.Count
            If vData(i, 1) <> "" Then
                If dict.Exists(vData(i, 1)) Then
                    For j = 2 To .Columns.Count
                        If .Cells(i, j) > dict.Item(vData(i, 1)).Cells(1, j) Then _
                            dict.Item(vData(i, 1)).Cells(1, j) = .Cells(i, j)
                    Next j
                Else
                    dict.Add vData(i, 1), .Range(.Cells(i, 1), .Cells(i, .Columns.Count)).Cells
                End If
            End If
        Next i
        
        For Each v In dict.keys
            lin = lin + 1
            .Range("C" & lin).Resize(, .Columns.Count).Value = dict.Item(v).Cells.Value
        Next v
        
        .Rows(dict.Count + 1 & ":" & 574).ClearContents
    
    End With
        
End Sub

M.
 
Upvote 0
oops...

Please correct this code line

Code:
For Each v In dict.keys
            lin = lin + 1
            [COLOR=#ff0000].Range("C" & lin)[/COLOR].Resize(, .Columns.Count).Value = dict.Item(v).Cells.Value
Next v

to

Code:
For Each v In dict.keys
            lin = lin + 1
            [COLOR=#0000ff].Range("A" & lin)[/COLOR].Resize(, .Columns.Count).Value = dict.Item(v).Cells.Value
 Next v

M.
 
Upvote 0
Hi Marcelo,

I tried out the two variations. As it looks, I actually had better success with .Range("C" & lin) than .Range("A" & lin).
For some reason the data is better represented with Range “C” as you can see below in the before and after tables.
"Before Table"; "C" & "A", respectively;



Column 1C2C3C4C5C6C7C8C9C10C11C12C13C14C15C16C17C18C19C20
XY 1559ABCXY15592345ABC25012345678910111213
YZ 1444EFGYZ14441834EFG222135791113
AB 1111HIJAB11111422JKL42324681012
CD 123KELCD1232021MNO142543
XY 1559ABCXY15590009ABC27513121110987
XY 1559XY15592355ABC250
YZ 1444YZ14441855EFG222
AB 1111AB11111456JKL423
CD 123CD1232045MNO142
XY 1559XY15590028ABC275
XY 1559XY15592345ABC250
YZ 1444YZ14441834EFG222
AB 111AB11111422JKL423
CD 123CD1232021MNO142
XY 1559XY15590009ABC275

<tbody>
</tbody>




After "C"





Column 1C2C3C4C5C6C7C8C9C10C11C12C13C14C15C16C17C18C19C20
XY ABCABCXYABCXYABC2355ABC2601312111098910111213
YZ 1444EFGYZ1444YZEFG1855EFG23235791113
AB 1111HIJAB1111ABJKL1456JKL4334681012
CD 123KELCD123CDMNO2045MNO1723
ABC

<tbody>
</tbody>



After "A"









Column 1C2C3C4C5C6C7C8C9C10C11C12C13C14C15C16C17C18C19C20
XYABCXYABC2355ABC2751312111098910111213
YZ1444YZEFG1855EFG22235791113
AB
1111
AB
JKL
1456
JKL
423
4
6
8
10
12
CD
123
CD
MNO
2045
MNO
142
3

<tbody>
</tbody>




Ideally, I would like the data to compile after execution to behave:
"After Table"






Column 1C2C3C4C5C6C7C8C9C10C11C12C13C14C15C16C17C18C19C20
XY 1559ABCXY15590028ABC27513121110987
AB 1111HIJAB11111456JKL42324681012
CD 123KELCD1232045MNO142543
YZ 1444EFGYZ14441856EFG222135791113
XY1559ABCXY15592355ABC25012345678910111213

<tbody>
</tbody>





Where here this "After Table" above represents after the execution is applied to the original "Before Table" data.
Here you will have to add an additional seeded code, either into the original code, which works fine in both taking the MAX value and in its after representation, or to this evolving code that you later suggested.
This additional seeded code must include an "if and only if" exception, written into the current code, where it treats the difference of numerical value of Column 5 greater than 1200 as a separate entry.
Please take note that Column 1 is really Column C in the code, consecutively forward, making Column 5 really Column G for the restricted domain of the Code being C2 to V573.
Also, please note that it will sort the "new" information according to an ascending values in this Column 5; (my Excel's Column G).
Many thanks in advance.
R/
Pinaceous
 
Last edited:
Upvote 0
Sorry, I don't understand what you mean by

"This additional seeded code must include an "if and only if" exception, written into the current code, where it treats the difference of numerical value of Column 5 greater than 1200 as a separate entry."

Difference to/from what?

----------------------------------------

Sorry, my code above is not correct. My bad

Try (adapted to range C2:V573)

Code:
Sub aTest()
    Dim i As Long, j As Long
    Dim vData As Variant, v As Variant, lin As Long
    Dim dict As Object
    
    Set dict = CreateObject("Scripting.Dictionary")
    dict.comparemode = vbTextCompare
    
    With Sheets("Sheet1").Range("C2:V573")
        vData = .Value
        
        For i = 1 To .Rows.Count
            If vData(i, 1) <> "" Then
                If dict.Exists(vData(i, 1)) Then
                    For j = 2 To .Columns.Count
                        If .Cells(i, j) > dict.Item(vData(i, 1)).Cells(1, j) Then _
                            dict.Item(vData(i, 1)).Cells(1, j) = .Cells(i, j)
                    Next j
                Else
                    dict.Add vData(i, 1), .Rows(i).Cells
                End If
            End If
        Next i
                
        For Each v In dict.keys
            lin = lin + 1
            .Range("A" & lin).Resize(, .Columns.Count).Value = dict.Item(v).Cells.Value
        Next v
        
        .Rows(dict.Count + 1 & ":" & 573).ClearContents
    End With
End Sub

M.
 
Upvote 0
Hi Marcelo,

Thanks for the updated code, as far as the additional requirement I need in the code I'll try and explain it by these two tables:



Column 1C2C3C4C5C6C7C8C9C10C11C12C13C14C15C16C17C18C19C20
XY 1559ABCXY15592325ABC15912345678910111213
XY 1559XY15592359ABC250
XY 1559XY15592325ABC159
XY 1559ABCXY15590009ABC19912345678910111213
XY 1559XY15590025ABC243
XY 1559XY15590009ABC199

<tbody>
</tbody>


Okay, the top table above represents before the applied code.

The bottom table here represents the executed code with the "if and only if" exception, written into the current code or the original, where it treats the difference of numerical values of Column 5 greater than 1200 as a separate entry.

Please note that Column 1 is really where Column C starts in the code, consecutively forward, making Column 5 really Column G for the restricted domain of the code being C2:V5733.




Column 1C2C3C4C5C6C7C8C9C10C11C12C13C14C15C16C17C18C19C20
XY 1559ABCXY15590025ABC24312345678910111213
XY 1559ABCXY15592359ABC25012345678910111213

<tbody>
</tbody>



Also take note that it arranges/sorts the data in ascending order according to Column 5.

Thanks M.

- P
 
Upvote 0
Still not clear for me - what you mean by difference of values?

Or are you saying that rows with column 5 equal or greater than 1200 should be grouped differently than rows with column 5 less than 1200?

M.
 
Upvote 0
Hi M,

It is a little confussing I'd admit, let's use the last Entry Data Table's and Table Result Above to help me explain in math terms how I might get my point accross.

So for the above entry, its last Table Result is represented as:



Column 1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
XY 1559
ABC
XY
1559
0025
ABC
243
1
2
3
4
5
6
7
8
9
10
11
12
13
XY 1559
ABC
XY
1559
2359
ABC
250
1
2
3
4
5
6
7
8
9
10
11
12
13

<TBODY>
</TBODY>



Here, we see that in Column 5 Table Result, for its first entry is 0025 and its second entry is 2359. Here there difference between these two MAX sets is greater than 1200, so Excel will therefore group these values as two different entries and will not delete it. Here, this difference is infact [2359(MAX for this range of data)-0025(MAX for this range of data)='s] 2334, which is much larger than 1200, so Excel will now group these two entries separately.

Please let me know, if you would like me to repharse.

Thanks,

-P
 
Upvote 0
A question:
the values in column 5 are Text or Numbers formatted as 00000?

To check in an empty cell put
=ISNUMBER(G3)
tell us what is returned

M.
 
Upvote 0

Forum statistics

Threads
1,216,109
Messages
6,128,876
Members
449,476
Latest member
pranjal9

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top