Remove Duplicates from a Range
Results 1 to 10 of 10

Thread: Remove Duplicates from a Range
Thanks Thanks: 0 Likes Likes: 0

  1. #1
    Board Regular Caleeco's Avatar
    Join Date
    Jan 2016
    Location
    United Kingdom
    Posts
    882
    Post Thanks / Like
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Question Remove Duplicates from a Range

    Hello,

    I am attempting to remove duplicates from a set range. However, I need it to be only the range defined and not the entire column.

    I read an example online whereby a unique list is created by adding each row of data to a collection (resume next on errors).

    My code below correctly identifies the number of unique values in the data set stored in range A1:A7.

    However, I cant work out how to then paste each item of the collection in cell D5 onwards (D6, D7, D8, etc)

    Any help is much appreciated!

    Code:
    Sub RemoveDuplicate()
    
    
    Dim NoDuplicates As New Collection
    Dim rData As Range
    Dim Cell As Range
    
    
    Dim lrow As Long
    Dim Item As Object
    
    
    Set rData = ActiveSheet.Range("A1", Range("A" & Rows.Count).End(xlUp))
    
    
    On Error Resume Next
    For Each Cell In rData
        NoDuplicates.Add Cell.Value, CStr(Cell.Value)
    Next
    
    
    MsgBox "Number of Unique Values: " & NoDuplicates.Count
    
    
    lrow = 5
    For Each Item In NoDuplicates
        ActiveSheet.Range("D" & lrow) = Item.Value
        lrow = lrow + 1
    Next Item
    
    
    End Sub
    Last edited by Caleeco; Feb 1st, 2016 at 04:20 PM. Reason: spelling mistakes
    https://www.ExcelWTF.com - Helping you solve those WTF moments. If you want to learn VBA, head over to https://www.ExcelWTF.com/learn

  2. #2
    Board Regular Caleeco's Avatar
    Join Date
    Jan 2016
    Location
    United Kingdom
    Posts
    882
    Post Thanks / Like
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Remove Duplicates from a Range

    Ok, got it working after some research. Needed to define the items of the collection as VARIANT. Also I wasn't calling the data correctly. The code below does what I want, if anyone knows a better way to do it, I'm open to suggestions!

    Code:
    Sub RemoveDuplicate()
    
    
    Dim NoDuplicates As New Collection
    Dim rData As Range
    Dim Cell As Range
    
    
    Dim lrow As Long
    Dim varItem As Variant
    
    
    Set rData = ActiveSheet.Range("A1", Range("A" & Rows.Count).End(xlUp))
    
    
    On Error Resume Next
    For Each Cell In rData
        NoDuplicates.Add Cell.Value, CStr(Cell.Value)
    Next
    
    
    MsgBox "Number of Unique Values: " & NoDuplicates.Count
    
    
    lrow = 5
    For Each varItem In NoDuplicates
        ActiveSheet.Range("D" & lrow) = NoDuplicates(varItem)
        lrow = lrow + 1
    Next varItem
    
    
    End Sub
    https://www.ExcelWTF.com - Helping you solve those WTF moments. If you want to learn VBA, head over to https://www.ExcelWTF.com/learn

  3. #3
    MrExcel MVP
    Join Date
    May 2009
    Posts
    16,379
    Post Thanks / Like
    Mentioned
    36 Post(s)
    Tagged
    8 Thread(s)

    Default Re: Remove Duplicates from a Range

    Quote Originally Posted by Caleeco View Post
    Ok, got it working after some research. Needed to define the items of the collection as VARIANT. Also I wasn't calling the data correctly. The code below does what I want, if anyone knows a better way to do it, I'm open to suggestions!

    Code:
    Sub RemoveDuplicate()
    
    
    Dim NoDuplicates As New Collection
    Dim rData As Range
    Dim Cell As Range
    
    
    Dim lrow As Long
    Dim varItem As Variant
    
    
    Set rData = ActiveSheet.Range("A1", Range("A" & Rows.Count).End(xlUp))
    
    
    On Error Resume Next
    For Each Cell In rData
        NoDuplicates.Add Cell.Value, CStr(Cell.Value)
    Next
    
    
    MsgBox "Number of Unique Values: " & NoDuplicates.Count
    
    
    lrow = 5
    For Each varItem In NoDuplicates
        ActiveSheet.Range("D" & lrow) = NoDuplicates(varItem)
        lrow = lrow + 1
    Next varItem
    
    
    End Sub
    I prefer using a dictionary object over collections. The code below uses a dictionary to extract the unique values from column A and lists them starting in D5.
    Code:
    Sub RemoveDupsWithDictionary()
    Dim dataRng As Range, V As Variant
    Dim d As Object
    Set dataRng = Range("A1:A" & Cells(Rows.Count, "A").End(xlUp).Row)
    V = dataRng.Value  'places values in data range into an array for faster processing
    Set d = CreateObject("Scripting.dictionary")
    For i = 1 To UBound(V, 1)
        If Not d.exists(V(i, 1)) Then
            d.Add V(i, 1), d.Count
        End If
    Next i
    Application.ScreenUpdating = False
    Range("D5").Resize(d.Count, 1).Value = Application.Transpose(d.keys)
    Application.ScreenUpdating = True
    End Sub
    Joe

    When I was a young man I knew everything. Now that I'm older, I realize I know very little, and what I do know, I tend to forget!

  4. #4
    Board Regular
    Join Date
    Jan 2006
    Posts
    891
    Post Thanks / Like
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Remove Duplicates from a Range

    Think it should be

    ActiveSheet.Range("D" & lrow) = varItem

  5. #5
    Board Regular Caleeco's Avatar
    Join Date
    Jan 2016
    Location
    United Kingdom
    Posts
    882
    Post Thanks / Like
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Remove Duplicates from a Range

    Quote Originally Posted by JoeMo View Post
    I prefer using a dictionary object over collections. The code below uses a dictionary to extract the unique values from column A and lists them starting in D5.
    Code:
    Sub RemoveDupsWithDictionary()
    Dim dataRng As Range, V As Variant
    Dim d As Object
    Set dataRng = Range("A1:A" & Cells(Rows.Count, "A").End(xlUp).Row)
    V = dataRng.Value  'places values in data range into an array for faster processing
    Set d = CreateObject("Scripting.dictionary")
    For i = 1 To UBound(V, 1)
        If Not d.exists(V(i, 1)) Then
            d.Add V(i, 1), d.Count
        End If
    Next i
    Application.ScreenUpdating = False
    Range("D5").Resize(d.Count, 1).Value = Application.Transpose(d.keys)
    Application.ScreenUpdating = True
    End Sub
    Hi JoeMo,

    Thanks for the slick code! I was reading it through (trying to understand this approach & what you've done), but am stuck on the following 2 bits:

    Code:
    If Not d.exists(V(i, 1)) Then
            d.Add V(i, 1), d.Count
    End If
    Code:
    Range("D5").Resize(d.Count, 1).Value = Application.Transpose(d.keys)
    Would you be able to elaborate in layman's terms what these two bits of code are doing?

    Many Thanks!
    https://www.ExcelWTF.com - Helping you solve those WTF moments. If you want to learn VBA, head over to https://www.ExcelWTF.com/learn

  6. #6
    Board Regular Caleeco's Avatar
    Join Date
    Jan 2016
    Location
    United Kingdom
    Posts
    882
    Post Thanks / Like
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Remove Duplicates from a Range

    Quote Originally Posted by sericom View Post
    Think it should be

    ActiveSheet.Range("D" & lrow) = varItem
    Thanks Sericom. This is the first time I've used custom collections, so I'm still learning how to implement them. Your suggestion works great
    https://www.ExcelWTF.com - Helping you solve those WTF moments. If you want to learn VBA, head over to https://www.ExcelWTF.com/learn

  7. #7
    MrExcel MVP Zack Barresse's Avatar
    Join Date
    Dec 2003
    Location
    Oregon, USA
    Posts
    10,724
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Remove Duplicates from a Range

    Both a collection object (collection) and a scripting dictionary object (dictionary) provide the ability to store values and objects with string keys. Both objects allow an unlimited number of values or objects to be stored with keys and accessed and removed using those keys. Beyond that each object has certain benefits the other does not. Below is a summary of the pros and cons of each.

    Types of Data
    Both a collection and a dictionary can store any type of data besides strings including objects.

    Determining if a Key Exists
    A collection provides no easy method for checking if a key exists. A dictionary provides the Exists method. For example, to check if a key exists in a collection error handling must be used as illustrated with the sample code below.

    Dim Value As Variant
    On Error Resume Next
    Value = MyCollection("Key")
    If Err.Number = 5 Then
    ' Key does not exist
    End If
    On Error GoTo 0

    Note that not only does the scripting dictionary provide the Exists method, it also allows setting the compare method using the CompareMode property by setting it to BinaryCompare, TextCompare, or DatabaseCompare. Also note that an existing key can be changed by setting the Key item value.

    Accessing Items by Index
    A collection allows accessing items by index as well as keys. A dictionary only allows accessing by key. To access items by index the dictionary items collection has to be copied to a variant array as illustrated with the sample code below.

    Dim Values As Variant
    Values = MyDictionary.Items
    MsgBox Values(0)

    Removing Items
    A collection allows items to be removed by index and by key. A dictionary allows removal only by key.

    Performance
    For sets smaller than a few thousand items there is no perceptible difference in performance. A dictionary offers measurably faster performance with sets less than 40 or 50 thousand entries but never more than half the performance of a collection. If the collection or dictionary will be used created once and then used repeatedly for random checks for whether or not a key exists then a dictionary offers better performance with fewer than about 15,000 elements with the one caveat that with fewer than 1,000 elements there is no perceptible difference. Below are sample measures of adding and reading various set sizes.

    Adding 1,000 entries: No perceptible difference in performance.
    Adding 5,000 entries: A dictionary takes about 75% of the time that a collection does.
    Adding 25,000 entries: A dictionary takes about 80% of the time that a collection does.
    Adding 50,000 entries: Both the dictionary and collection offer about the same performance.
    Adding 100,000 entries: A collection takes about 70% of the time that a dictionary does.

    Reading 1,000 entries: No perceptible difference in performance.
    Reading 5,000 entries: A dictionary takes about 60% of the time that a collection does.
    Reading 25,000 entries: A collection takes about 80% of the time that a dictionary does.
    Reading 50,000 entries: A collection takes about 55% of the time that a dictionary does.
    Reading 100,000 entries: A collection takes about 35% of the time that a dictionary does.

    There is one very important exception to the above. If error handling is used to determine whether or not a key exists in a collection then a collection can start performing very badly in a compiled VB application. This is because capturing errors with error handling takes anywhere from four to 30 times longer in a compiled VB application than when running the same code in the debugger. So if it is necessary to check if keys exist and the application is written in VB then the scripting dictionary is by far the superior choice. This is not true with a VBA implementation as VBA is always interpreted and error handling seems to perform better in that environment.
    Regards,
    Zack Barresse
    My Book on Excel Tables
    (If you would like comments in any code, please say so.)

  8. #8
    Board Regular Caleeco's Avatar
    Join Date
    Jan 2016
    Location
    United Kingdom
    Posts
    882
    Post Thanks / Like
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Remove Duplicates from a Range

    ^^ Thank you Zack, that's a great bit of info!
    https://www.ExcelWTF.com - Helping you solve those WTF moments. If you want to learn VBA, head over to https://www.ExcelWTF.com/learn

  9. #9
    MrExcel MVP Zack Barresse's Avatar
    Join Date
    Dec 2003
    Location
    Oregon, USA
    Posts
    10,724
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Remove Duplicates from a Range

    Very welcome. In general I prefer Collections, but mostly because they're native and I don't need an external reference call. Largely it just comes down to preference.

    As a side note, a wise programmer once taught me that it ends up being more expensive (in terms of efficiency and CPU cost) to declare the variable as New, as opposed to declaring it as New when setting it. I've always been in the habit ever since.
    Regards,
    Zack Barresse
    My Book on Excel Tables
    (If you would like comments in any code, please say so.)

  10. #10
    MrExcel MVP
    Join Date
    May 2009
    Posts
    16,379
    Post Thanks / Like
    Mentioned
    36 Post(s)
    Tagged
    8 Thread(s)

    Default Re: Remove Duplicates from a Range

    Quote Originally Posted by Caleeco View Post
    Hi JoeMo,

    Thanks for the slick code! I was reading it through (trying to understand this approach & what you've done), but am stuck on the following 2 bits:

    Code:
    If Not d.exists(V(i, 1)) Then
            d.Add V(i, 1), d.Count
    End If
    Code:
    Range("D5").Resize(d.Count, 1).Value = Application.Transpose(d.keys)
    Would you be able to elaborate in layman's terms what these two bits of code are doing?

    Many Thanks!
    I see Zack has posted extensive info on collections and dictionaries, but in case you have not extracted answers to your questions from Zack's post, here are simple explanations.

    The first bit of code uses the Exists method to see if the value being tested is already in the dictionary. If so, if is skipped (so a duplicate is ignored). If not, it added to the dictionary as a key (V(i,1)) and a companion Item, in this case just a sequential number, d.count that reveals how many entries are in the dictionary thus far.

    The second bit writes the unique values (i.e. those in the dictionary) to the desired range by transposing the keys (which are contained in the horizontal array d.keys so must be transposed as you want them written to a vertical range). The size of the range is just the number of cells required to hold all the keys in the dictionary which is exactly d.count.
    Joe

    When I was a young man I knew everything. Now that I'm older, I realize I know very little, and what I do know, I tend to forget!

Some videos you may like

User Tag List

Tags for this Thread

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •