Remove Duplicates from a Range

Caleeco

Well-known Member
Joined
Jan 9, 2016
Messages
980
Office Version
  1. 2010
Platform
  1. Windows
Hello,

I am attempting to remove duplicates from a set range. However, I need it to be only the range defined and not the entire column.

I read an example online whereby a unique list is created by adding each row of data to a collection (resume next on errors).

My code below correctly identifies the number of unique values in the data set stored in range A1:A7.

However, I cant work out how to then paste each item of the collection in cell D5 onwards (D6, D7, D8, etc)

Any help is much appreciated!

Code:
Sub RemoveDuplicate()


Dim NoDuplicates As New Collection
Dim rData As Range
Dim Cell As Range


Dim lrow As Long
Dim Item As Object


Set rData = ActiveSheet.Range("A1", Range("A" & Rows.Count).End(xlUp))


On Error Resume Next
For Each Cell In rData
    NoDuplicates.Add Cell.Value, CStr(Cell.Value)
Next


MsgBox "Number of Unique Values: " & NoDuplicates.Count


lrow = 5
For Each Item In NoDuplicates
    ActiveSheet.Range("D" & lrow) = Item.Value
    lrow = lrow + 1
Next Item


End Sub
 
Last edited:

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
Ok, got it working after some research. Needed to define the items of the collection as VARIANT. Also I wasn't calling the data correctly. The code below does what I want, if anyone knows a better way to do it, I'm open to suggestions!

Code:
Sub RemoveDuplicate()


Dim NoDuplicates As New Collection
Dim rData As Range
Dim Cell As Range


Dim lrow As Long
Dim varItem As Variant


Set rData = ActiveSheet.Range("A1", Range("A" & Rows.Count).End(xlUp))


On Error Resume Next
For Each Cell In rData
    NoDuplicates.Add Cell.Value, CStr(Cell.Value)
Next


MsgBox "Number of Unique Values: " & NoDuplicates.Count


lrow = 5
For Each varItem In NoDuplicates
    ActiveSheet.Range("D" & lrow) = NoDuplicates(varItem)
    lrow = lrow + 1
Next varItem


End Sub
 
Upvote 0
Ok, got it working after some research. Needed to define the items of the collection as VARIANT. Also I wasn't calling the data correctly. The code below does what I want, if anyone knows a better way to do it, I'm open to suggestions!

Code:
Sub RemoveDuplicate()


Dim NoDuplicates As New Collection
Dim rData As Range
Dim Cell As Range


Dim lrow As Long
Dim varItem As Variant


Set rData = ActiveSheet.Range("A1", Range("A" & Rows.Count).End(xlUp))


On Error Resume Next
For Each Cell In rData
    NoDuplicates.Add Cell.Value, CStr(Cell.Value)
Next


MsgBox "Number of Unique Values: " & NoDuplicates.Count


lrow = 5
For Each varItem In NoDuplicates
    ActiveSheet.Range("D" & lrow) = NoDuplicates(varItem)
    lrow = lrow + 1
Next varItem


End Sub
I prefer using a dictionary object over collections. The code below uses a dictionary to extract the unique values from column A and lists them starting in D5.
Code:
Sub RemoveDupsWithDictionary()
Dim dataRng As Range, V As Variant
Dim d As Object
Set dataRng = Range("A1:A" & Cells(Rows.Count, "A").End(xlUp).Row)
V = dataRng.Value  'places values in data range into an array for faster processing
Set d = CreateObject("Scripting.dictionary")
For i = 1 To UBound(V, 1)
    If Not d.exists(V(i, 1)) Then
        d.Add V(i, 1), d.Count
    End If
Next i
Application.ScreenUpdating = False
Range("D5").Resize(d.Count, 1).Value = Application.Transpose(d.keys)
Application.ScreenUpdating = True
End Sub
 
Upvote 0
I prefer using a dictionary object over collections. The code below uses a dictionary to extract the unique values from column A and lists them starting in D5.
Code:
Sub RemoveDupsWithDictionary()
Dim dataRng As Range, V As Variant
Dim d As Object
Set dataRng = Range("A1:A" & Cells(Rows.Count, "A").End(xlUp).Row)
V = dataRng.Value  'places values in data range into an array for faster processing
Set d = CreateObject("Scripting.dictionary")
For i = 1 To UBound(V, 1)
    If Not d.exists(V(i, 1)) Then
        d.Add V(i, 1), d.Count
    End If
Next i
Application.ScreenUpdating = False
Range("D5").Resize(d.Count, 1).Value = Application.Transpose(d.keys)
Application.ScreenUpdating = True
End Sub

Hi JoeMo,

Thanks for the slick code! I was reading it through (trying to understand this approach & what you've done), but am stuck on the following 2 bits:

Code:
If Not d.exists(V(i, 1)) Then
        d.Add V(i, 1), d.Count
End If

Code:
Range("D5").Resize(d.Count, 1).Value = Application.Transpose(d.keys)

Would you be able to elaborate in layman's terms what these two bits of code are doing?

Many Thanks!
 
Upvote 0
Think it should be

ActiveSheet.Range("D" & lrow) = varItem

Thanks Sericom. This is the first time I've used custom collections, so I'm still learning how to implement them. Your suggestion works great :)
 
Upvote 0
Both a collection object (collection) and a scripting dictionary object (dictionary) provide the ability to store values and objects with string keys. Both objects allow an unlimited number of values or objects to be stored with keys and accessed and removed using those keys. Beyond that each object has certain benefits the other does not. Below is a summary of the pros and cons of each.

Types of Data
Both a collection and a dictionary can store any type of data besides strings including objects.

Determining if a Key Exists
A collection provides no easy method for checking if a key exists. A dictionary provides the Exists method. For example, to check if a key exists in a collection error handling must be used as illustrated with the sample code below.

Dim Value As Variant
On Error Resume Next
Value = MyCollection("Key")
If Err.Number = 5 Then
' Key does not exist
End If
On Error GoTo 0

Note that not only does the scripting dictionary provide the Exists method, it also allows setting the compare method using the CompareMode property by setting it to BinaryCompare, TextCompare, or DatabaseCompare. Also note that an existing key can be changed by setting the Key item value.

Accessing Items by Index
A collection allows accessing items by index as well as keys. A dictionary only allows accessing by key. To access items by index the dictionary items collection has to be copied to a variant array as illustrated with the sample code below.

Dim Values As Variant
Values = MyDictionary.Items
MsgBox Values(0)

Removing Items
A collection allows items to be removed by index and by key. A dictionary allows removal only by key.

Performance
For sets smaller than a few thousand items there is no perceptible difference in performance. A dictionary offers measurably faster performance with sets less than 40 or 50 thousand entries but never more than half the performance of a collection. If the collection or dictionary will be used created once and then used repeatedly for random checks for whether or not a key exists then a dictionary offers better performance with fewer than about 15,000 elements with the one caveat that with fewer than 1,000 elements there is no perceptible difference. Below are sample measures of adding and reading various set sizes.

Adding 1,000 entries: No perceptible difference in performance.
Adding 5,000 entries: A dictionary takes about 75% of the time that a collection does.
Adding 25,000 entries: A dictionary takes about 80% of the time that a collection does.
Adding 50,000 entries: Both the dictionary and collection offer about the same performance.
Adding 100,000 entries: A collection takes about 70% of the time that a dictionary does.

Reading 1,000 entries: No perceptible difference in performance.
Reading 5,000 entries: A dictionary takes about 60% of the time that a collection does.
Reading 25,000 entries: A collection takes about 80% of the time that a dictionary does.
Reading 50,000 entries: A collection takes about 55% of the time that a dictionary does.
Reading 100,000 entries: A collection takes about 35% of the time that a dictionary does.

There is one very important exception to the above. If error handling is used to determine whether or not a key exists in a collection then a collection can start performing very badly in a compiled VB application. This is because capturing errors with error handling takes anywhere from four to 30 times longer in a compiled VB application than when running the same code in the debugger. So if it is necessary to check if keys exist and the application is written in VB then the scripting dictionary is by far the superior choice. This is not true with a VBA implementation as VBA is always interpreted and error handling seems to perform better in that environment.
 
Upvote 0
^^ Thank you Zack, that's a great bit of info!
 
Upvote 0
Very welcome. In general I prefer Collections, but mostly because they're native and I don't need an external reference call. Largely it just comes down to preference.

As a side note, a wise programmer once taught me that it ends up being more expensive (in terms of efficiency and CPU cost) to declare the variable as New, as opposed to declaring it as New when setting it. I've always been in the habit ever since.
 
Upvote 0
Hi JoeMo,

Thanks for the slick code! I was reading it through (trying to understand this approach & what you've done), but am stuck on the following 2 bits:

Code:
If Not d.exists(V(i, 1)) Then
        d.Add V(i, 1), d.Count
End If

Code:
Range("D5").Resize(d.Count, 1).Value = Application.Transpose(d.keys)

Would you be able to elaborate in layman's terms what these two bits of code are doing?

Many Thanks!
I see Zack has posted extensive info on collections and dictionaries, but in case you have not extracted answers to your questions from Zack's post, here are simple explanations.

The first bit of code uses the Exists method to see if the value being tested is already in the dictionary. If so, if is skipped (so a duplicate is ignored). If not, it added to the dictionary as a key (V(i,1)) and a companion Item, in this case just a sequential number, d.count that reveals how many entries are in the dictionary thus far.

The second bit writes the unique values (i.e. those in the dictionary) to the desired range by transposing the keys (which are contained in the horizontal array d.keys so must be transposed as you want them written to a vertical range). The size of the range is just the number of cells required to hold all the keys in the dictionary which is exactly d.count.
 
Upvote 0

Forum statistics

Threads
1,214,566
Messages
6,120,257
Members
448,952
Latest member
kjurney

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top