Remove duplicates from an array

L. Howard

Well-known Member
Joined
Oct 16, 2012
Messages
4,514
All the code preceding this line work just fine, and this line of code dumps myArr into column A2 of Sheet("A Name of Sheet") and down 12,000 rows.

Of those 12,000 entries, there are 0 to 7 duplicates on any give execution of the code.

How can I remove those duplicates, if they do occur from myArr before dumping into column A?

Code:
 myArr = .Range("A2:A" & LRow)

Thanks.
Howard
 
Hi Rick and aminex,

I have run several more test with the code I posted I post #9 it appears I was overlooking the results incorrectly.

I believe it is working exactly the way you suggested, Rick.

Thanks for the help.

Regards,
Howard
 
Upvote 0

Excel Facts

Last used cell?
Press Ctrl+End to move to what Excel thinks is the last used cell.
Howard,

Just for your interest, and to possibly help you avoid wrong results, you might try running your remove duplicates code on some randomized test data, say as given by the following code
Code:
Sub testdata()
With Cells(1).Resize(100)
    .Cells = "=if(rand()<.7,randbetween(1,20),randbetween(10,20)&char(99))"
    .Resize(1) = "Header"
    .Value = .Value
End With
End Sub
Try it a couple of times on a blank worksheet. Remove any duplicates with RemoveDuplicates (say as in Post#4), and see if you're happy with the result.
Can easily check by sorting to visually see if all duplicates are indeed removed.
 
Upvote 0
Hi mirabeau,

A good little exercise and it did verified the removal of duplicates.

Thanks,
Howard
 
Upvote 0
Hi mirabeau,

A good little exercise and it did verified the removal of duplicates.

Thanks,
Howard
OK. Good that you're happy with that.

With my 2007 Excel, sometimes the RemoveDuplicates works with that test data, but frequently it doesn't.

Perhaps it was upgraded between my 2007 and your 2010.

But I'd like to think you tested at least several times before concluding final verification. The Excel RemoveDuplicates bug is referred to quite often on Google.
 
Upvote 0
Okay, that is good to know.

But if one uses code to remove duplicates, is that different than the "Remove Duplicates" from the ribbon Data > Remove Duplicates?

Or does the VBA code evoke the same "removal soldiers" to do the job as the ribbon does?

And that brings to mind on one such test it failed to remove one of two 6's that were clearly in the list. The column also seemed to be miss sorted as all the integers except one of the 6's was down among the alpha numeric entries.

I did a =A4=A16 on it and it returned TRUE so both were really 6's.

Is this what you are warning me about?

Howard
 
Upvote 0
Okay, that is good to know.

But if one uses code to remove duplicates, is that different than the "Remove Duplicates" from the ribbon Data > Remove Duplicates?

Or does the VBA code evoke the same "removal soldiers" to do the job as the ribbon does?

And that brings to mind on one such test it failed to remove one of two 6's that were clearly in the list. The column also seemed to be miss sorted as all the integers except one of the 6's was down among the alpha numeric entries.

I did a =A4=A16 on it and it returned TRUE so both were really 6's.

Is this what you are warning me about?

Howard
The ribbon RemoveDuplicates and the VBA RemoveDuplicates should give the same result.

I guess I was suggesting that you test the VBA version on some problematical data to see if it does in fact remove all duplicates, as advertised. Because it seemed to me this is what you wanted.

If you're happy with what you have then that's great.

But with my version (2007) of Excel it doesn't.

If you examine that testdata code I posted above you should note that (apart from the header) there's only 31 unique values in the list.

Try running the testdata code up to 10,000 (by changing the first line), then run the RemoveDuplicates code and see if you can get only the 31 unique values remaining.

My Excel won't do it. My conclusion is that (at least my 2007 version of) RemoveDuplicates doesn't always remove duplicates. So, personally, I like to use other approaches to do the job.
 
Upvote 0
It is a good idea to incorporate TRIM when or before using remove dulpicates. Especially if the data has been imported from an outside source. Leading or trailing spaces can make it appear that duplicates are not being removed. [ 6<space>], <space>[6 ], [ <space>6 ]<space> & [6] are read as 4 different values by RemoveDuplicate so all would be left after running.</space></space></space></space>
 
Last edited:
Upvote 0
Hi LuKate,

The imported stuff is always suspect for those things you mention that's for sure.

I suspect there is some imported data here in my situation. In this project, common words are listed in six columns where a word from each column is selected from left to right and those words form a short phrase. The words from each column are randomly selected so each phrase is different but not without the possibility of duplicates.

For reasons I cannot explain here, the penalty for duplication is somewhat severe. So I am trying to minimize risk of penalty.

With the tests suggested by mirabeau, I am finding the removal of dupes is not all that straight forward. The greater the number of entries the less reliable it seems to be.

Code seems to be taking care of dupes when run on 12,000 short phrases. However, that many is nearly impossible to visually verify. With the same words forming the same phrase, one could two phrases that look the same with an extra space in one phrase. That would not be considered a dupe to the code but would be a dupe to the eye when presented for end use.

Probably going to have to assume some risk and deal with it as best I can.

Howard
 
Upvote 0
It is a good idea to incorporate TRIM when or before using remove dulpicates. Especially if the data has been imported from an outside source. Leading or trailing spaces can make it appear that duplicates are not being removed. [ 6<space>], <space>[6 ], [ <space>6 ]<space> & [6] are read as 4 different values by RemoveDuplicate so all would be left after running.</space></space></space></space>
Hi LuKate,

Would you like to generate some test data with the code in Post#12 (preferably with the 100 on second line replaced by 1000) , and show how/if using Trim can thereby enable the Excel RemoveDuplicates to remove duplicates?
 
Upvote 0

Forum statistics

Threads
1,215,632
Messages
6,125,908
Members
449,273
Latest member
mrcsbenson

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top