Eliminate Duplicates with Certain Business Rules

boldcode

Active Member
Joined
Mar 12, 2010
Messages
347
Hello,


I am fairly new to VBA and I trying to create code behind a command button in excel that will get rid of duplicates based on some business rules.



Below I explain the situation at hand.


I have the following 3 columns in my dataset (these are the actual names of the columns):


Under Column A the name of the column is: Announcement Number
Under Column B the name of the column is: Issue Date
Under Column C the name of the column is: Certificate Returned to HR (Date)


Example of dataset:

<table style="border-collapse: collapse; width: 522px; height: 240px;" border="0" cellpadding="0" cellspacing="0"> <col style="width: 36pt;" width="48"> <col style="width: 120pt;" width="160"> <col style="width: 56pt;" width="75"> <col style="width: 162pt;" width="216"> <tbody><tr style="height: 15pt;" height="20"> <td class="oa1" style="height: 15pt; width: 36pt;" width="48" height="20">
</td> <td class="oa2" style="width: 120pt;" width="160">
Column A
</td> <td class="oa2" style="width: 56pt;" width="75">
Column B
</td> <td class="oa2" style="width: 162pt;" width="216">
Column C
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa3" style="height: 15pt; width: 36pt;" width="48" height="20">
row 1
</td> <td class="oa2" style="width: 120pt;" width="160">
Announcement Number
</td> <td class="oa2" style="width: 56pt;" width="75">
Issue Date
</td> <td class="oa2" style="width: 162pt;" width="216">
Certificate Returned to HR (Date)
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa3" style="height: 15pt; width: 36pt;" width="48" height="20">
row 2
</td> <td class="oa2" style="width: 120pt;" width="160">
1003098SQMP
</td> <td class="oa2" style="width: 56pt;" width="75">
</td> <td class="oa2" style="width: 162pt;" width="216">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa3" style="height: 15pt; width: 36pt;" width="48" height="20">
row 3
</td> <td class="oa2" style="width: 120pt;" width="160">
1003098SQMP
</td> <td class="oa3" style="width: 56pt;" width="75">
10/28/2010
</td> <td class="oa3" style="width: 162pt;" width="216">
10/28/2010
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa4" style="height: 15pt; width: 36pt;" width="48" height="20">
row 4
</td> <td class="oa5" style="width: 120pt;" width="160">
1103001MCMP
</td> <td class="oa4" style="width: 56pt;" width="75">
11/3/2010
</td> <td class="oa4" style="width: 162pt;" width="216">
12/16/2010
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa4" style="height: 15pt; width: 36pt;" width="48" height="20">
row 5
</td> <td class="oa5" style="width: 120pt;" width="160">
1103001MCMP
</td> <td class="oa4" style="width: 56pt;" width="75">
11/3/2010
</td> <td class="oa5" style="width: 162pt;" width="216">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa4" style="height: 15pt; width: 36pt;" width="48" height="20">
row 6
</td> <td class="oa5" style="width: 120pt;" width="160">
1103001MCMP
</td> <td class="oa5" style="width: 56pt;" width="75">
</td> <td class="oa5" style="width: 162pt;" width="216">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa3" style="height: 15pt; width: 36pt;" width="48" height="20">
row 7
</td> <td class="oa2" style="width: 120pt;" width="160">
1103002JPMP
</td> <td class="oa2" style="width: 56pt;" width="75">
</td> <td class="oa2" style="width: 162pt;" width="216">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa3" style="height: 15pt; width: 36pt;" width="48" height="20">
row 8
</td> <td class="oa2" style="width: 120pt;" width="160">
1103003KSMP
</td> <td class="oa2" style="width: 56pt;" width="75">
</td> <td class="oa2" style="width: 162pt;" width="216">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa3" style="height: 15pt; width: 36pt;" width="48" height="20">
row 9
</td> <td class="oa2" style="width: 120pt;" width="160">
1103003KSMP
</td> <td class="oa3" style="width: 56pt;" width="75">
11/8/2010
</td> <td class="oa3" style="width: 162pt;" width="216">
12/8/2010
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa3" style="height: 15pt; width: 36pt;" width="48" height="20">
row 10
</td> <td class="oa2" style="width: 120pt;" width="160">
1103003KSMP
</td> <td class="oa3" style="width: 56pt;" width="75">
11/8/2010
</td> <td class="oa2" style="width: 162pt;" width="216">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="oa3" style="height: 15pt; width: 36pt;" width="48" height="20">
row 11
</td> <td class="oa2" style="width: 120pt;" width="160">
1103005MCMP
</td> <td class="oa2" style="width: 56pt;" width="75">
</td> <td class="oa2" style="width: 162pt;" width="216">
</td> </tr> </tbody></table>


As you can see from the data the Announcement Number at times tends to repeat itself. For example, let’s look at rows 4, 5, 6. These rows all have the same Announcement Number, but might or might not have dates in the appropriate cells under columns B and C.



What I want the VBA code do for the specific Announcement Number (in this case 1103001MCMP) is to keep only the record where BOTH THE Issue Date and Certificate Returned to HR (Date) are GREATER than the other pair of dates with the same Announcement Number. In this case the winner and the record we would keep is row 4 and the other two would be deleted. This part is pretty straight forward but the CATCH is that this is only one scenario.


Next I outline all possible scenarios that I need translated into VBA code behind a command button. One thing before I get into the scenarios, when I import this data into excel I might have 15 rows or 500 rows it just depends on the data extract.


So below are the 3 business rules:


1. 1. If the Announcement Number is the same, then look at both columns B and C to see which PAIR of dates are greater than the other pair of dates in the other records with same Announcement Number and keep that record with the pair of dates that are greater and delete the rest of the records (IN THE CASE OF ROWS 9 AND 10 THE WINNER WOULD BE ROW 9 and in the case of rows 2 and 3 the winner would be row 2 and in the case of rows 4,5,6 the winner is row 4)


2. 2. If the Announcement Number is the same, but no dates exist for column C then look at the dates in column B and keep the greater of the two and delete the rest of the records


3. 3. If the Announcement Number is not a duplicate then obviously keep the record.


So in the dataset above would be reduced rows 2, 4, 9, 11 and all the other would be deleted.

Thanks in Advance
 

Excel Facts

Is there a shortcut key for strikethrough?
Ctrl+S is used for Save. Ctrl+5 is used for Strikethrough. Why Ctrl+5? When you use hashmarks to count |||| is 4, strike through to mean 5.

Forum statistics

Threads
1,224,525
Messages
6,179,317
Members
452,905
Latest member
deadwings

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top