Difficulty deleting duplicates

hzarry

New Member
Joined
Aug 21, 2015
Messages
18
I'm having some problems with this dataset below and don't know what to do.

For each group of of common ids I want to delete their respective duplicates based on their status and their date and time stamp. If the status for all the common trade ids is "rejected" then I want to delete all trade id except the latest one submitted.

If the common trade ids all have statuses of "Accepted" I want to keep the earliest instance of this ID and delete all those ones that came after.

In all other cases, for instance if a group of common IDs have statuses of "Accepted" and "Rejected" then i want to keep the last submission and delete all other common trades.

The trades in bold below are the ones i want to keep. Any help would be greatly appreciated!



Submission Date</SPAN>Time Stamp</SPAN>ID</SPAN>Status</SPAN>
01/06/2015</SPAN>17:36:10</SPAN>116082XK</SPAN>Rejected</SPAN>
03/06/2015</SPAN>00:10:26</SPAN>116082XK</SPAN>Rejected</SPAN>
11/06/2015</SPAN>00:16:57</SPAN>116082XK</SPAN>Rejected</SPAN>
11/06/2015</SPAN>06:18:05</SPAN>116082XK</SPAN>Rejected</SPAN>
14/08/2015</SPAN>15:36:58</SPAN>116082XK</SPAN>Rejected</SPAN>
01/06/2015</SPAN>17:34:01</SPAN>117311XK</SPAN>Accepted</SPAN>
02/06/2015</SPAN>00:15:25</SPAN>117311XK</SPAN>Accepted</SPAN>
04/06/2015</SPAN>00:15:13</SPAN>117311XK</SPAN>Accepted</SPAN>
11/06/2015</SPAN>00:16:57</SPAN>117311XK</SPAN>Accepted</SPAN>
11/06/2015</SPAN>06:18:32</SPAN>117311XK</SPAN>Accepted</SPAN>
05/08/2015</SPAN>15:54:45</SPAN>117311XK</SPAN>Accepted</SPAN>
06/08/2015</SPAN>16:38:13</SPAN>117311XK</SPAN>Accepted</SPAN>
14/08/2015</SPAN>15:38:34</SPAN>117311XK</SPAN>Accepted</SPAN>
01/06/2015</SPAN>17:33:20</SPAN>118202XK</SPAN>Rejected</SPAN>
02/06/2015</SPAN>00:15:25</SPAN>118202XK</SPAN>Rejected</SPAN>
03/06/2015</SPAN>00:10:20</SPAN>118202XK</SPAN>Rejected</SPAN>
11/06/2015</SPAN>00:16:58</SPAN>118202XK</SPAN>Rejected</SPAN>
11/06/2015</SPAN>06:18:30</SPAN>118202XK</SPAN>Rejected</SPAN>
14/08/2015</SPAN>14:01:41</SPAN>118202XK</SPAN>Accepted</SPAN>
14/08/2015</SPAN>15:37:47</SPAN>118202XK</SPAN>Rejected</SPAN>
01/06/2015</SPAN>17:32:38</SPAN>110711XK</SPAN>Rejected</SPAN>
03/06/2015</SPAN>00:10:29</SPAN>110711XK</SPAN>Rejected</SPAN>
04/06/2015</SPAN>00:15:13</SPAN>110711XK</SPAN>Rejected</SPAN>
11/06/2015</SPAN>00:16:58</SPAN>110711XK</SPAN>Rejected</SPAN>
17/06/2015</SPAN>06:19:17</SPAN>110711XK</SPAN>Rejected</SPAN>
05/08/2015</SPAN>09:52:24</SPAN>110711XK</SPAN>Rejected</SPAN>
05/08/2015</SPAN>15:24:22</SPAN>110711XK</SPAN>Accepted</SPAN>

<TBODY>
</TBODY><COLGROUP><COL><COL><COL><COL></COLGROUP>
 

Excel Facts

Excel Joke
Why can't spreadsheets drive cars? They crash too often!
This is probably an over complicated solution but here goes. Assuming your data is in cells A1:D28, input this in cell E2 and apply it to the rest of the column:
=IF(AND(COUNTIFS(C2:C28,C2,D2:D28,"Rejected")>0,(COUNTIFS(C2:C28,C2,D2:D28,"Accepted")=0)),IF(COUNTIFS(C2:C28,C2,D2:D28,"Rejected")=1,"Keep","Delete"),IF(AND(COUNTIFS(C2:C28,C2,D2:D28,"Rejected")=0,(COUNTIFS(C2:C28,C2,D2:D28,"Accepted")>0)),IF(COUNTIFS(C2:C28,C2,D2:D28,"Accepted")=MAX(COUNTIFS(C$2:C28,C2,D$2:D28,"Accepted")),"Keep","Delete"),IF(AND(COUNTIFS(C2:C28,C2,D2:D28,"Rejected")>0,(COUNTIFS(C2:C28,C2,D2:D28,"Accepted")>0)),IF(COUNTIF(C2:C28,C2)=1,"Keep","Delete"))))
This will return either a "Keep" or a "Delete" result. By filtering on column E, you can simply manually delete the lines that have "Delete" in column E
Simply adapt this to your actual ranges but not full column ranges (keep the C2:C28 format and not C:C). Also, make sure not to lock the cells except where it's bolded in the formula.
 
Upvote 0
Thanks! This worked perfectly on the sample data I gave! How would I adjust this formula so that it can cater for rows of different numbers and not just 2-28. I am trying to build a macro with this formula incorparated into it but need it in such a way so that the range is C:C as opposed to C2:C28.

Thanks again
 
Upvote 0
To be honest, I don't know how you would adapt the formula in a macro. As it stands now, the formula is doing countifs to determine which line to keep. As you apply it to a different row, the range changes as well (in row 2 the reference is C2:C:28 but in row4 the reference becomes C4:C30). This is important in determining which line is the first or last line that needs to be kept. You can always extend the range to C2:C5000 or further in the formula but as far as macros go I am a noob.
 
Upvote 0

Forum statistics

Threads
1,215,518
Messages
6,125,291
Members
449,218
Latest member
Excel Master

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top