Delete Duplicate Rows

Mark F

Well-known Member
Joined
Jun 7, 2002
Messages
513
Office Version
  1. 365
Platform
  1. Windows
Hi Guys

I have tried to search Mr Excel and can't quite find what I am looking for. I have tried a few examples that haven't worked for whatever reason - me probably

I have a sheet that has around 58000 rows with 5 columns.

I need to delete all of the duplicates of :oops: rows where there is a duplication in column C - in other words leaving just one entry of that row

This is a one off job so the number of rows is fixed as such

There are no blanks in any other 5 columns

Thanks for any help

Mark[/img]
 

Excel Facts

Can you sort left to right?
To sort left-to-right, use the Sort dialog box. Click Options. Choose "Sort left to right"
If a 1-time only shot, then via formula instead of macro --

1] in the 6th column, enter in the 1st row [ assumes row #1 ] =COUNTIF(F$1:F1,C1)
2] Copy down column F as far as needed.
3] hit F9 to ensure calculation.
4] sort columns A thru F by F
5] delete all rows here F > 1
6] delete column F

ALWAYS test on a back-up.
 
Upvote 0
This should also work, note however, as MyRng becomes very large, setting the union becomes bollox slow:

<font face=Courier New><SPAN style="color:darkblue">Sub</SPAN> Dude_Yer_Gittin_A_Del()
<SPAN style="color:darkblue">Dim</SPAN> Uni <SPAN style="color:darkblue">As</SPAN> <SPAN style="color:darkblue">New</SPAN> Collection, cl <SPAN style="color:darkblue">As</SPAN> Range, myRng <SPAN style="color:darkblue">As</SPAN> Range
Application.ScreenUpdating = <SPAN style="color:darkblue">False</SPAN>
<SPAN style="color:darkblue">On</SPAN> <SPAN style="color:darkblue">Error</SPAN> <SPAN style="color:darkblue">Resume</SPAN> <SPAN style="color:darkblue">Next</SPAN>

<SPAN style="color:darkblue">For</SPAN> <SPAN style="color:darkblue">Each</SPAN> cl <SPAN style="color:darkblue">In</SPAN> Range([c1], [c65536].End(3))
    Uni.Add cl.Value, <SPAN style="color:darkblue">CStr</SPAN>(cl.Value)
    <SPAN style="color:darkblue">If</SPAN> Err.Number <> 0 <SPAN style="color:darkblue">Then</SPAN>
        Err.Clear
        <SPAN style="color:darkblue">If</SPAN> myRng <SPAN style="color:darkblue">Is</SPAN> <SPAN style="color:darkblue">Nothing</SPAN> <SPAN style="color:darkblue">Then</SPAN> <SPAN style="color:darkblue">Set</SPAN> myRng = cl Else _
            <SPAN style="color:darkblue">Set</SPAN> myRng = Union(cl, myRng)
    <SPAN style="color:darkblue">End</SPAN> <SPAN style="color:darkblue">If</SPAN>
<SPAN style="color:darkblue">Next</SPAN> cl

<SPAN style="color:darkblue">Set</SPAN> Uni = <SPAN style="color:darkblue">Nothing</SPAN>
myRng.EntireRow.Delete

Application.ScreenUpdating = <SPAN style="color:darkblue">True</SPAN>
<SPAN style="color:darkblue">End</SPAN> <SPAN style="color:darkblue">Sub</SPAN></FONT>

I didn't necessarily want to write this as such, but a different line of logic, which seemed much better, wasn't working as I expected it to...
 
Upvote 0
Just Jon / Nate

Jon's version worked well on a smaller edited version, and gave the number of entries, but my system locked when I tried on the 58000 lines

I will continue to try both options tomorrow to see what happens

I guess I will have to be patient considering the number of lines to deal with - not easy!!

Thanks for your help
 
Upvote 0
You are welcome. :)

That is a lot of data... Another thought is to append your data into an Access Table, and run a Select Distinct sql statement against it. This would probably be a lot faster (like a few seconds).
 
Upvote 0
Try this code.

Please Note: Dont forget to take backup of your data:

Sub deletingduplicate1()

Dim unique As Range


'Set unique = Range(ActiveCell.CurrentRegion)' if you have only one column.
Set unique = Range("c1:c58000") ' presuming you have data from c1 to c5800

unique.sort key1:=ActiveCell

' above line of code to check unique only

unique.Select
ActiveCell.Activate

'Debug.Print ActiveCell.Address

Do While Not IsEmpty(ActiveCell)
If ActiveCell.Value = ActiveCell.Offset(-1, 0).Value Then
GoTo Label1
Else
ActiveCell.Offset(1, 0).Select
Label1:
If ActiveCell.Value = ActiveCell.Offset(-1, 0).Value Then
'If ActiveCell.Value = ActiveCell.Offset(-1, 0).Value And ActiveCell.Offset(0, 1).Value = ActiveCell.Offset(-1, 1).Value Then
ActiveCell.EntireRow.Delete
End If
End If
Loop

End Sub

GO to VBE:

1.by pressing ALT + F11 key or you can go by Tools -> Macror -> VBE Editor.

2.Here you will see VBA PROJECTS
3.Here right click on in and then insert -> MODULE

Here paste this code.

see if it works.. and do post how much time it took.
 
Upvote 0
nisht - - Why would you want to activate and loop through 58,000 cells, each with an If condition? Yukk.

Black Death said:
I have a sheet that has around 58000 rows with 5 columns.I need to delete all of the duplicates of rows where there is a duplication in column C - in other words leaving just one entry of that row
This is a one off job so the number of rows is fixed as such
On the face of it, based on your description, try this...it works in seconds and you don't need a macro. Since you are going to delete a lot of records anyway, just advance filter column C for unique values, and copy the visible rows to a new sheet.

Step 1
Create a new sheet (Insert > Worksheet).

Step 2
Go to your original sheet, and click on the "C" column header to select column C.

Step 3
Click on Data > Filter > Advance filter > select "Filter the list in place", enter $C:$C in the List range field, leave the Criteria range field blank, put a checkmark in the box next to Unique records only, and click OK.

Step 4
Click on the Row 1 header to select all of row 1. Then press Shift+Ctrl+Arrow Down to select all rows, then press Alt+; (that's the Alt key and the semicolon key) to select all visible rows.

Step 5
Click Edit > Copy, then go to the sheet you created in Step1, click in cell A1, and press Ctrl+V or Edit > Paste. Hit the Esc key to exit copy mode.



The macro to do that would be something like:

Sub Test1()
Application.ScreenUpdating = False
With Columns(3)
.AdvancedFilter Action:=xlFilterInPlace, Unique:=True
.EntireRow.SpecialCells(xlCellTypeVisible).Copy
End With
Sheets.Add(After:=Sheets(Sheets.Count)).Name = "MyNewSheet"
Range("A1").Select
ActiveSheet.Paste
Application.CutCopyMode = False
Application.ScreenUpdating = True
End Sub
 
Upvote 0
Tom,
Thank you for good suggestion. I also show the link of yogi where i suggested countif function to locate the unique values.
In that case i want to say that you can filter the data with zero (actually i did not mentioned that on the post presuming that poster will find to use Advance criteria range)
.. and yogi's excellent example showing the use of advance filter function with formula ..which i believe will give answere to lots of query related to duplicate value in column.

code i pasted here was to get some input from Black Death for the time it takes to delete the duplicate values in cells in a column. ( I posted this code before but i failed to get the input.)

actually the code will not check all 58000 values because it will delete duplicate values.

i looped through empty cells ..65000 of them and it took 10 seconds.. now wait for Black Death.. to post how much time will actually it take to perform by using this code.

I will be obliged if he does that.

In my code I am not using union function.. as nate has used.. so thought to post my code. because union range is not fast thats what I presume.
 
Upvote 0
Guys

Thanks for your help

I tried Nates and Nishts versions without a great deal of success - my system appeared to crash and the file was autosaved.

Sorry that I can't give Nisht the info wanted on time taken for his version

Toms version , both the manual and the code version did exactly what I needed it to.

Thankyou again

Mark
 
Upvote 0

Forum statistics

Threads
1,215,327
Messages
6,124,280
Members
449,149
Latest member
mwdbActuary

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top