Extracting Un-Matched Data from 30000 lines!

I heart Excel

Board Regular
Joined
Feb 28, 2011
Messages
66
Hello :)

I have two systems with "supposedly" the same information!! One feeds into the other, unfortunately we still have a few problems which is enabling users to manually amend.<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p>
<o:p></o:p>
Once a week I run checks to make sure the systems are still matching up and that no one has manually amended one and not the other, which always happens!
<o:p></o:p>
There are 30000 odd lines and removing duplicates, advanced filtering and highlighting dubs and then filtering isn't working.<o:p></o:p>
<o:p></o:p>
I think this is because there are just too many lines. I need to filter out the information where duplicate codes don't exist. For example - <o:p></o:p>
<o:p></o:p>
Code Description Price System 1 or 2?<o:p></o:p>
000000 Item 1 1.00 1<o:p></o:p>
000000 Item 1 1.00 2<o:p></o:p>
000001 Item 2 1.50 1<o:p></o:p>
000001 Item 2 2.00 2<o:p></o:p>
000002 Item 3 3.00 1<o:p></o:p>
000002 Item 3 3.00 2<o:p></o:p>
000003 Item 4 2.50 2<o:p></o:p>

I want the duplicate lines to dissappear and leave me the ones that don't. In the above case Code 000001 (x2 lines) and 000003 (added to one but not the other!) should remain for me to look into.

Is there a way this can be done? If you need anymore information please let me know.

Thanks in advance for any help!!
 

Excel Facts

Remove leading & trailing spaces
Save as CSV to remove all leading and trailing spaces. It is faster than using TRIM().
Is the data sorted so that duplicate lines are together?
 
Upvote 0
Assuming the values are in columns A, B, C & D....
Starting on the second row of data, in column G (random free column)…
=AND(INDIRECT("A"&ROW())=INDIRECT("A"&ROW()-1),INDIRECT("B"&ROW())=INDIRECT("B"&ROW()-1),INDIRECT("C"&ROW())=INDIRECT("C"&ROW()-1))

The cells in cols A, B & C will be compared to the same cells in the row above. If TRUE, the row is a duplicate of the row above

Then another column to look at the TRUE/FALSE values in G and the System No.
=IF(OR(INDIRECT("G"&ROW()),AND(INDIRECT("G"&ROW()+1),INDIRECT("D"&ROW())=1)),"Dupe","Not Dupe")

I've used INDIRECT so that the formulas are not row specific
The first row of data may give you an incorrect answer - haven't fully checked this.

Worth a try !
 
Upvote 0
Have a formula to flag the rows you want, like this:

Excel Workbook
ABCDEF
1CodeDescriptionPriceSystem1 or 2?Filter
20Item111TRUE
30Item112TRUE
41Item21.51FALSE
51Item222FALSE
62Item331TRUE
72Item332TRUE
83Item42.52FALSE
Sheet3


Filter to hide the TRUE values in column F.
 
Upvote 0
Test this in a copy of your workbook.

I've assumed data is in columns A:D.



<font face=Courier New><br><SPAN style="color:#00007F">Sub</SPAN> Unmatched()<br>    <SPAN style="color:#00007F">Dim</SPAN> d, a<br>    <SPAN style="color:#00007F">Dim</SPAN> r <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN>, LR <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN>, k <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN><br>    <SPAN style="color:#00007F">Dim</SPAN> s <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">String</SPAN><br>    <br>    LR = Range("A" & Rows.Count).End(xlUp).Row<br>    d = Range("A1:C" & LR).Value<br>    <SPAN style="color:#00007F">ReDim</SPAN> a(1 <SPAN style="color:#00007F">To</SPAN> LR, 1 <SPAN style="color:#00007F">To</SPAN> 1)<br>    r = 1<br>    <SPAN style="color:#00007F">Do</SPAN><br>        r = r + 1<br>        <SPAN style="color:#00007F">If</SPAN> d(r, 1) = d(r - 1, 1) <SPAN style="color:#00007F">Then</SPAN><br>            <SPAN style="color:#00007F">If</SPAN> d(r, 2) = d(r - 1, 2) <SPAN style="color:#00007F">Then</SPAN><br>                <SPAN style="color:#00007F">If</SPAN> d(r, 3) = d(r - 1, 3) <SPAN style="color:#00007F">Then</SPAN><br>                    k = k + 2<br>                    a(r, 1) = 1<br>                    a(r - 1, 1) = 1<br>                    r = r + 1<br>                <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">If</SPAN><br>            <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">If</SPAN><br>        <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">If</SPAN><br>    <SPAN style="color:#00007F">Loop</SPAN> <SPAN style="color:#00007F">While</SPAN> r < LR<br>    <SPAN style="color:#00007F">If</SPAN> k > 0 <SPAN style="color:#00007F">Then</SPAN><br>        Range("E1:E" & LR).Value = a<br>        <SPAN style="color:#00007F">With</SPAN> Range("A1:E" & LR)<br>            .Sort Key1:=.Cells(2, .Columns.Count), _<br>                Order1:=xlAscending, Header:=xlGuess<br>            .Offset(1).Resize(k).EntireRow.Delete<br>        <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">With</SPAN><br>    <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">If</SPAN><br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Sub</SPAN><br></FONT>
 
Upvote 0
Thank you both for your help, both worked, so I will play and see which one is best suited for the work I am doing.

Thank you, have a nice weekend
 
Upvote 0

Forum statistics

Threads
1,224,552
Messages
6,179,487
Members
452,917
Latest member
MrsMSalt

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top