Need help with complex data extraction

Goofi

New Member
Joined
Feb 10, 2005
Messages
9
Hi,

I have a complex problem for extracting information from a master spreadsheet.

Columns = ACCOUNT | CUSTOMER | TYPE | SERVICE | EMPLOYEES

I am trying to find duplicated account numbers from a list of 40,000. I created the master by extracting information from two seperate spreadsheets (A and B) with similar formats above. Both spreadsheets contain information required for the account numbers. However, the information is sporadic on both. Some of the data was complete and some wasn't. Hence the creation of a master to consolidate.

I've noticed after sorting by the ACCOUNT's column I have duplication but some details are missing from the other columns in some entries compared to the other, for example:

Sorted by ACCOUNT.

Account |Customer | Type | Service | Employees

123456 |ABC | | Star |

123456 |ABC | BUS | Star | 10

123456 |ABC | | |

Ideally, I would like the second entry with all the complete information to remain in the master and the incomplete ones to be extracted to another worksheet or book for record keeping.
I tried using Advanced filter and checking Unique Records Only but it does not help as each entry is still unique.
Therefore, I would still have to go through all 40000 rows to see which entry to keep out of the duplicates. Which, is a massive undertaking considering that this is only one file of fifty I have to create!

Is there a 'simple' code, function or formula to automate the search through the spreadsheet for duplicates and extract the entries that do not have complete information? I am not too savvy with Excel so I would appreciate some direction.

BTW - I am using Excel 2000.

Thank you in advance

Goofi.

:confused:
 

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
HI

Assuming that you have your source data starting in A1 and filling columns A:E:

1) Generate a unique list of the account numbers in column G. This can be done using the advanced filter.
2) IN cell H1 array enter the formula
=INDEX($B$1:$B$4,MAX(($A$2:$A$4=G1)*(NOT(ISBLANK($B$2:$B$4)))*ROW($B$2:$B$4)))
3) In cell I1 array enter the formula
=INDEX($C$1:$C$4,MAX(($A$2:$A$4=G1)*(NOT(ISBLANK($C$2:$C$4)))*ROW($B$2:$B$4)))
4) In cell J1 array enter the formula
=INDEX($D$1:$D$4,MAX(($A$2:$A$4=G1)*(NOT(ISBLANK($D$2:$D$4)))*ROW($B$2:$B$4)))
5) In cell K1 array enter the formula
=INDEX($E$1:$E$4,MAX(($A$2:$A$4=G1)*(NOT(ISBLANK($E$2:$E$4)))*ROW($B$2:$B$4)))

To array enter press the shift key, ctrl key and enter at the same time.

You will have to adjust the ranges to suit your data. Once they are entered, copy them down. This will give a list of the most complete information for each account number.

Once you have this list, you can (a) copy to the master list and (b) find the entries that provide a complete match and delete them. SUMPRODUCT can be used to identify those that match.

HTH

Tony
 
Upvote 0
In a separate column enter the formula:

=counta(a2:f2)

This will give you the count of non-blank cells.

For the Advanced Filter, set up the criteria with Account and the new Column. extract unique values to Account criteria an extract the max values to new column.

With this criteria, unique records can be extracted.
 
Upvote 0

Forum statistics

Threads
1,203,617
Messages
6,056,310
Members
444,858
Latest member
ucbphd

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top