amerifax-heather
New Member
- Joined
- May 29, 2011
- Messages
- 3
I am going to try my best to explain my situation. I am using Excel 2007. I have an extremely large data file (1,010,165 records to be exact). It is a combination of 2 different data files (2010 and 2005) with the same exact structure. Between the 2 files I need to know if I have any duplicates in 2 different scenarios, Columns A-F and Columns B-E. I want to keep the record from 2010 and mark it as MST and delete the record from 2005 (eventually), before it is deleted I need to have a record of that duplicate in the MST record as either 2005(A-F) or 2005(B-E) in a field called "ORG".
Columns A-F contain - Name, Addr, City, St, Zip, Phn
Columns B-E contain - Addr, City, St, Zip, Phn
What I need to determine is this...Each of the records have an identifier in a Field called "YR-Q" which contains either 2010 or 2005. I need to search the records and find out where there is a duplicate using columns A-F. When a duplicate is found, I need to mark the 1st rec (if YR-Q=2010 as MST or if YR-Q=2005 as DUP). I also need to record the result in the MST record as "2005(A-F)" in a field called "ORG". And then run the same scenario using columns B-E.
The other thing I am going to need to know is if the Address information (Columns B-E) is unique (not duplicated) where YR-Q=2005. If this is the case I would like to put the value "2005(ADDR)" in the "ORG" field.
I should also tell you that before the files are joined together each individual file is checked for duplicates within the file and all duplicates removed. So the only occurrence of a duplicate will occur between the 2010 file and the 2005 file.
I hope I've explained myself well enough. Thank you in advance for any help or guidance you can give.
Heather
Columns A-F contain - Name, Addr, City, St, Zip, Phn
Columns B-E contain - Addr, City, St, Zip, Phn
What I need to determine is this...Each of the records have an identifier in a Field called "YR-Q" which contains either 2010 or 2005. I need to search the records and find out where there is a duplicate using columns A-F. When a duplicate is found, I need to mark the 1st rec (if YR-Q=2010 as MST or if YR-Q=2005 as DUP). I also need to record the result in the MST record as "2005(A-F)" in a field called "ORG". And then run the same scenario using columns B-E.
The other thing I am going to need to know is if the Address information (Columns B-E) is unique (not duplicated) where YR-Q=2005. If this is the case I would like to put the value "2005(ADDR)" in the "ORG" field.
I should also tell you that before the files are joined together each individual file is checked for duplicates within the file and all duplicates removed. So the only occurrence of a duplicate will occur between the 2010 file and the 2005 file.
I hope I've explained myself well enough. Thank you in advance for any help or guidance you can give.
Heather
Last edited: