Data extraction & duplicate elimination

nathanhutto

New Member
Joined
Dec 27, 2010
Messages
4
Hi all,

I am facing a two-pronged challenge:

1. I have a workbook with school rosters. Each sheet in the book is a school roster from a different week (i.e., sheet 1 = 06/01/10-06/08/10, sheet 2 = 06/09/10-06/16/10, etc). Most of the students are simply carried over week to week, although each week, some students are enrolled and some exit. I have a full year of this. I want to extract three pieces of info from each sheet into another workbook: first name, last name, date of birth. This information is included in each row. I could copy and paste this info, but that is time consuming and tedious. Is there another way?

2. Assuming #1 is possible, I will then be faced with duplicate records. This is because, from week to week, most students are carried over from the previous worksheet. Is it possible to do #1, but omit duplicates? Or, how can I eliminate duplicates once I've extracted all student names and DOBs? I can sort and manually delete, but there must be a simpler way.

I'm happy to answer follow up questions. Thanks in advance for any feedback.

Best,
Nathan
 

Excel Facts

What is =ROMAN(40) in Excel?
The Roman numeral for 40 is XL. Bill "MrExcel" Jelen's 40th book was called MrExcel XL.
1. What version of Excel are you using?

2. What columns are first name, last name, date of birth in on each sheet? Do the columns have headings in row 1?

3. What is the name of this workbook that contains all these sheets?

4. Are there any other sheets in this workbook apart from these weekly lists? If so, what are they as they will need to be by-passed in this process.

5. Does the 'other workbook' already exist? If so,
- what is its name?
- what sheet in it do the results go on?
- if there is already data in that sheet should it be removed?
 
Last edited:
Upvote 0
Hi Peter,

Thanks for the questions. Here are the answers:

1. 2010
2. Last name is column D, first is E, and DOB is H. There is a title in row 1, headings in row 2, and data begins in row 3.
3. The name of the workbook is "JP Roster."
4. There are no other sheets.
5. The other workbook does exist. It's called "FY11 CCC Analysis." The results go in the sheet "FY11 CCC Analysis." There is data in this sheet already. I had initially experimented with a vlookup (until I realized that it was too cumbersome a process). Last name, first name, and DOB (this column currently blank) are in columns A, B, and C respectively.

Any help you could provide would be greatly appreciated!

Thanks,
Nathan
 
Upvote 0
Nathan

You didn't answer my last question about removing any existing data from the destination sheet. I have assumed that (apart from the headings) the existing data is to be deleted.

Post back if this is incorrect and you need help to modify the code to allow for that.

Please ensure you have backup copies of your workbooks before testing the code below.

I have also assumed that both workbooks will already be open when the code is run.

VBA Code:
Sub Roster_List()
  Dim wb As Workbook, wbJP As Workbook
  Dim ws As Worksheet, wsDest As Worksheet
  Dim nr As Long, lr As Long, i As Long, Rws As Long, Cols As Long
  Dim sCols As String
  Dim vRows, vData, aCols, aDupCols
  
  aCols = Array("D", "E", "H") '<-- JP Roster cols to copy
  Const fr As Long = 3 '<-- 1st row actual data in JP Roster
  
  Application.ScreenUpdating = False
  For Each wb In Workbooks
    If wb.Name Like "FY11 CCC Analysis*" Then
      Set wsDest = wb.Sheets("FY11 CCC Analysis")
    End If
    If wb.Name Like "JP Roster*" Then
      Set wbJP = wb
    End If
  Next wb
  wsDest.UsedRange.Offset(1).ClearContents
  nr = 2
  Cols = UBound(aCols) - LBound(aCols) + 1
  ReDim Preserve aCols(1 To Cols)
  ReDim aDupCols(0 To Cols - 1)
  For i = 1 To Cols
    sCols = sCols & " " & Columns(aCols(i)).Column
    aDupCols(i - 1) = i
  Next i
  sCols = Replace(sCols, " ", "", 1, 1)
  For Each ws In wbJP.Worksheets
    lr = ws.UsedRange.Rows.Count
    If lr >= fr Then
      Rws = lr - fr + 1
      vRows = Evaluate("row(" & fr & ":" & lr & ")")
      vData = Application.Index(ws.Cells, vRows, Split(sCols))
      wsDest.Cells(nr, 1).Resize(Rws, Cols).Value = vData
      nr = nr + Rws
    End If
  Next ws
  wsDest.UsedRange.RemoveDuplicates _
  Columns:=(aDupCols), Header:=xlYes
  Application.ScreenUpdating = True
End Sub
 
Last edited:
Upvote 0

Forum statistics

Threads
1,215,219
Messages
6,123,684
Members
449,116
Latest member
HypnoFant

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top