Combine Spreadsheets

mcho

New Member
Joined
Nov 22, 2011
Messages
4
Excel 2007

I would like to combine a number of spreadsheets that are very different but have one common column: Course number. All the course numbers are from 3 to 5 digits and are of the same data-type in all spreadsheets (numeric, general).

Each spreadsheet has different information about courses (mostly text and dates) that may have different cell formatting. All were created by different people at different times for different reasons. None of the spreadsheets have exactly the same list of course numbers. All the spreadsheets contain the majority of the course numbers but some may be missing some numbers, and others may have more numbers.

My goal is to combine all the data for each course number into one spreadsheet. Obviously some courses will be missing data because that course number was not on the particular spreadsheet that had that data.

So the final spreadsheet would have all the unique courses numbers (from all the spreadsheets) and all the columns of data from those same spreadsheets.

A simple example is this: One spread sheet has the course numbers 1,4,5,6,9, and has the name of the course owner for each course. Another spreadsheet has courses 1, 3,4 6,7,8,9 and has the date the course was last reviewed.

The final spread sheet would have the course numbers 1,3,4,5,6,7,8,9 and have column for the course owner and one for the date it was last reviewed. Of course some cells would be blank since there is no data.

Of course the real problem is much more complex with 600 course numbers and about a dozen spreadsheets, each with multiple columns. Which is why it can't be done by hand.

I have tried Vlookup but it will not work because the list of course numbers in each spreadsheet is not identical. As soon as it gets to a number that is not in the same position in both spreadsheets it starts providing the dreaded "#NA." I also tried consolidate, but that was a nonstarter.
 

Excel Facts

Is there a shortcut key for strikethrough?
Ctrl+S is used for Save. Ctrl+5 is used for Strikethrough. Why Ctrl+5? When you use hashmarks to count |||| is 4, strike through to mean 5.
Just to clarify:

1) You have many spreadsheet in one directory? What is the directory?
2) What are the files extension? .xls?
3) Do you have the same title called "course numbers" for all the different spreadsheets?
4) Is the "Course numbers" column the same among all the spreadsheet?
4) Are the titles in the spreadsheet store in the same row? Which row?
 
Upvote 0
Just to clarify:

1) You have many spreadsheet in one directory? What is the directory?

I can create whatever is needed. I did put them all the same directory. The directory is "Team Leader Stuff" but I can create whatever is needed. Everything is arbitrary and so there is a great deal of flexibility with respect to manipulating the files and data.

2) What are the files extension? .xls?
Yes. I always save in the old format. But I can convert if needed.

3) Do you have the same title called "course numbers" for all the different spreadsheets?
Yes, identical

4) Is the "Course numbers" column the same among all the spreadsheet?

Yes it is the first column.

4) Are the titles in the spreadsheet store in the same row? Which row?

Yes in the first row.
 
Last edited:
Upvote 0
Hi mcho,

Here is the code, (So far the longest that I have written):

Change the file directory where highlighted in red


But please have this assumtion (Please follow strictly):
  1. Assume each Excel workbook has only one sheet (if it is not, then at least the active sheet is the one contains the data)
  2. FileDir store the Directory that all the file contains
  3. Excel extension is .xls
  4. Each Excel file has a title at row 1
  5. Each Column has a title
  6. This Main file has two Column with the title "Course Number" and "Modified Date"
  7. The macro start at the main file
Code:
Sub GetFileList()
    'Assume each Excel workbook has only one sheet
    'FileDir store the Directory that all the file contains
    'Excel extension is .xls
    'Each Excel file has a title at row 1
    'Each Column has a title
    'This Main file has two Column with the title "Course Number" and "Modified Date"
    'The macro start at the main file
 
 
    Dim FileArray() As Variant
    Dim FileCount, i, ii, last_column, last_column_m, last_row_m As Integer
    Dim FileDir, FileSearch, FileName, Main_workbook As String
    Dim my_range As Range
    Dim Record_column_array()
    Dim Record_column_status_array()
    Dim Children_file_value()
    Dim r, CurrFileDate
 
 
 
    FileDir = [COLOR=red]"C:\Documents and Settings\Alvin Wong\Desktop\New Folder\"[/COLOR]
    FileSearch = "*.xls"
 
 
    FileCount = 0
    FileName = Dir(FileDir & FileSearch)
 
    Main_workbook = ActiveWorkbook.name
 
    Do While FileName <> ""
        CurrFileDate = FileDateTime(FileName)
        Workbooks.Open FileName:= _
        FileDir & FileName
 
 
        With Workbooks(FileName).Sheets(ActiveSheet.name)
            'Record all the Title
            last_column = .Range(.Cells(1, .Columns.count), .Cells(1, .Columns.count)).End(xlToLeft).Column
            If last_column >= 2 Then
                Set my_range = .Range(.Cells(1, 2), .Cells(1, last_column))
                Erase Record_column_array, Record_column_status_array, Children_file_value
                ReDim Preserve Record_column_array(1 To last_column - 1)
                ReDim Preserve Record_column_status_array(1 To last_column - 1)
                ReDim Preserve Children_file_value(1 To last_column - 1)
                i = 1
                For Each r In my_range
                    If r.Value <> "" Then
                        With Workbooks(Main_workbook).Sheets(ActiveSheet.name)
                            Set c = .Range("1:1").Find(r.Value, LookIn:=xlValues, lookat:=xlWhole)
                            If Not c Is Nothing Then
                                Record_column_array(i) = c.Column
                                Record_column_status_array(i) = 1 'Repeated Column
                                i = i + 1
                            Else
                                last_column_m = .Range(.Cells(1, .Columns.count), .Cells(1, .Columns.count)).End(xlToLeft).Column + 1
                                .Range(.Cells(1, last_column_m), .Cells(1, last_column_m)).Value = r.Value
                                Record_column_array(i) = last_column_m
                                Record_column_status_array(i) = 0 'non Repeated Column
                                i = i + 1
                            End If
                        End With
                    End If
                Next
 
                'Record all the Value
                x = 2
                Do
                Course_Number = .Range("A" & x).Value
                For ii = 1 To i - 1
                    Children_file_value(ii) = .Range(.Cells(x, ii + 1), .Cells(x, ii + 1)).Value
                Next
                With Workbooks(Main_workbook).Sheets(ActiveSheet.name)
                    Set c = .Range("A:A").Find(Course_Number, LookIn:=xlValues, lookat:=xlWhole)
                    If Not c Is Nothing Then
                        .Range("B" & c.Row).Value = CurrFileDate
                        For ii = 1 To i - 1
                            If Record_column_status_array(ii) = 1 Then
                                If CurrFileDate >= .Range("B" & c.Row).Value Then
                                    .Range(.Cells(c.Row, Record_column_array(ii)), .Cells(c.Row, Record_column_array(ii))).Value = Children_file_value(ii)
                                End If
                            Else
                                .Range(.Cells(c.Row, Record_column_array(ii)), .Cells(c.Row, Record_column_array(ii))).Value = Children_file_value(ii)
                            End If
                        Next
                    Else
                        last_row_m = .Range("A" & Rows.count).End(xlUp).Row + 1
                        .Range("A" & last_row_m).Value = Course_Number
                        .Range("B" & last_row_m).Value = CurrFileDate
                        For ii = 1 To i - 1
                            .Range(.Cells(last_row_m, Record_column_array(ii)), .Cells(last_row_m, Record_column_array(ii))).Value = Children_file_value(ii)
                        Next
                    End If
                End With
                x = x + 1
                Loop While .Range("A" & x).Value <> ""
            End If
        End With
 
        Workbooks(FileName).Close
        FileName = Dir()
    Loop
 
 
End Sub
 
Upvote 0
THANK YOU!! Sorry for the long delay in replying to you but this was Thanksgiving in the US and I have been away since last Wednesday. I have not had a chance to try this yet as I have just returned. But I will let you know how it goes.

By the way, while I was on the plane, I converted the spreadsheets to tables, but being new to Libre Office (which is what I have on my laptop - I can't afford to have Microsoft on every computer I own.) I haven't figured out how to query both tables into a usable. If this script works then it will be a moot point.
 
Upvote 0
There is a small correction below:

Code:
Sub GetFileList()
    'Assume each chirExcel workbook has only one sheet
    'FileDir store the Directory that all the file contains
    'Excel extension is .xls
    'Each Excel file has a title at row 1
    'Each Column has a title
    'This Main file has two Column with the title "Course Number" and "Modified Date"
    'The macro start at the main file
    
    
    Dim FileArray() As Variant
    Dim FileCount, i, ii, last_column, last_column_m, last_row_m As Integer
    Dim FileDir, FileSearch, FileName, Main_workbook As String
    Dim my_range As Range
    Dim Record_column_array()
    Dim Record_column_status_array()
    Dim Children_file_value()
    Dim r, CurrFileDate
    
   
   
    FileDir = "C:\Documents and Settings\Alvin Wong\Desktop\New Folder\"
    FileSearch = "*.xls"
    
  
    FileCount = 0
    FileName = Dir(FileDir & FileSearch)
    
    Main_workbook = ActiveWorkbook.name
    
    Do While FileName <> ""
        CurrFileDate = FileDateTime([COLOR=red]FileDir & FileName[/COLOR])
        Workbooks.Open FileName:= _
        FileDir & FileName
        
            
        With Workbooks(FileName).Sheets(ActiveSheet.name)
            'Record all the Title
            last_column = .Range(.Cells(1, .Columns.count), .Cells(1, .Columns.count)).End(xlToLeft).Column
            If last_column >= 2 Then
                Set my_range = .Range(.Cells(1, 2), .Cells(1, last_column))
                Erase Record_column_array, Record_column_status_array, Children_file_value
                ReDim Preserve Record_column_array(1 To last_column - 1)
                ReDim Preserve Record_column_status_array(1 To last_column - 1)
                ReDim Preserve Children_file_value(1 To last_column - 1)
                i = 1
                For Each r In my_range
                    If r.Value <> "" Then
                        With Workbooks(Main_workbook).Sheets(ActiveSheet.name)
                            Set c = .Range("1:1").Find(r.Value, LookIn:=xlValues, lookat:=xlWhole)
                            If Not c Is Nothing Then
                                Record_column_array(i) = c.Column
                                Record_column_status_array(i) = 1 'Repeated Column
                                i = i + 1
                            Else
                                last_column_m = .Range(.Cells(1, .Columns.count), .Cells(1, .Columns.count)).End(xlToLeft).Column + 1
                                .Range(.Cells(1, last_column_m), .Cells(1, last_column_m)).Value = r.Value
                                Record_column_array(i) = last_column_m
                                Record_column_status_array(i) = 0 'non Repeated Column
                                i = i + 1
                            End If
                        End With
                    End If
                Next
                
                'Record all the Value
                x = 2
                Do
                Course_Number = .Range("A" & x).Value
                For ii = 1 To i - 1
                    Children_file_value(ii) = .Range(.Cells(x, ii + 1), .Cells(x, ii + 1)).Value
                Next
                With Workbooks(Main_workbook).Sheets(ActiveSheet.name)
                    Set c = .Range("A:A").Find(Course_Number, LookIn:=xlValues, lookat:=xlWhole)
                    If Not c Is Nothing Then
                        .Range("B" & c.Row).Value = CurrFileDate
                        For ii = 1 To i - 1
                            If Record_column_status_array(ii) = 1 Then
                                If CurrFileDate >= .Range("B" & c.Row).Value Then
                                    .Range(.Cells(c.Row, Record_column_array(ii)), .Cells(c.Row, Record_column_array(ii))).Value = Children_file_value(ii)
                                End If
                            Else
                                .Range(.Cells(c.Row, Record_column_array(ii)), .Cells(c.Row, Record_column_array(ii))).Value = Children_file_value(ii)
                            End If
                        Next
                    Else
                        last_row_m = .Range("A" & Rows.count).End(xlUp).Row + 1
                        .Range("A" & last_row_m).Value = Course_Number
                        .Range("B" & last_row_m).Value = CurrFileDate
                        For ii = 1 To i - 1
                            .Range(.Cells(last_row_m, Record_column_array(ii)), .Cells(last_row_m, Record_column_array(ii))).Value = Children_file_value(ii)
                        Next
                    End If
                End With
                x = x + 1
                Loop While .Range("A" & x).Value <> ""
            End If
        End With
        
        Workbooks(FileName).Close
        FileName = Dir()
    Loop
    
    
End Sub
 
Upvote 0
Thank you again for figuring this out. I tried the script prior to the correction, and got an open do loop error. However after the correction I received no errors. But nothing happened either. I will continue to play with it and see if I can get it to work. However my problem is solved. I tired Active Data (from the same people that run this forum) and it worked perfectly in merging the spreadsheets (albeit two at a time).

This was surprising because the support person told me the program wasn't capable of doing what I want. Yet it did. Never had that happen before!! Its usually the opposite where the vendor's claim that the software works is a bit too optimistic.

Since you put a lot of effort into this script though I will try to figure out why it doesn't work and see if I can fix the problem.
 
Upvote 0

Forum statistics

Threads
1,214,583
Messages
6,120,380
Members
448,955
Latest member
BatCoder

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top