'IMPORTING FILES WHICH EXCEED EXCEL 2003 LIMITS
'If 2007(xlsx (cvs)) first down load into Word then save as text.
Option Explicit
Option Compare Text
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' modImportBigFiles
' This module contains the ImportBigTextFile procedure and the IsFileOpen procedure.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub ImportBigTextFile()
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' ImportBigTextFiles
' By Chip Pearson, chip@cpearson.com, www.cpearson.com
' This procedure is on http://www.cpearson.com/excel/ImportBigFiles.htm
' This procedure requires Excel 2000 or later.
'
' An Excel worksheet (2003 and earlier) is limited to 65,536 rows. You cannot increase this
' limit. Therefore, if you use Excel's tools to import a text or csv file with more
' than 65,536 rows of data, Excel will only import the first 65,536 records (or fewer,
' depending on the row in which you start the import). This procedure may be used to import a text
' or CSV file of any number of rows. It will begin the import by placing the imported data on the sheet
' named in C_START_SHEET_NAME starting in row C_START_ROW_FIRST_PAGE.
'
' Note that the import process always writes records to the ActiveWorkbook, which may or may not
' be the same workbook that contains this code. Ensure that the proper workbook is active
' prior to running the import process.
'
' RowNdx is the current row number where the input record will be placed. When any one or more
' of the following are true:
' RowNdx > Rows.Count
' RowNdx > LastRowForInput
' RowsThisSheet > MaxRowsPerSheet
' the procedure will create a new worksheet immediately following the current worksheet and
' name it
' C_SHEET_NAME_PREFIX & Format(SheetNumber, "0")
'
' The procedure will optionally split the input data into multiple columns, delimited by a
' character specified in the SplitChar variable.
'
' The following constants and variable determine how the procedure handles data and new worksheets:
'
' C_START_ROW_FIRST_PAGE is the row number on the first worksheet (C_START_SHEET_NAME)
' where the data should be placed. This is used if you have
' header rows that you want to preserve.
'
' C_START_ROW_LATER_PAGES is the row number on subsequent (sheet 2, 3, etc) that the
' data should be started.
'
' C_START_SHEET_NAME is the name of an existing worksheet. The imported data will
' start by filling this sheet before creating a new worksheet.
'
' C_START_COLUMN is the column in which the data is placed. If SplitChar is
' vbNullString, the entire line of input data will be placed
' in this column. If SplitChar is not vbNullString, the input
' data is split into an array using SplitChar as a delimiter,
' and each data element will be put in its own column, starting
' with C_START_COLUMN.
'
' C_TEMPLATE_SHEET_NAME is the name of an existing worksheet that is used as the
' template for each new sheet created by the function. This
' must NOT be the same as C_START_SHEET_NAME, since copying
' C_START_SHEET_NAME would copy all of the imported (so far)
' data to the new sheet. If you don't want to use a template
' sheet, set this constant to vbNullString.
'
' C_UPDATE_STATUSBAR_EVERY_N_RECORDS
' is the number of records after which the a message will
' be displayed in the StatusBar. Set this value to <= 0 if you
' do not want any StatusBar updates. A typical value for this
' constant is 1000.
'
' C_STATUSBAR_TEXT is the text to be displayed in the StatusBar. The current record
' count will be appnded to this text. This value is not used
' if C_UPDATE_STATUSBAR_EVERY_N_RECORDS is <= 0. The string
' value in this constant should include a trailing space. A
' typical value for this constant is "Processing Record: "
'
' C_SHEET_NAME_PREFIX is a string value which will be used for naming newly created
' worksheets. This constant must have <= 28 characters, which
' leaves the procedure three characters for the numeric suffix.
' This allows for importing 65,404,928 rows. If there are more
' rows, the proceudre will import them, but the worksheet names
' may not be named properly. They will, however, be in the proper
' sequence.
'
' LastRowForInput is the last row number on a worksheet that the imported data
' should be placed. Set this to either <= 0 or to Rows.Count
' to import data to the last row of the worksheet.
'
' MaxRowsPerSheet is the maximum number of rows to import on to a single worksheet.
' Set this to <= 0 or Rows.Count to import to the last row
' of a worksheet. While the procedure can properely handle values
' in both LastRowForInput and MaxRowsPerSheet, you will most likely
' set both to <=0 or set one to <= 0 and the other to its appropriate
' value. Typically, both variables will not have non-zero values.
'
' SplitChar is the character which delimits the data fields in the input data.
' If this character is vbNullString, the entire input line of data
' will be placed in C_START_COLUMN. If SplitChar is not vbNullString,
' it is truncated to 1 character (the left-most character) and then used
' in Split function to split the input data into an array, delimited by SplitChar.
' Each element of the array is put into its own column, starting with
' C_START_COLUMN and proceeding to the right. SplitChar is typically set
' to a comma, semicolon, or vbTab, but it can be any character.
'
' The procedure saves the following Application properties, then turns them off. It will restore the saved
' values at the end of the procedure.
' Calculation
' ScreenUpdating
' DisplayAlerts
' EnableEvents
'
'-------------------------
' Possible Data Loss
'-------------------------
' It is possible that some data loss may occur if the function is splitting the input data into multiple columns.
' If the number of data elements in the input record plus the value of StartColumn is greater than the number
' of Columns in the worksheet, data elements that would go past the right-most column of the worksheet are not imported.
' The number of these truncated records is stored in the TruncatedCount variable and is displayed at the end
' of the procedure.
'
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' data starts on this row on the first sheet
Const C_START_ROW_FIRST_PAGE = 1
' data starts on this row for all subsequent sheets
Const C_START_ROW_LATER_PAGES = 2
' worksheet name where data should start. This sheet must exist.
Const C_START_SHEET_NAME = "Sheet1"
' what column do we start placing the data
Const C_START_COLUMN = 1
' newly created worksheets will be named C_SHEET_NAME_PREFIX & Format(SheetNum,"0")
Const C_SHEET_NAME_PREFIX = "DataImport"
' newly created worksheets will be based on this template sheet. set to vbNullString if
' you don't want to use a template sheet and use a blank sheet instead.
Const C_TEMPLATE_SHEET_NAME = vbNullString
' update the Application.StatusBar every C_UPDATE_STATUSBAR_EVERY_N_RECORDS records.
' set this to 0 if you don't want status bar messages.
Const C_UPDATE_STATUSBAR_EVERY_N_RECORDS = 1000
' this is the message to be displayed in the status bar. The number of records
' read so far will be appended to this value.
Const C_STATUSBAR_TEXT = "Processing Record: "
Dim RowNdx As Long ' Current RowNumber
Dim Colndx As Long ' Current Column
Dim FName As Variant ' Input file name
Dim FNum As Integer ' Filenumber returned by FreeFile
Dim WS As Worksheet ' Worksheet on which the data should be placed
Dim InputLine As String ' The entire line of text read from the input file
Dim Arr As Variant ' Used with Split to break InputLine into an array,
' delimited by SplitChart
Dim SplitChar As String ' The character used by Split. This character delimits
' the input data fields in InputLine. Typically, this
' character will be a comma, semicolon, or vbTab.
' If this character is vbNullString, the input data
' won't be split, and the entire InputLine will be
' put in column C_START_COLUMN
Dim SheetNumber As Long ' Increments for each worksheet we populate with data
Dim SaveCalc As XlCalculation ' Caller's Calculation mode.
Dim SaveScreenUpdating As Boolean ' Caller's ScreenUpdating mode
Dim SaveDisplayAlerts As Boolean ' Caller's DisplayAlerts property
Dim SaveEnableEvents As Boolean ' Caller's EnableEvents property
Dim InputCounter As Long ' Counter of all records imported
Dim LastRowForInput As Long ' Indicate the last row on the worksheet than
' input data should be used. Set this to a value <= Rows.Count.
Dim MaxRowsPerSheet As Long ' The maximumn number of rows to import on each sheet.
' Set this to <= 0 if you don't want to use this parameter.
Dim RowsThisSheet As Long ' Keeps track of the rows imported on to the current sheet.
Dim TruncatedCount As Long ' Counts the number of records whose input was truncated because
' it would have gone past the last column of the worksheet.
SheetNumber = 1
'''''''''''''''''''''''''''''''''''''''''''''
' Ensure we have an active workbook.
'''''''''''''''''''''''''''''''''''''''''''''
If Application.ActiveWorkbook Is Nothing Then
MsgBox "There is no active workbook."
Exit Sub
End If
''''''''''''''''''''''''''''''''''''''''''''''
' Set what character we're going to
' use to split apart the input line. The
' input line will be SPLIT on this character,
' and each text field will go into its own
' column. Data will be separated by
' SplitChar into multiple columns, starting
' on column C_START_COLUMN. If SplitChar is
' vbNullString, the entire input line is
' placed in C_START_COLUMN. SplitChar must
' be a single character. It is typically
' a comma, semicolon, or vbTab, but it can
' be any character. If SplitChar is set to
' more than one character, it is truncated
' to a single (the left-most) character.
' When placing data elements in separate
' columns, it is possible that the number of
' imported elements would extend past the
' last column of the worksheet. The count
' of records whose input was truncated because
' it would have gone past the last column
' of the worksheet is stored in the TruncatedCount
' variable. The value of this variable is
' displayed at the end of the procedure.
'
''''''''''''''''''''''''''''''''''''''''''''''
SplitChar = ","
''''''''''''''''''''''''''''''''''''''''''''''
' Set the maximum number of data input rows
' to place on each worksheet. Set this
' value to <= 0 or to Rows.Count to fill
' the entire worksheet.
''''''''''''''''''''''''''''''''''''''''''''''
MaxRowsPerSheet = Rows.Count
'''''''''''''''''''''''''''''''''''''''''''''
' Set the LastRowForInput value. This when
' this row number is reached, a new worksheet
' will be created. Set this to 0 or
' WS.Rows.Count to fill the entire worksheet.
'''''''''''''''''''''''''''''''''''''''''''''
LastRowForInput = ActiveWorkbook.Worksheets(1).Rows.Count
''''''''''''''''''''''''''''''''''''''''''''''''''
' Ensure that C_SHEET_NAME_PREFIX is <= 29
' characters. This leaves us two characters for
' the numeric suffix, or 99 added sheets. If
' more sheets are needed, they will be created,
' and the data will be imported, but the sheets
' will have the default Excel-generated name, not
' the C_SHEET_NAME_PREFIX name. They will be in
' the correct order.
''''''''''''''''''''''''''''''''''''''''''''''''''
If (Len(C_SHEET_NAME_PREFIX) < 1) Or (Len(C_SHEET_NAME_PREFIX) > 29) Then
MsgBox "The value of C_SHEET_NAME_PREFIX must have between 1 and 29 characters." & vbCrLf & _
"The current length of C_SHEET_NAME_PREFIX is " & CStr(Len(C_SHEET_NAME_PREFIX)) & " characters."
Exit Sub
End If
'''''''''''''''''''''''''''''''''''''''''''''''''''
' Ensure C_START_SHEET_NAME refers to an existing
' sheet.
'''''''''''''''''''''''''''''''''''''''''''''''''''
On Error Resume Next
Err.Clear
Set WS = ActiveWorkbook.Worksheets(C_START_SHEET_NAME)
If Err.Number <> 0 Then
MsgBox "The sheet named in C_START_SHEET_NAME (" & C_START_SHEET_NAME & ") does not exist" & vbCrLf & _
"or is not a worksheet (e.g., it is a chart sheet).", vbOKOnly
Exit Sub
End If
''''''''''''''''''''''''''''''''''''''''''''''''''
' Ensure that if C_TEMPLATE_SHEET_NAME is not
' vbNullString, it names an existing sheet, and
' that it is not equal to C_START_SHEET_NAME.
''''''''''''''''''''''''''''''''''''''''''''''''''
On Error Resume Next
Err.Clear
If C_TEMPLATE_SHEET_NAME <> vbNullString Then
Set WS = ActiveWorkbook.Worksheets(C_TEMPLATE_SHEET_NAME)
If Err.Number <> 0 Then
MsgBox "The template sheet '" & C_TEMPLATE_SHEET_NAME & "' does not exist or is not a worksheet."
Exit Sub
End If
''''''''''''''''''''''''''''''''''''''''''''''''''
' Ensure that C_START_SHEET_NAME does not equal
' C_TEMPLATE_SHEET_NAME
''''''''''''''''''''''''''''''''''''''''''''''''''
If C_TEMPLATE_SHEET_NAME = C_START_SHEET_NAME Then
MsgBox "The C_TEMPLATE_SHEET_NAME is equal to the C_START_SHEET_NAME." & vbCrLf & _
"This is not allowed."
Exit Sub
End If
End If
On Error GoTo 0
''''''''''''''''''''''''''''''''''''''''''''''''''''
' We may have changed the worksheet referenced by WS
' when testing if C_TEMPLATE_SHEET_NAME exists. Reset
' WS back to C_START_SHEET_NAME.
''''''''''''''''''''''''''''''''''''''''''''''''''''
Set WS = ActiveWorkbook.Worksheets(C_START_SHEET_NAME)
''''''''''''''''''''''''''''''''''''''''''''''''''
' Ensure that none of the following are protected:
' C_START_SHEET_NAME
' C_TEMPLATE_SHEET_NAME
' ActiveWorkbook
''''''''''''''''''''''''''''''''''''''''''''''''''
If WS.ProtectContents = True Then
MsgBox "The worksheet '" & WS.Name & "' is protected."
Exit Sub
End If
If C_TEMPLATE_SHEET_NAME <> vbNullString Then
If ActiveWorkbook.Worksheets(C_TEMPLATE_SHEET_NAME).ProtectContents = True Then
MsgBox "The Template Sheet (" & C_TEMPLATE_SHEET_NAME & ") is protected."
Exit Sub
End If
End If
If ActiveWorkbook.ProtectStructure = True Then
MsgBox "The ActiveWorkbook is protected."
Exit Sub
End If
''''''''''''''''''''''''''''''''''''''''''''''
' Prompt the user for a TXT or CSV file
''''''''''''''''''''''''''''''''''''''''''''''
FName = Application.GetOpenFilename(FileFilter:="Text Files (*.txt),*.txt," & _
"CSV Files (*.csv),*.csv")
If FName = False Then
' user clicked CANCEL. get out now.
Exit Sub
End If
''''''''''''''''''''''''''''''''''''''''''''''
' Set our starting destination worksheet.
' Error and exit if sheet does not exist
''''''''''''''''''''''''''''''''''''''''''''''
On Error Resume Next
Set WS = ActiveWorkbook.Worksheets(C_START_SHEET_NAME)
If WS Is Nothing Then
MsgBox "The worksheet specified in C_START_SHEET_NAME (" & _
C_START_SHEET_NAME & ") does not exist."
Exit Sub
End If
On Error GoTo 0
'''''''''''''''''''''''''''''''''''''''''''''
' See if the file FName is open by another
' process. If it is, exit the procedure.
'''''''''''''''''''''''''''''''''''''''''''''
'If IsFileOpen(Filename:=CVar(FName)) = True Then
' MsgBox "The file '" & FName & "' is open by another process."
' Exit Sub
'End If
''''''''''''''''''''''''''''''''''''''''''''''
' Save the calculation mode and the ScreenUpdating
' mode. Set calculation to manual and turn off
' ScreenUpdating. This will greatly improve
' the performance of the code.
''''''''''''''''''''''''''''''''''''''''''''''
SaveCalc = Application.Calculation
SaveDisplayAlerts = Application.DisplayAlerts
SaveScreenUpdating = Application.ScreenUpdating
SaveEnableEvents = Application.EnableEvents
Application.DisplayAlerts = False
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Application.EnableEvents = False
''''''''''''''''''''''''''''''''''''''''''''''
' Get a file number and open the file
''''''''''''''''''''''''''''''''''''''''''''''
On Error Resume Next
FNum = FreeFile
Err.Clear
Open FName For Input Access Read As #FNum
If Err.Number <> 0 Then
'''''''''''''''''''''''''''''''''''''''''
' If an error occurred, alert the user,
' restore application settings, and
' exit the procedure.
'''''''''''''''''''''''''''''''''''''''''
MsgBox "An error occurred opening file '" & FName & "'." & vbCrLf & _
"Error Number: " & CStr(Err.Number) & vbCrLf & _
"Description: " & Err.Description
Close #FNum
Application.Calculation = SaveCalc
Application.ScreenUpdating = SaveScreenUpdating
Application.DisplayAlerts = SaveDisplayAlerts
Application.EnableEvents = SaveEnableEvents
Exit Sub
End If
On Error GoTo 0
''''''''''''''''''''''''''''''''''''''''''''''
' Initialize the RowNd index variable to
' C_START_ROW_FIRST_PAGE. This constant
' is used to preserve any header rows that
' may be present.
''''''''''''''''''''''''''''''''''''''''''''''
RowNdx = C_START_ROW_FIRST_PAGE
''''''''''''''''''''''''''''''''''''''''''''''
' Ensure that SplitChar is a single character.
''''''''''''''''''''''''''''''''''''''''''''''
If SplitChar <> vbNullString Then
SplitChar = Left(SplitChar, 1)
End If
''''''''''''''''''''''''''''''''''''''''''''''
' If LastRowForInput is <= 0, then set it
' to Rows.Count
''''''''''''''''''''''''''''''''''''''''''''''
If LastRowForInput <= 0 Then
LastRowForInput = WS.Rows.Count
End If
''''''''''''''''''''''''''''''''''''''''
' If MaxRowsPerSheet is <= 0, use Rows.Count
''''''''''''''''''''''''''''''''''''''''
If MaxRowsPerSheet <= 0 Then
MaxRowsPerSheet = Rows.Count
End If
''''''''''''''''''''''''''''''''''''''''''''''
' Loop until we hit the end of the file.
''''''''''''''''''''''''''''''''''''''''''''''
On Error GoTo 0
Do Until EOF(FNum)
''''''''''''''''''''''''''''''''''''''''''''''
' Get the next line of data from the file
''''''''''''''''''''''''''''''''''''''''''''''
Line Input #FNum, InputLine
''''''''''''''''''''''''''''''''''''''''''
' Increment counters.
''''''''''''''''''''''''''''''''''''''''''
InputCounter = InputCounter + 1
RowsThisSheet = RowsThisSheet + 1
''''''''''''''''''''''''''''''''''''''''''
' Determine whether to update the StatusBar.
''''''''''''''''''''''''''''''''''''''''''
If C_UPDATE_STATUSBAR_EVERY_N_RECORDS > 0 Then
If InputCounter Mod C_UPDATE_STATUSBAR_EVERY_N_RECORDS = 0 Then
Application.StatusBar = C_STATUSBAR_TEXT & _
Format(InputCounter, "#,##0")
End If
End If
If SplitChar = vbNullString Then
''''''''''''''''''''''''''''''''''''''
' We're not spliting up the input. Put
' the entire line in column C_START_COLUMN
''''''''''''''''''''''''''''''''''''''
WS.Cells(RowNdx, C_START_COLUMN).Value = InputLine
Else
''''''''''''''''''''''''''''''''''''''''
' SplitChar is not vbNullString.
' We're spliting up the input into columns.
' Use Split to get an array of the items
' in InputLine, delimited by SplitChar,
' and then loop through the Arr array, putting
' each element in its own column
''''''''''''''''''''''''''''''''''''''''
Arr = Split(expression:=InputLine, delimiter:=SplitChar, limit:=-1, compare:=vbTextCompare)
For Colndx = LBound(Arr) To UBound(Arr)
'''''''''''''''''''''''''''''''''''''''''''''''''''''
' Ensure we don't try to write past the last column
' of the worksheet. If we reach the last column,
' exit out of the For loop.
'''''''''''''''''''''''''''''''''''''''''''''''''''''
If Colndx + C_START_COLUMN <= WS.Columns.Count Then
WS.Cells(RowNdx, Colndx + C_START_COLUMN).Value = Arr(Colndx)
Else
TruncatedCount = TruncatedCount + 1
Exit For
End If
Next Colndx
End If ' SplitChar = vbNullString
'''''''''''''''''''''''''''''''''''''''
' Increment the RowNdx index variable.
' If it is greater than either of the following:
' Rows.Count
' LastRowForInput
' or if RowsThisSheet is > MaxRowsPerSheet
' then create and name a new worksheet and
' reset the RowNdx index variable to
' C_START_ROW_LATER_PAGES.
''''''''''''''''''''''''''''''''''''''''
RowNdx = RowNdx + 1
If (RowNdx > Rows.Count) Or (RowNdx > LastRowForInput) Or (RowsThisSheet > MaxRowsPerSheet) Then
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' We're past the end of the worksheet or past the row
' specified in LastRowForInput or the rows used on this
' worksheet is greater than MaxRowsPerSheet.
'
' Increment the SheetNumber index and either create a
' new sheet (if C_TEMPLATE_SHEET_NAME is vbNullString) or
' copy the C_TEMPLATE_SHEET_NAME worksheet
' immediately after the current sheet, and name it
' C_SHEET_NAME_PREFIX & Format(SheetNumber, "0")
' Reset the RowNdx value to C_START_ROW_LATER_PAGE
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''
SheetNumber = SheetNumber + 1
If C_TEMPLATE_SHEET_NAME = vbNullString Then
Set WS = ActiveWorkbook.Worksheets.Add(after:=WS)
Else
ActiveWorkbook.Worksheets(C_TEMPLATE_SHEET_NAME).Copy after:=WS
Set WS = ActiveWorkbook.ActiveSheet
End If
On Error Resume Next
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' ignore the error that might arise if there is already a
' sheet named
' C_SHEET_NAME_PREFIX & Format(SheetNumber, "0")
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
WS.Name = C_SHEET_NAME_PREFIX & Format(SheetNumber, "0")
On Error GoTo 0
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Reset out counters.
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''
RowNdx = C_START_ROW_LATER_PAGES
RowsThisSheet = 0
End If
''''''''''''''''''''''''''''''''''''''''''''''
' end of Read loop
''''''''''''''''''''''''''''''''''''''''''''''
Loop
''''''''''''''''''''''''''''''''''''''''''''''
' Close the input file and restore the saved
' application settings.
''''''''''''''''''''''''''''''''''''''''''''''
Close FNum
Application.Calculation = SaveCalc
Application.ScreenUpdating = SaveScreenUpdating
Application.DisplayAlerts = SaveDisplayAlerts
Application.EnableEvents = SaveEnableEvents
Application.StatusBar = False
''''''''''''''''''''''''''''''''''''''''''''''
' MsgBox to the user indicating we're done.
''''''''''''''''''''''''''''''''''''''''''''''
MsgBox "Import operation from file '" & FName & "' complete." & vbCrLf & _
"Records Imported: " & Format(InputCounter, "#,##0") & vbCrLf & _
"Records Truncated: " & Format(TruncatedCount, "#,##0"), _
vbOKOnly, "Import Text File"
''''''''''''''''''''''
' END OF PROCEDURE
''''''''''''''''''''''
End Sub