A Different Twist on a Massive Import

Boogamil

New Member
Joined
Aug 13, 2003
Messages
26
Good Afternoon,

I have a huge .lis file that I've saved as a txt file and I would like to import this into excel. Now I've found some code that allows me to import this over multiple sheets, which is perfect. However, everything gets dumped into the first column.

Part of the problem is that the file has row headings every page.

What I would like to do is delete the headings, import this massive beast into excel and have the columns separated appropriately. A space separates the data within the txt file to define what is a new column. The headings are defined as not beginning with a number.

This is the code that I found on the site already for the importation.

Please let me know if I haven't explained my situation properly. Thank you in advance for your time and help.

Regards,

Chris

-------------------------------------------------------------------------

Sub ImportLargeFile()
'Imports text file into Excel workbook using ADO.
'If the number of records exceeds 65536 then it splits it over more than one sheet.

Dim strFilePath As String, strFilename As String, strFullPath As String
Dim lngCounter As Long
Dim oConn As Object, oRS As Object, oFSObj As Object

'Get a text file name
strFullPath = Application.GetOpenFilename("Text Files (*.txt),*.txt", , "Please selec text file...")

If strFullPath = "False" Then Exit Sub 'User pressed Cancel on the open file dialog

'This gives us a full path name e.g. C:tempfolderfile.txt
'We need to split this into path and file name
Set oFSObj = CreateObject("SCRIPTING.FILESYSTEMOBJECT")

strFilePath = oFSObj.GetFile(strFullPath).ParentFolder.Path
strFilename = oFSObj.GetFile(strFullPath).Name


'Open an ADO connection to the folder specified
Set oConn = CreateObject("ADODB.CONNECTION")
oConn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strFilePath & ";" & _
"Extended Properties=""text;HDR=Yes;FMT=Delimited"""

Set oRS = CreateObject("ADODB.RECORDSET")

'Now actually open the text file and import into Excel
oRS.Open "SELECT * FROM " & strFilename, oConn, 3, 1, 1
While Not oRS.EOF
Sheets.Add
ActiveSheet.Range("A1").CopyFromRecordset oRS, 65536
Wend

oRS.Close
oConn.Close

End Sub
 

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
Here is code that you can use... the difference is that it reads in the lines one at a time from your original file. It will delete all headers (including the first header) and any blank lines. Also, you do not need to rename your .lis file to .txt.

Code:
Public Sub import_file()
On Error GoTo import_file_err

    Dim fso As Variant, f As Variant, ts As Variant, arVar As Variant
    Dim intSheet As Integer
    Dim lngRow As Long
    Dim fname As String, s As String
        
    Set fso = CreateObject("Scripting.FileSystemObject")
    
    'Get lis file
    fname = Application.GetOpenFilename("LIS Files (*.lis), *.lis", , "SELECT FILE")
    If fname = "False" Then GoTo normal_exit
      
    Set f = fso.GetFile(fname)
    Set ts = f.OpenAsTextStream(1, 0)
    
    Application.ScreenUpdating = False
    intSheet = 1
    lngRow = 1
    
    Do
        'get next line of data from lis file
        If Not ts.AtEndOfStream Then s = ts.readline
        
        'fields seperated by space
        arVar = Split(s, " ", -1, vbTextCompare)
        
        'if line has data paste data to row in excel
        If UBound(arVar) > 0 Then
            'do not add headers
            If IsNumeric(Left(arVar(0), 1)) Then
                Worksheets(intSheet).Range(Cells(lngRow, 1), Cells(lngRow, UBound(arVar) + 1)).Value = arVar
                lngRow = lngRow + 1
            
                'if hit last row in excel start on first row of next sheet
                If lngRow > 65536 Then
                    lngRow = 1
                    intSheet = intSheet + 1
                    Worksheets(intSheet).Activate
                End If
            End If
        End If
    Loop While ts.AtEndOfStream <> True
        
normal_exit:
On Error Resume Next
    ts.Close
    Application.ScreenUpdating = True
    
    'activate first worksheet and select A1
    Worksheets(1).Activate
    Cells(1, 1).Select
    Exit Sub

import_file_err:
    MsgBox Error$
    Resume normal_exit
    
End Sub
 
Upvote 0
wongm003,

Thank you very much for the response. I've attempting to run the code and it just hangs. Using the previous code, which didn't work exactly for my scenario, it took about 7 minutes to run. When I ran the code with your changes, I let it run for an hour and no results.

It asks me for the file name, but then states that it is importing the first row and this is all.

Any help would be appreciated.

BTW; I thought I had to rename a .lis file as a .txt file, but thank you for the clarification it will save me some time.
 
Upvote 0
Chris,

About how many rows are you talking about??? The previous code used ADO to read in the external file (usually faster)... my code is reading in the file line by line so there will be a difference in response time. Reading in line by line allows you to filter out lines you don't want (header) but does increase response time and depending on the size/os/machine could be a noticeable difference.

When I told you that you could use the .lis file it was because I was using the file system object to open the .lis file as a text stream. Using the .lis with ADO requires you to have it set in the registry.

Have you thought about using Access to import your file to?? It would be easier to generate reports/query.

That being said... here is new code that uses the file system object to read in the header row (assuming header is on the first line) and then using ADO to get the rest of the data.

*** NOTE to use ADO with the lis file this code will modify the registry and change it back when completed. ***

Code:
Public Sub import_file()
'======================================================
' Author: wongm003
' Date: 10/16/06
' Use: use to import space delimited file that contain
'      numeric values.
' NOTE: THIS CODE WILL CHANGE REGISTRY VALUES NOT RESPONSIBLE
'       FOR ANY MISUSE OR DAMAGE.
'======================================================
On Error GoTo import_file_err

    Dim varReg As Variant, varFSO As Variant
    Dim varFile As Variant, varTS As Variant
    Dim varConn As Variant, varRS As Variant
    Dim intSheet As Integer, intWS As Integer
    Dim strOrigDisExt As String, strOrigFormat As String, strValue As String
    Dim strFullPath As String, strFilePath As String, strFileName As String
    
    'turn off screen updating
    Application.ScreenUpdating = False
    
    'Default Reg Values                usual values
    strOrigDisExt = ""                '!txt,csv,tab,asc,tmp,htm,html
    strOrigFormat = ""                'CSVDelimited
         
    'setup registry variables
    Const HKEY_LOCAL_MACHINE = &H80000002
    Const strComputer As String = "."
    Const strKeyPath As String = "SOFTWARE\Microsoft\Jet\4.0\Engines\Text"
    Set varReg = GetObject("winmgmts:\\" & strComputer & _
        "\root\default:StdRegProv")

    'get original registry values
    varReg.GetStringValue HKEY_LOCAL_MACHINE, strKeyPath, "DisabledExtensions", strOrigDisExt
    varReg.GetStringValue HKEY_LOCAL_MACHINE, strKeyPath, "Format", strOrigFormat

    'Get file
    strFullPath = Application.GetOpenFilename("Import Files (*.lis;*.txt), *.lis;*.txt", , "SELECT FILE")
    If strFullPath = "False" Then GoTo normal_exit

    'open file to get field names... note field names need to be on first row
    Set varFSO = CreateObject("Scripting.FileSystemObject")
    Set varFile = varFSO.GetFile(strFullPath)
    Set varTS = varFile.OpenAsTextStream(1, 0)
    If Not varTS.AtEndOfStream Then
        s = varTS.readline
        varTS.Close
    Else
        GoTo normal_exit
    End If

    'get field names
    arFlds = Split(s, " ", -1, vbTextCompare)

    'change registry values
    'add .lis to disabled extensions
    If Not strOrigDisExt Like "*,lis*" Then
        strValue = strOrigDisExt & ",lis"
        varReg.SetStringValue HKEY_LOCAL_MACHINE, strKeyPath, "DisabledExtensions", strValue
    End If

    'change format to space delimited
    strValue = "Delimited( )"
    varReg.SetStringValue HKEY_LOCAL_MACHINE, strKeyPath, "Format", strValue

    'parse full path
    strFilePath = varFSO.GetFile(strFullPath).ParentFolder.Path
    strFileName = varFSO.GetFile(strFullPath).Name
    
    'Open an ADO connection to the folder specified
    Set varConn = CreateObject("ADODB.Connection")
    Set varRS = CreateObject("ADODB.Recordset")
    
    varConn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & strFilePath & ";" & _
    "Extended Properties=""text;HDR=Yes;FMT=Delimited"""
    
    'Now actually open the text file and import into Excel
    varRS.Open "SELECT * FROM " & strFileName & " WHERE IsNumeric(" & arFlds(0) & ")=True", varConn, 3, 3, &H1

    intSheet = 0

    While Not varRS.EOF
        Sheets.Add
        intSheet = intSheet + 1
        ActiveSheet.Name = "DATA" & intSheet
        If intSheet > 1 Then Sheets("DATA" & intSheet).Move After:=Sheets("DATA" & (intSheet - 1))
        ActiveSheet.Range(Cells(1, 1), Cells(1, UBound(arFlds) + 1)).Value = arFlds
        ActiveSheet.Range("A2").CopyFromRecordset varRS, 65535
    Wend

    If intSheet = 0 Then GoTo normal_exit

    'Delete unused sheets
    Application.DisplayAlerts = False
    For intWS = 1 To ActiveWorkbook.Worksheets.Count
        If Not Sheets(intWS).Name Like "DATA*" Then Sheets(intWS).Delete
    Next intWS
    
normal_exit:
On Error Resume Next
    'restore registry if needed
    If strOrigDisExt <> "" Then varReg.SetStringValue HKEY_LOCAL_MACHINE, strKeyPath, "DisabledExtensions", strOrigDisExt
    If strOrigFormat <> "" Then varReg.SetStringValue HKEY_LOCAL_MACHINE, strKeyPath, "Format", strOrigFormat

    varTS.Close
    varRS.Close
    varConn.Close
    
    Sheets("DATA1").Activate
    Cells(1, 1).Select

    Application.DisplayAlerts = True
    Application.ScreenUpdating = True
    Exit Sub

import_file_err:
    MsgBox Error$
    Resume normal_exit
    
End Sub
 
Upvote 0
Access was my first thought, but it may actually be too large for Access.

:eek:

My intention is to take this file and bcp it into our database, where I can do a proper query through sybase. In the end I'll probably have to see if they can upgrade Access from 2000 to XP, as it allows for double the size for a db. Or see if I can run this from my home pc. I can also let your original code run overnight and see if it finishes by morning.

At any rate thank you for your time, you've given me a lot to think about. I'll let you know if it finishes overnight however.
 
Upvote 0

Forum statistics

Threads
1,213,536
Messages
6,114,213
Members
448,554
Latest member
Gleisner2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top