600 MB text files to Access - how to do it quickly?

Jaymond Flurrie

Well-known Member
Joined
Sep 22, 2008
Messages
921
Office Version
  1. 365
Platform
  1. Windows
So the idea is that I have now split up my 6GB flatfile into ten smaller ones, they're just text files with one ID number, four 1-10 numbers and 2400 0/1s separated by tabs there. As an example data (using only four of those 2400):

ID One Two Three A B C D
1 4 5 6 0 1 0 1
12 5 6 7 1 1 0 1

I would need to get these to Access as quickly as possible and I already have VBA code to create tables with Integer-Integer-Integer-Integer-BIT-BIT-BIT-BIT fields, but what's the smartest way to get the data there?

Let's assume one of my 600MB files is in C:\fromhere1.txt

Also, would it be faster/better if I would use ACCDB instead of MDB as my databases for this kind of data?
 

Excel Facts

Excel Can Read to You
Customize Quick Access Toolbar. From All Commands, add Speak Cells or Speak Cells on Enter to QAT. Select cells. Press Speak Cells.
Hi,

I don't think it would make a big difference with mdb or accdb formats. Take your pick.

I haven't gotten around to thinking how to do this properly, but I did just create a script to create some test data to play with - a file with 1000 columns as you describe. Perhaps someone else will also find this useful if they'd like to give it a shot at the imports.

I'm not sure that "quickest way possible" is really important - unless you are doing it every day with a new set of test data (?). Is that the case? I'm sure any script will be much faster than manually massaging the data for days.

TEST DATA script:
Code:
Sub MakeTestFile()

Dim FSO As Object
Dim ts As Object
Dim s(999) As String
Dim t As String
Dim intID As Long

    On Error GoTo ErrHandler:
    Set FSO = CreateObject("Scripting.FileSystemObject")
    With FSO
        Set ts = FSO.CreateTextFile("C:\Documents and Settings\USERNAME\Desktop\test_data.txt", True)
        For i = 0 To 999
            s(i) = "F" & i
        Next i
        t = Join(s, vbTab)
        ts.WriteLine t
            
        For j = 0 To 9
            For i = 0 To 999
                If i < 1 Then
                    s(i) = intID + 1
                    intID = intID + 1
                ElseIf i < 4 Then
                    s(i) = Int(Rnd() * 10)
                Else
                    s(i) = IIf(Rnd() > 0.5, 0, 1)
                End If
            Next i
            t = Join(s, vbTab)
            ts.WriteLine t
        Next j
    End With
    ts.Close
    Set ts = Nothing

My_Exit:
If Not ts Is Nothing Then
    ts.Close
    Set ts = Nothing
End If
Set FSO = Nothing
Exit Sub

ErrHandler:
MsgBox "Error: " & Err.Description
Resume My_Exit
End Sub
 
Last edited:
Upvote 0
But this is now only creating the text file, right? I actually have the files already, it's just that I need to import those to Access.

The best way I've figured out so far would be to read a line, use VBA.Split and use "INSERT INTO" against this array of ~2400 elements that I got as the result of the split. Maybe even better would be to create bigger arrays, say, 500x2400 and "INSERT INTO" those 500 rows at a time there.

But is this the smartest way?
 
Upvote 0
Have you found that there is no 255 column limit in Access now? I though that was the real problem.

BTW the test data was not for your benefit, since you have test data already. It is for the benefit of others who might like to try to work this out. I should have thought that was fairly obvious.

ξ
 
Upvote 0
Have you found that there is no 255 column limit in Access now? I though that was the real problem.

BTW the test data was not for your benefit, since you have test data already. It is for the benefit of others who might like to try to work this out. I should have thought that was fairly obvious.

ξ

I'll go around the 255 column limit by using several tables.
 
Upvote 0
Well, this is a poor attempt but I gave it a shot. My thought was to create 9 files that each would be uploaded to a table the "normal" way - through the Access import menu. In short, the first file has the first 255 columns, the second has the next 255 columns, and so on. But I assumed the ID's and the numbers in columns 2-4 were keys of some sort, so they are found in each file (hence, in each table).

It does work on my test data, but I didn't bother with 6,000,000 rows - too long to wait. I reckon it would be something like an hour or two to parse out all 9 files of 6,000,000 rows each. I'd test everything before going all the way on this one.

Workbook is downloadable here, or use the code below (it uses a filepath entered in a textbox on a worksheet - easy enough to resolve if you don't want to download the workbook).

I am absolutely sure there's better ways. Though it seems the obvious solution is to use a database engine that supports more than 255 columns - it would be as simple as loading the data from the text file, in that case. Plus the whole thing is screaming for a command line script to parse the file more efficiently but I'm sorely lacking there ... {sigh} ... must get some more fundamentals.

WORKBOOK (ZIP DOWNLOAD)



CODE FOR TEST DATA:
Code:
[COLOR="Navy"]Sub[/COLOR] MakeTestFile1()
[COLOR="SeaGreen"]'//One only needs this macro if test data is required[/COLOR]
[COLOR="Navy"]Dim[/COLOR] FSO [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Object[/COLOR]
[COLOR="Navy"]Dim[/COLOR] ts [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Object[/COLOR]
[COLOR="Navy"]Dim[/COLOR] s(2399) [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] t [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] intID [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] i [COLOR="Navy"]As[/COLOR] Long, j [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] sFilePath [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]

    [COLOR="Navy"]On[/COLOR] [COLOR="Navy"]Error[/COLOR] [COLOR="Navy"]GoTo[/COLOR] ErrHandler:
    [COLOR="Navy"]Set[/COLOR] FSO = CreateObject("Scripting.FileSystemObject")
    n = 1000 [COLOR="SeaGreen"]'//number of fows to write[/COLOR]
    
    [COLOR="Navy"]With[/COLOR] FSO
        
        sFilePath = Sheet1.TextBox1.Value
        [COLOR="Navy"]Set[/COLOR] ts = FSO.CreateTextFile(sFilePath, True)
        
        [COLOR="Navy"]For[/COLOR] i = 0 [COLOR="Navy"]To[/COLOR] 2399
            s(i) = "F" & i
        [COLOR="Navy"]Next[/COLOR] i
        t = Join(s, vbTab)
        ts.WriteLine t
            
        [COLOR="Navy"]For[/COLOR] j = 0 [COLOR="Navy"]To[/COLOR] n
            [COLOR="Navy"]If[/COLOR] j [COLOR="Navy"]Mod[/COLOR] 100 = 0 [COLOR="Navy"]Then[/COLOR]
                Sheet1.Range("B20").Value = "Writing rows ... " & Format(j, "#,##0")
                DoEvents
            [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
            [COLOR="Navy"]For[/COLOR] i = 0 [COLOR="Navy"]To[/COLOR] 2399
                [COLOR="Navy"]If[/COLOR] i < 1 [COLOR="Navy"]Then[/COLOR]
                    s(i) = intID + 1
                    intID = intID + 1
                [COLOR="Navy"]ElseIf[/COLOR] i < 4 [COLOR="Navy"]Then[/COLOR]
                    s(i) = Int(Rnd() * 10)
                [COLOR="Navy"]Else[/COLOR]
                    s(i) = IIf(Rnd() > 0.5, 0, 1)
                [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
            [COLOR="Navy"]Next[/COLOR] i
            t = Join(s, vbTab)
            ts.WriteLine t
        [COLOR="Navy"]Next[/COLOR] j
    
    [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]With[/COLOR]
    ts.Close
    [COLOR="Navy"]Set[/COLOR] ts = [COLOR="Navy"]Nothing[/COLOR]

My_Exit:
[COLOR="Navy"]If[/COLOR] [COLOR="Navy"]Not[/COLOR] ts [COLOR="Navy"]Is[/COLOR] [COLOR="Navy"]Nothing[/COLOR] [COLOR="Navy"]Then[/COLOR]
    ts.Close
    [COLOR="Navy"]Set[/COLOR] ts = [COLOR="Navy"]Nothing[/COLOR]
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
[COLOR="Navy"]Set[/COLOR] FSO = [COLOR="Navy"]Nothing[/COLOR]
Sheet1.Range("B20").Value = ""


[COLOR="Navy"]Exit[/COLOR] [COLOR="Navy"]Sub[/COLOR]

ErrHandler:
MsgBox "Error: " & Err.Description
[COLOR="Navy"]Resume[/COLOR] My_Exit
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]

CODE TO CREATE 9 FILES of 255 COLUMNS:
Code:
[COLOR="Navy"]Sub[/COLOR] SplitFile1()

[COLOR="Navy"]Dim[/COLOR] FSO [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Object[/COLOR]
[COLOR="Navy"]Dim[/COLOR] ts(1) [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Object[/COLOR]
[COLOR="Navy"]Dim[/COLOR] a() [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] s() [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] intID [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] hdrs() [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] i [COLOR="Navy"]As[/COLOR] Long, j [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] iTemp [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] intCount [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] intLineCount [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] f [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Object[/COLOR]
[COLOR="Navy"]Dim[/COLOR] sFilePath [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
    
    
    [COLOR="Navy"]On[/COLOR] [COLOR="Navy"]Error[/COLOR] [COLOR="Navy"]GoTo[/COLOR] ErrHandler:
    [COLOR="Navy"]Set[/COLOR] FSO = CreateObject("Scripting.FileSystemObject")
    sFilePath = Sheet1.TextBox1.Value
    
    [COLOR="Navy"]With[/COLOR] FSO
        
        [COLOR="SeaGreen"]'//Clean up previous runs files ... (may be deleted after testing)[/COLOR]
        [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] f [COLOR="Navy"]In[/COLOR] .GetFolder(.GetFile(sFilePath).ParentFolder).Files
            [COLOR="Navy"]If[/COLOR] Left(f.Name, 7) = "Upload_" [COLOR="Navy"]Then[/COLOR]
                f.Delete
            [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
        [COLOR="Navy"]Next[/COLOR] f
        
        n = 2399 [COLOR="SeaGreen"]'//total number of columns[/COLOR]
        j = 4
        [COLOR="Navy"]Do[/COLOR] [COLOR="Navy"]While[/COLOR] j < n
            [COLOR="Navy"]Set[/COLOR] ts(0) = .OpenTextFile(sFilePath)
            [COLOR="Navy"]Set[/COLOR] ts(1) = .CreateTextFile(.GetFile(sFilePath).ParentFolder & "\" & "Upload_" & Format(j, "0000") & ".txt", True)
            intCount = intCount + 1
            [COLOR="Navy"]ReDim[/COLOR] a(0 [COLOR="Navy"]To[/COLOR] LastUBound(intCount, n))
            hdrs = Split(ts(0).ReadLine, vbTab)
            a(0) = hdrs(0)
            a(1) = hdrs(1)
            a(2) = hdrs(2)
            a(3) = hdrs(3)
            i = 4
            iTemp = j
            [COLOR="Navy"]Do[/COLOR] [COLOR="Navy"]While[/COLOR] i <= UBound(a)
                a(i) = hdrs(j)
                i = i + 1
                j = j + 1
            [COLOR="Navy"]Loop[/COLOR]
            ts(1).WriteLine Join(a, vbTab)
            j = iTemp
            
            intLineCount = 1
            [COLOR="Navy"]Do[/COLOR] [COLOR="Navy"]While[/COLOR] ts(0).AtEndOfStream = False
                s = Split(ts(0).ReadLine, vbTab)
                a(0) = s(0)
                a(1) = s(1)
                a(2) = s(2)
                a(3) = s(3)
                i = 4
                j = iTemp
                [COLOR="Navy"]Do[/COLOR] [COLOR="Navy"]While[/COLOR] i <= UBound(a)
                    a(i) = s(j)
                    i = i + 1
                    j = j + 1
                [COLOR="Navy"]Loop[/COLOR]
                ts(1).WriteLine Join(a, vbTab)
                intLineCount = intLineCount + 1
                [COLOR="Navy"]If[/COLOR] intLineCount [COLOR="Navy"]Mod[/COLOR] 100 = 0 [COLOR="Navy"]Then[/COLOR]
                    Sheet1.Range("B20").Value = "Writing Rows ... File " & Format(intCount, "#,##0") & ", Row " & Format(intLineCount, "#,##0")
                    DoEvents
                [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
            [COLOR="Navy"]Loop[/COLOR]
            ts(1).Close
            ts(0).Close
        [COLOR="Navy"]Loop[/COLOR]
        ts(0).Close
        [COLOR="Navy"]Set[/COLOR] ts(0) = [COLOR="Navy"]Nothing[/COLOR]
            
    [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]With[/COLOR]
    
My_Exit:
[COLOR="Navy"]For[/COLOR] i = 0 [COLOR="Navy"]To[/COLOR] 1
    [COLOR="Navy"]If[/COLOR] [COLOR="Navy"]Not[/COLOR] ts(i) [COLOR="Navy"]Is[/COLOR] [COLOR="Navy"]Nothing[/COLOR] [COLOR="Navy"]Then[/COLOR]
        ts(i).Close
        [COLOR="Navy"]Set[/COLOR] ts(i) = [COLOR="Navy"]Nothing[/COLOR]
    [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
[COLOR="Navy"]Next[/COLOR] i
[COLOR="Navy"]Set[/COLOR] FSO = [COLOR="Navy"]Nothing[/COLOR]
Sheet1.Range("B20").Value = ""

[COLOR="Navy"]Exit[/COLOR] [COLOR="Navy"]Sub[/COLOR]

ErrHandler:
MsgBox "Error: " & Err.Description
[COLOR="Navy"]Resume[/COLOR] My_Exit
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]

[COLOR="Navy"]Private[/COLOR] [COLOR="Navy"]Function[/COLOR] LastUBound(ByRef intCount [COLOR="Navy"]As[/COLOR] Long, n [COLOR="Navy"]As[/COLOR] Long)
    [COLOR="Navy"]If[/COLOR] 251 * intCount + 4 > n [COLOR="Navy"]Then[/COLOR]
        LastUBound = n - (251 * (intCount - 1))
    [COLOR="Navy"]Else[/COLOR]
        LastUBound = 254
    [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Function[/COLOR]
 
Last edited:
Upvote 0
I'm more than sure that you're way better with these than I am, so you can be sure that I'll seriously consider your solution over the one I have halfway created here. Thanks a lot for giving it a try!
 
Upvote 0
Well, I adapted my code to run updates from the text file to a database as we read the file - but found it was going to take 30 hours to complete 6,000,000 rows. Oops.

I had an interesting thought along the way - since your goal was to sum the columns, why not sum the columns right from the text file? That led me to this script, which doesn't do anything fancy at all and sums the 2400 columns in 5 minutes with 100,000 lines of test data (so it could handle 6,000,000 rows in less than an hour). I've assumed that the first 4 columns (ID and the three integer columns) don't need to be summed. The results (column name and column sum total) are output to a worksheet.

<a href="http://northernocean.net/etc/mrexcel/20110402_macrowb3.zip">SAMPLE WORKBOOK</a>
md5 hash for zip file: 37b5ec29911b891e476091428c73f4fd

Here's the main script for file processing:
Code:
[COLOR="Navy"]Public[/COLOR] [COLOR="Navy"]Sub[/COLOR] ColumnTotals()

[COLOR="Navy"]Dim[/COLOR] FSO [COLOR="Navy"]As[/COLOR] FileSystemObject
[COLOR="Navy"]Dim[/COLOR] ts(0) [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Object[/COLOR]
[COLOR="Navy"]Dim[/COLOR] a
[COLOR="Navy"]Dim[/COLOR] b() [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] c
[COLOR="Navy"]Dim[/COLOR] intID [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] i [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] intCount [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] sTextFilePath [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]

    [COLOR="Navy"]On[/COLOR] [COLOR="Navy"]Error[/COLOR] [COLOR="Navy"]GoTo[/COLOR] ErrHandler:
    [COLOR="Navy"]Set[/COLOR] FSO = CreateObject("Scripting.FileSystemObject")
    sTextFilePath = Sheet1.TextBox1.Value
    Sheet1.Range("E20").Value = "Start Processing: " & Now
    Sheet1.Range("I20").Value = ""
    Sheet1.Cells(22, 2).Resize(2400, 2).ClearContents
    
    [COLOR="Navy"]With[/COLOR] FSO
        
        n = 2399 [COLOR="SeaGreen"]'//total number of columns (0 based counting)[/COLOR]
        [COLOR="Navy"]Set[/COLOR] ts(0) = .OpenTextFile(sTextFilePath)
        [COLOR="Navy"]ReDim[/COLOR] a(0 [COLOR="Navy"]To[/COLOR] n)
        [COLOR="Navy"]ReDim[/COLOR] b(0 [COLOR="Navy"]To[/COLOR] n)
        a = Split(ts(0).ReadLine, vbTab)
        
        intCount = 1
        [COLOR="Navy"]Do[/COLOR] [COLOR="Navy"]While[/COLOR] ts(0).AtEndOfStream = False
            c = Split(ts(0).ReadLine, vbTab)
            [COLOR="Navy"]For[/COLOR] i = 4 [COLOR="Navy"]To[/COLOR] n [COLOR="SeaGreen"]'//skipping first four columns[/COLOR]
                b(i) = b(i) + c(i)
            [COLOR="Navy"]Next[/COLOR] i
            intCount = intCount + 1
            [COLOR="Navy"]If[/COLOR] intCount [COLOR="Navy"]Mod[/COLOR] 500 = 0 [COLOR="Navy"]Then[/COLOR]
                [COLOR="Navy"]Debug[/COLOR].[COLOR="Navy"]Print[/COLOR] "Processing " & intCount
                Sheet1.Range("B20").Value = "Processing " & intCount
                DoEvents
            [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
        [COLOR="Navy"]Loop[/COLOR]
        ts(0).Close
        [COLOR="Navy"]Set[/COLOR] ts(0) = [COLOR="Navy"]Nothing[/COLOR]
        Sheet1.Cells(22, 2).Resize(2400, 1).Value = WorksheetFunction.Transpose(a)
        Sheet1.Cells(22, 3).Resize(2400, 1).Value = WorksheetFunction.Transpose(b)
    
    [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]With[/COLOR]
    
My_Exit:
[COLOR="Navy"]If[/COLOR] [COLOR="Navy"]Not[/COLOR] ts(0) [COLOR="Navy"]Is[/COLOR] [COLOR="Navy"]Nothing[/COLOR] [COLOR="Navy"]Then[/COLOR]
    ts(0).Close
    [COLOR="Navy"]Set[/COLOR] ts(0) = [COLOR="Navy"]Nothing[/COLOR]
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
[COLOR="Navy"]Set[/COLOR] FSO = [COLOR="Navy"]Nothing[/COLOR]
Sheet1.Range("B20").Value = ""
Sheet1.Range("I20").Value = "End Processing: " & Now

[COLOR="Navy"]Exit[/COLOR] [COLOR="Navy"]Sub[/COLOR]

ErrHandler:
MsgBox "Error: " & Err.Description
[COLOR="Navy"]Resume[/COLOR] My_Exit
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]


This is my script for creating a test file:
Code:
[COLOR="Navy"]Sub[/COLOR] MakeTestFile1()

[COLOR="Navy"]Dim[/COLOR] FSO [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Object[/COLOR]
[COLOR="Navy"]Dim[/COLOR] ts [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Object[/COLOR]
[COLOR="Navy"]Dim[/COLOR] s(2399) [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] t [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] intID [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] i [COLOR="Navy"]As[/COLOR] Long, j [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] sFilePath [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String[/COLOR]
[COLOR="Navy"]Dim[/COLOR] num_cols [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] num_rows [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] intLineCount [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]

    num_cols = 2399 [COLOR="SeaGreen"]'//0 based counting - Columns[/COLOR]
    num_rows = 9999 [COLOR="SeaGreen"]'//0 based counting - Rows (as many as you like)[/COLOR]
    
    [COLOR="Navy"]On[/COLOR] [COLOR="Navy"]Error[/COLOR] [COLOR="Navy"]GoTo[/COLOR] ErrHandler:
    
    [COLOR="Navy"]Set[/COLOR] FSO = CreateObject("Scripting.FileSystemObject")
    [COLOR="Navy"]With[/COLOR] FSO
        
        sFilePath = Sheet1.TextBox1.Value
        [COLOR="Navy"]Set[/COLOR] ts = FSO.CreateTextFile(sFilePath, True)
        
        [COLOR="Navy"]For[/COLOR] i = 0 [COLOR="Navy"]To[/COLOR] num_cols
            s(i) = "F" & i
        [COLOR="Navy"]Next[/COLOR] i
        t = Join(s, vbTab)
        ts.WriteLine t
            
        [COLOR="Navy"]For[/COLOR] j = 0 [COLOR="Navy"]To[/COLOR] num_rows
            [COLOR="Navy"]For[/COLOR] i = 0 [COLOR="Navy"]To[/COLOR] num_cols
                [COLOR="Navy"]If[/COLOR] i < 1 [COLOR="Navy"]Then[/COLOR]
                    s(i) = j + 1
                [COLOR="Navy"]ElseIf[/COLOR] i < 4 [COLOR="Navy"]Then[/COLOR]
                    s(i) = Int(Rnd() * 10)
                [COLOR="Navy"]Else[/COLOR]
                    s(i) = IIf(Rnd() > 0.5, 0, 1)
                [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
            [COLOR="Navy"]Next[/COLOR] i
            t = Join(s, vbTab)
            ts.WriteLine t
            [COLOR="Navy"]If[/COLOR] j [COLOR="Navy"]Mod[/COLOR] 500 = 0 [COLOR="Navy"]Then[/COLOR]
                Sheet1.Range("B22").Value = "Writing Row ... " & Format(j, "#,##0")
                DoEvents
            [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
        [COLOR="Navy"]Next[/COLOR] j
    
    [COLOR="Navy"]End[/COLOR] [COLOR="Navy"]With[/COLOR]
    ts.Close
    [COLOR="Navy"]Set[/COLOR] ts = [COLOR="Navy"]Nothing[/COLOR]

My_Exit:
[COLOR="Navy"]If[/COLOR] [COLOR="Navy"]Not[/COLOR] ts [COLOR="Navy"]Is[/COLOR] [COLOR="Navy"]Nothing[/COLOR] [COLOR="Navy"]Then[/COLOR]
    ts.Close
    [COLOR="Navy"]Set[/COLOR] ts = [COLOR="Navy"]Nothing[/COLOR]
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]If[/COLOR]
[COLOR="Navy"]Set[/COLOR] FSO = [COLOR="Navy"]Nothing[/COLOR]
Sheet1.Range("B22").Value = ""
[COLOR="Navy"]Exit[/COLOR] [COLOR="Navy"]Sub[/COLOR]

ErrHandler:
MsgBox "Error: " & Err.Description
[COLOR="Navy"]Resume[/COLOR] My_Exit
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
 
Last edited:
Upvote 0
Hmm... That sounds interesting and I think it might bring us close to the thing I ultimately need. Certainly a progress to the current solution anyway!

So here's the thing I need to do:

1) These "One Two Three" columns will partly decide whether the row is actually needed or not. User inputs the info which will be translated to for example
Code:
WHERE One IN (1, 2, 3) AND Two IN (4, 5, 6) AND Three IN (7, 8, 9, 10)
and then the rows that doesn't fit in the criteria are dropped out. If the number of the remaining rows (ie. the result of the WHERE-query) is more than 100k, I'll pick those rows randomly (this code is ready and it's actually posted in this forum too, but that is against an Excel workbook, not against a txt-file). At this point those first four columns are dropped out.

2) The sum I need is actually then not from that million rows, it's from those 100k- rows.

3) The reason why I need the sum is that I need to get TOP-40 of the columns (so yes, the already dropped 900k+ rows doesn't matter anymore) ordered by the sums. The remaining ~2360 columns are dropped out at this point. So this one works with SELECT (those 40 columns) FROM mytxtfile WHERE (that code posted above)

4) So now I have 40 columns times 100k- rows. This is the data I need to get to Excel. The program that processes this data is already completed.

Shortly taking the question right now is that when I have this big txt-file and your code posted above, what's the fastest optimization to this?
1) Just drive your code resulting 2360 needless sum values, pick top-40 of them all and continue from there
2) Make that criteria check to the code you posted (this is probably pretty easy to do) to result only those needed rows to be summed
3) Make an SQL-query (I have to check how to do this) against the text-file to drop out the needless rows and then run your code. I think this would be the fastest run time but hardest to implement.
4) Something else

But I have to tell you that you have been a great help already and this project has had several parts where without your help I would have been in big troubles.

Thanks!
 
Upvote 0
Jaymond

What do you actually need?

I'm honestly wondering why you want/need this data in Access.

It also seems that things are sort of all over the place - text files, Access, code, SQL, Excel...

One thing that i really can't understand is the use of Access, if you have this amount of data with this amount of fields I'm thinking perhaps isn't the best tool for the job.

Mind you I've no idea what the 'job' is.:)
 
Upvote 0

Forum statistics

Threads
1,224,612
Messages
6,179,890
Members
452,948
Latest member
Dupuhini

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top