Txt to csv or xls - Help !

chupertoto

New Member
Joined
Jun 27, 2010
Messages
3
Hi,

I'm working with super heavy txt files (50 000lines), all formated in the same way. I'm trying to bring them together in a csv or xls file. I normally do it by hand but the sheer size of the files makes Excel - Import - Convert prone to errors and crashes. The separator for the fields is the good old " ".
Please help !
 

Excel Facts

What is the last column in Excel?
Excel columns run from A to Z, AA to AZ, AAA to XFD. The last column is XFD.
I forgot the source of this..

All your .txt files should be in the same folder where you have your macro file and change the Sheet name from "Raw" according to your sheet.

Code:
Sub MergeTxtFiles()
    Sheets("Raw").Select
    Cells.Select
    Selection.Delete Shift:=xlUp
    Range("A1").Select
Dim z  As Long, e As Long, d As Long
Dim f As String
d = 2
Sheets("Raw").Select
Cells(1, 1) = "=cell(""filename"")"
Cells(1, 2) = "=left(A1,find(""["",A1)-1)"
Cells(2, 1).Select
f = Dir(Cells(1, 2) & "*.txt")
Do While Len(f) > 0
ActiveCell.Formula = f
ActiveCell.Offset(1, 0).Select
f = Dir()
Loop
z = Cells(Rows.Count, 1).End(xlUp).Row
For e = 2 To z
Cells(d, 2) = Cells(e, 1)
With ActiveSheet.QueryTables.Add(Connection:= _
        "TEXT;" & Sheets("Raw").Cells(1, 2) & Sheets("Raw").Cells(e, 1), _
        Destination:=Range("C" & d + 1))
        .Name = "deep"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .TextFilePromptOnRefresh = False
        .TextFilePlatform = xlWindows
        .TextFileStartRow = 1
        .TextFileParseType = xlFixedWidth
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileConsecutiveDelimiter = False
        .TextFileTabDelimiter = True
        .TextFileSemicolonDelimiter = False
        .TextFileCommaDelimiter = False
        .TextFileSpaceDelimiter = False
        .Refresh BackgroundQuery:=False
    End With
    d = Cells(Rows.Count, 3).End(xlUp).Row
    
Next e
MsgBox "Done"
End Sub

here are some other links..
Link1
Link2
Link3
 
Upvote 0
That was the simplest piece of code ever ! Mind = blown ! You guys rock !
Now let me add some difficulty.
My file has a few hundred thousand lines and a hundred columns (always formated and named in the same way). Is there a way to go from the now full merged file (in csv/txt) to another csv file with the data of a couple of selected colums only (let's say column "name" / "sex" / "location") ?
 
Upvote 0
That was the simplest piece of code ever ! Mind = blown ! You guys rock !
</p>If you know how to record a macro and make it dynamic then do it and add that code in the end of the above code else provide us a few lines from the txt files and tell us which columns you want to preserve..
 
Upvote 0
Here is how I would process a large file to get selected fields.

Open the Source and Target files. Note the Target file is opened for Append. This allows us to add records without overwriting existing entries. If the Target file does not exist it is created.
Code:
   [COLOR=darkblue]Open[/COLOR] sSource [COLOR=darkblue]For[/COLOR] [COLOR=darkblue]Input[/COLOR] [COLOR=darkblue]As[/COLOR] #1
   [COLOR=darkblue]Open[/COLOR] sTarget [COLOR=darkblue]For[/COLOR] [COLOR=darkblue]Append[/COLOR] [COLOR=darkblue]As[/COLOR] #2

We need to loop through each line in the source file and read it into an array variable. The split function separates the input string on the data deliminator.
Code:
   [COLOR=darkblue]Do[/COLOR] [COLOR=darkblue]While[/COLOR] [COLOR=darkblue]Not[/COLOR] EOF(1)
      Line [COLOR=darkblue]Input[/COLOR] #1, sRecord
      aRecord = Split(sRecord, sDelim)

Next we need to build up an output string. NB Note the Stop command in the code below. This will stop execution of the code.

If it is not visible click on View => LocalsWindow
Expand the variable aRecord
This will allow you to see the contents of the record array fields and will help as you build up the output string.
Comment out the stop command when you are happy with the output.
NB Notice the last line in the output string has the End Of Line character

Code:
      [COLOR=green]'build up the output string[/COLOR]
      [COLOR=red]Stop[/COLOR]
      sOutput = aRecord(0) & sDelim
      sOutput = sOutput & aRecord(1) & sDelim
      sOutput = sOutput & aRecord(4) & Chr(13)  [COLOR=green]'end of line character[/COLOR]

Remember arrays are base zero, i.e. Field(0) is the first entry
The above code snippet is building up a string of fileds 0,1 and 4 of the input array.
The output results are shown below

source

<TABLE style="BACKGROUND-COLOR: #ffffff; PADDING-LEFT: 2pt; PADDING-RIGHT: 2pt; FONT-FAMILY: Arial,Arial; FONT-SIZE: 10pt" border=1 cellSpacing=0 cellPadding=0><COLGROUP><COL style="WIDTH: 30px; FONT-WEIGHT: bold"><COL style="WIDTH: 76px"><COL style="WIDTH: 64px"><COL style="WIDTH: 64px"><COL style="WIDTH: 64px"><COL style="WIDTH: 43px"></COLGROUP><TBODY><TR style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt; FONT-WEIGHT: bold"><TD></TD><TD>A</TD><TD>B</TD><TD>C</TD><TD>D</TD><TD>E</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">1</TD><TD>John</TD><TD>Doe1</TD><TD>Male</TD><TD>Age</TD><TD style="TEXT-ALIGN: right">1</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">2</TD><TD>John</TD><TD>Doe2</TD><TD>Male</TD><TD>Age</TD><TD style="TEXT-ALIGN: right">2</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">3</TD><TD>John</TD><TD>Doe3</TD><TD>Male</TD><TD>Age</TD><TD style="TEXT-ALIGN: right">3</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">4</TD><TD>John</TD><TD>Doe4</TD><TD>Male</TD><TD>Age</TD><TD style="TEXT-ALIGN: right">4</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">5</TD><TD>John</TD><TD>Doe5</TD><TD>Male</TD><TD>Age</TD><TD style="TEXT-ALIGN: right">5</TD></TR></TBODY></TABLE>

target


<TABLE style="BACKGROUND-COLOR: #ffffff; PADDING-LEFT: 2pt; PADDING-RIGHT: 2pt; FONT-FAMILY: Arial,Arial; FONT-SIZE: 10pt" border=1 cellSpacing=0 cellPadding=0><COLGROUP><COL style="WIDTH: 30px; FONT-WEIGHT: bold"><COL style="WIDTH: 90px"><COL style="WIDTH: 64px"><COL style="WIDTH: 64px"></COLGROUP><TBODY><TR style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt; FONT-WEIGHT: bold"><TD></TD><TD>A</TD><TD>B</TD><TD>C</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">1</TD><TD>John</TD><TD>Doe1</TD><TD style="TEXT-ALIGN: right">1</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">2</TD><TD>John</TD><TD>Doe2</TD><TD style="TEXT-ALIGN: right">2</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">3</TD><TD>John</TD><TD>Doe3</TD><TD style="TEXT-ALIGN: right">3</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">4</TD><TD>John</TD><TD>Doe4</TD><TD style="TEXT-ALIGN: right">4</TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">5</TD><TD>John</TD><TD>Doe5</TD><TD style="TEXT-ALIGN: right">5</TD></TR></TBODY></TABLE>


The full code is shown below.
Edit the paths to the files
Set your Deliminator
The Target file will be created by the code
Copy and paste the code to the ThisWorkbook Module
Set up sample file as above to test


Code:
[COLOR=darkblue]Sub[/COLOR] ReadWriteTextFiles()
    [COLOR=darkblue]Dim[/COLOR] sSource [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR]        [COLOR=green]'path to source file[/COLOR]
    [COLOR=darkblue]Dim[/COLOR] sTarget [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR]        [COLOR=green]'path to target file[/COLOR]
    [COLOR=darkblue]Dim[/COLOR] sRecord [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR]        [COLOR=green]'input string for each row/line of text[/COLOR]
    [COLOR=darkblue]Dim[/COLOR] aRecord() [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR]      [COLOR=green]'input row/line of text split into fields[/COLOR]
    [COLOR=darkblue]Dim[/COLOR] sDelim [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR]         [COLOR=green]'deliminator[/COLOR]
    [COLOR=darkblue]Dim[/COLOR] sOutput [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR]        'output string
 
    [COLOR=green]'in the event of error ensure open files are closed[/COLOR]
    [COLOR=darkblue]On[/COLOR] [COLOR=darkblue]Error[/COLOR] [COLOR=darkblue]GoTo[/COLOR] errHandler
 
    [COLOR=green]'==============================================[/COLOR]
    [COLOR=green]'EDIT SOURCE FILE, TARGET FILE and DELIMINATTOR[/COLOR]
    [COLOR=green]'==============================================[/COLOR]
[COLOR=red]  sSource = "c:\temp\source.txt"[/COLOR]
[COLOR=red]  sTarget = "c:\temp\target.txt"[/COLOR]
    [COLOR=red]sD[/COLOR][COLOR=red]elim = " "[/COLOR]                   [COLOR=green]'for tab delimited files: [/COLOR][COLOR=red]= vbTab[/COLOR]
 
   [COLOR=darkblue]Open[/COLOR] sSource [COLOR=darkblue]For[/COLOR] [COLOR=darkblue]Input[/COLOR] [COLOR=darkblue]As[/COLOR] #1
   [COLOR=darkblue]Open[/COLOR] sTarget [COLOR=darkblue]For[/COLOR] [COLOR=darkblue]Append[/COLOR] [COLOR=darkblue]As[/COLOR] #2
 
   [COLOR=darkblue]Do[/COLOR] [COLOR=darkblue]While[/COLOR] [COLOR=darkblue]Not[/COLOR] EOF(1)
      Line [COLOR=darkblue]Input[/COLOR] #1, sRecord
      aRecord = Split(sRecord, sDelim)
 
      [COLOR=green]'build up the output string[/COLOR]
      [COLOR=red]Stop[/COLOR]
      sOutput = aRecord(0) & sDelim
      sOutput = sOutput & aRecord(1) & sDelim
      sOutput = sOutput & aRecord(4) & Chr(13)  [COLOR=green]'end of line character[/COLOR]
 
      [COLOR=green]'===========[/COLOR]
      'Output
      [COLOR=green]'===========[/COLOR]
      [COLOR=darkblue]Print[/COLOR] #2, sOutput
   [COLOR=darkblue]Loop[/COLOR]
 
[COLOR=green]'close the source file[/COLOR]
errExit:
    [COLOR=darkblue]Close[/COLOR] #1
    [COLOR=darkblue]Close[/COLOR] #2
    [COLOR=darkblue]Exit[/COLOR] [COLOR=darkblue]Sub[/COLOR]
errHandler:
    MsgBox "Error occured, now closing source file"
    [COLOR=darkblue]Resume[/COLOR] errExit
[COLOR=darkblue]End[/COLOR] [COLOR=darkblue]Sub[/COLOR]
 
Last edited:
Upvote 0

Forum statistics

Threads
1,215,306
Messages
6,124,160
Members
449,146
Latest member
el_gazar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top