Help!! Need macro to get data from 100000+ row CSV file

Sebastian K.

New Member
Joined
Jul 19, 2011
Messages
16
Hi

I've got a massive CSV data file, containing data loggings from a process.

I need to import specific data ranges to a worksheet.
What I want to do is average x rows before importing it to my worksheet.

My VBA skills are not any where close to the level I need to write the code for such macro.:confused:
I hope some one in here can help me.

The data looks like this:
"Analog-Real._20_BF01";"28.06.2011 14:22:59";0;1;40722599293,0787
"Analog-Real._20_BF01";"28.06.2011 14:23:04";5,115741;1;40722599351,5394
"Analog-Real._20_BF01";"28.06.2011 14:23:09";5,05787;1;40722599410,0926
"Analog-Real._20_BF01";"28.06.2011 14:23:14";5,054977;1;40722599468,5417

The reason I want to average the data is that the logging sequence is 5sec. and to avoid "flooding" my worksheet, I need to reduce it to e.g. 1 minute intervals.

It's only the data from column 2 + 3 (time stamp + process value) that is of intrest.

Best regards
Sebastian
 
Last edited:

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
That shouldn't be too hard. First of all, what is x rows. Is this a set number or will it change. Or do you want it based on the time stamp for a 1 minute interval? Next, what do you want to average? Column 3's numbers for the x rows? Can you give an example of the results you want?
 
Upvote 0
That shouldn't be too hard. First of all, what is x rows. Is this a set number or will it change. Or do you want it based on the time stamp for a 1 minute interval? Next, what do you want to average? Column 3's numbers for the x rows? Can you give an example of the results you want?

X jus represents the number of rows I want to average. I would like it to be represented by a variable so that I easy can change it.
If I want 1 min. interval and I've got a 5 sec. interval then I need to average 12 rows, thus x=12

I need to get data within a certin time span, which needs to be determined by user input by either cells in my worksheet or by input from a msgbox.
Then I need to average this range to reduce the amount of data. It needs be reduced from 5 sec. intervals to e.g. 1 min. intervals, this is where the average and x rows come into the picture

Best regards
Sebastian
 
Upvote 0
Okay, so you will pick the number of rows. What do you want the resulting data to look like? Which time stamp do you want displayed with the averaged data, and what do you want averaged? Do you just want the time stamp and the third piece of data? Can you give me an example of a before and after with some of your data so I can see what you want?
 
Upvote 0
Okay, try this. Put it in a module, and change the first 2 variables to fit what you have as far as the number of lines to be averaged, and the file location. I also noticed that your above data has some of the lines delimited with a semicolon and some with a comma. This will need to be consistent to work correctly.

Code:
Sub ReadData()
    'change to how many lines you want to average
    LinestoAverage = 10
    'change to the name of the CSV file
    NameOfFile = "C:\Documents and Settings\CLIENT\Desktop\Test.txt"
    i = FreeFile
    'headers
    With ThisWorkbook.Sheets("Sheet1")
        .Cells(1, 1) = "Time Stamp"
        .Cells(1, 2) = "Average"
        .Cells(1, 3) = "Data Pieces"
    End With
    SheetRow = 2
    Open NameOfFile For Binary Access Read As #i
        If LOF(i) > 0 Then
            While Not EOF(i)
                Data3Total = 0
                LineCount = 0
                TimeStamp = ""
                Do
                    Line Input #i, SingleLine
                    x = Split(SingleLine, ";", , vbTextCompare)
                    Data3Total = Data3Total + Val(x(2))
                    'save first time stamp of set to put into sheet
                    If TimeStamp = "" Then TimeStamp = x(1)
                    LineCount = LineCount + 1
                Loop Until EOF(i) Or LineCount = LinestoAverage
                With ThisWorkbook.Sheets("Sheet1")
                    Data3Avg = Data3Total / LineCount
                    'time stamp of first data in group
                    .Cells(SheetRow, 1) = TimeStamp
                    'average of data #3
                    .Cells(SheetRow, 2) = Data3Avg
                    'number of pieces averaged
                    .Cells(SheetRow, 3) = LineCount
                    SheetRow = SheetRow + 1
                End With
            Wend
        End If
    Close #i
End Sub
 
Last edited:
Upvote 0
Okay, try this. Put it in a module, and change the first 2 variables to fit what you have as far as the number of lines to be averaged, and the file location. I also noticed that your above data has some of the lines delimited with a semicolon and some with a comma. This will need to be consistent to work correctly.

Thanks .... I'll try it.
I'll be offline the next couple of weeks, but I'll post the results when I'm back
 
Upvote 0
Thanks .... I'll try it.
I'll be offline the next couple of weeks, but I'll post the results when I'm back


Ok .... now I'm back.

I've got the code working and imported some data for processing .... with succes

Hmm ..... must be possible .... :hammer: .... maby ... :huh: .... ahaa .... :beerchug:

and a little while later we had a son ...... and his name was GET ME THAT DATA


Code:
Dim DataTotal As Single
Dim Col_var, TimeReq, LnAvrg, SheetRow, ColRef, RowRef, SheetRef, x_DataVar, LnSkip As Integer
Dim File_1, File_2, File_3, File_4, File_5, File_6, File_7, File_8, File_9, File_10 As String
Private Sub Cmd_1_Click()
    File_1 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_1.Text = File_1
End Sub
Private Sub Cmd_10_Click()
    File_10 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_10.Text = File_10
End Sub
Private Sub Cmd_2_Click()
    File_2 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_2.Text = File_2
 
End Sub
Private Sub Cmd_3_Click()
    File_3 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_3.Text = File_3
 
End Sub
Private Sub Cmd_4_Click()
    File_4 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_4.Text = File_4
End Sub
Private Sub Cmd_5_Click()
    File_5 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_5.Text = File_5
End Sub
Private Sub Cmd_6_Click()
    File_6 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_6.Text = File_6
End Sub
Private Sub Cmd_7_Click()
    File_7 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_7.Text = File_7
End Sub
Private Sub Cmd_8_Click()
    File_8 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_8.Text = File_8
End Sub
Private Sub Cmd_9_Click()
    File_9 = Application _
        .GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
    Txt_9.Text = File_9
End Sub
Private Sub Cmd_cancel_Click()
    Unload Me
End Sub
Private Sub Cmd_ok_Click()
LnAvrg = CInt(Frm_1.Txt_LnNum_1.Value)  ' Number of lines to average
x_Dlim = Frm_1.Txt_Dlim.Value           ' The delimiter character
LnSkip = Frm_1.Txt_LnSkip               ' Number of lines to skip at the beginning
'Processing the first file
If Not Frm_1.Txt_1.Value = "" Then
    FileToOpen = Frm_1.Txt_1.Value
    SheetRow = RowRef
    SheetCol = ColRef
    x_DataVar = 1
    'Now we get the time for the data set
    GoSub GetData
    SheetRow = RowRef
    SheetCol = ColRef + 1
    x_DataVar = 2
    'Now we get the data set from the first file
    GoSub GetData
End If
'Processing file 2
If Not Frm_1.Txt_2.Value = "" Then
    FileToOpen = Frm_1.Txt_2.Value
    SheetRow = RowRef
    SheetCol = ColRef + 2
    x_DataVar = 2
    'Now we get the data set from file 2
    GoSub GetData
End If
'Processing file 3
If Not Frm_1.Txt_3.Value = "" Then
    FileToOpen = Frm_1.Txt_3.Value
    SheetRow = RowRef
    SheetCol = ColRef + 3
    x_DataVar = 2
    'Now we get the data set from file 3
    GoSub GetData
End If
'Processing file 4
If Not Frm_1.Txt_4.Value = "" Then
    FileToOpen = Frm_1.Txt_4.Value
    SheetRow = RowRef
    SheetCol = ColRef + 4
    x_DataVar = 2
    'Now we get the data set from file 4
    GoSub GetData
End If
'Processing file 5
If Not Frm_1.Txt_5.Value = "" Then
    FileToOpen = Frm_1.Txt_5.Value
    SheetRow = RowRef
    SheetCol = ColRef + 5
    x_DataVar = 2
    'Now we get the data set from file 5
    GoSub GetData
End If
'Processing file 6
If Not Frm_1.Txt_6.Value = "" Then
    FileToOpen = Frm_1.Txt_6.Value
    SheetRow = RowRef
    SheetCol = ColRef + 6
    x_DataVar = 2
    'Now we get the data set from file 6
    GoSub GetData
End If
'Processing file 7
If Not Frm_1.Txt_7.Value = "" Then
    FileToOpen = Frm_1.Txt_7.Value
    SheetRow = RowRef
    SheetCol = ColRef + 7
    x_DataVar = 2
    'Now we get the data set from file 7
    GoSub GetData
End If
'Processing file 8
If Not Frm_1.Txt_8.Value = "" Then
    FileToOpen = Frm_1.Txt_8.Value
    SheetRow = RowRef
    SheetCol = ColRef + 8
    x_DataVar = 2
    'Now we get the data set from file 8
    GoSub GetData
End If
'Processing file 9
If Not Frm_1.Txt_9.Value = "" Then
    FileToOpen = Frm_1.Txt_9.Value
    SheetRow = RowRef
    SheetCol = ColRef + 9
    x_DataVar = 2
    'Now we get the data set from file 9
    GoSub GetData
End If
'Processing file 10
If Not Frm_1.Txt_10.Value = "" Then
    FileToOpen = Frm_1.Txt_10.Value
    SheetRow = RowRef
    SheetCol = ColRef + 10
    x_DataVar = 2
    'Now we get the data set from file 10
    GoSub GetData
End If
Exit Sub
GetData:
    i = FreeFile
    Open FileToOpen For Binary Access Read As #i
        If LOF(i) > 0 Then
            'This loop skips a user determin number of headders
            If LnSkip > 0 Then
                LineCount = 0
                Do
                    Line Input #1, LnSpool
                    LineCount = LineCount + 1
                Loop Until EOF(i) Or LineCount = LnSkip
            End If
            While Not EOF(i)
                DataTotal = 0
                LineCount = 0
                Do
                    Line Input #i, SingleLine
                    x = Split(SingleLine, x_Dlim, , vbTextCompare)
 
                    ' This nested If statements determin if it is time or data to be used
                    ' When it is time, only the time from the first line is used
                    ' When it is data it summarises the data value for the defined lines to be averaged
                    If x_DataVar = 1 Then
                        If LineCount = 0 Then DataTime = x(x_DataVar)  'This aquires the time stamp for the data set
                    ElseIf x_DataVar > 1 Then DataTotal = DataTotal + CSng(x(x_DataVar))    ' The Csng function is used to ensure the preservation of the decimal number, in this case as single
                    End If
                    LineCount = LineCount + 1
                Loop Until EOF(i) Or LineCount = LnAvrg
 
                ' This With statement writs the read data to the spreadsheet
                ' using the sheet, column and row reference obtained
                With ThisWorkbook.Sheets(SheetRef)
 
                    ' This If statement determins if it is time or data before writing it to the cells accordingly
                    If x_DataVar > 1 Then DataAvg = DataTotal / LineCount Else: DataAvg = DataTime
                    .Cells(SheetRow, SheetCol) = DataAvg
                    SheetRow = SheetRow + 1
                End With
            Wend
        End If
    Close #i
Return
End Sub
Private Sub Cmd_ref_select_Click()
Dim InputCell As Excel.Range
On Error Resume Next
Frm_1.Hide
' Select the reference cell by using the inputbox method
Set InputCell = _
    Application.InputBox(Prompt:="Select first input cell", _
    Title:="Cell reference", Type:=8)
RowRef = InputCell.Row              ' Row reference number for use when writing to the spreadsheet
ColRef = InputCell.Column           ' Column reference number for use when writing to the spreadsheet
SheetRef = InputCell.Parent.Name    ' Sheet name for writing to the spreadsheet
' Fill the textbox with the address of the selected reference cell
Frm_1.Txt_CellRef.Text = InputCell.Parent.Name & "!" & InputCell.Address
Frm_1.Show
End Sub

I've changed the code from just beeing a macro I could call to be a userform where I can select 10 different files to get data from.
I can input the number of lines to average, enter the column number of the data in the csv file and enter the delimiter character.
I can also enter a number of lines to skip at the beginning of the file, this is to skip un wanted head lines in the files.
And finally I can select the sheet and cell reference where to put the imported data.

The userform looks like this:
ScreenHunter_01%252520Aug.%25252009%25252022.36.gif


Any comments and suggestions for improvements will be highly appreciated.

One thing I've not touched is error handeling, this is a subject I don't know enough about.

Best regards
Sebastian
 
Upvote 0
Looks good. I cleaned it up a little bit for you. This code should all go into the form code section. I simplified your file looping (this made your code a little shorter), and put the file handling into a true sub rather than a goto, return. I didn't make your whole form to test this, so no guarantees, but have a look. I also added a file exists check for some error handling. I know there could be more depending on how much someone is going to try and break this.

Code:
Option Explicit
Dim DataTotal As Single, i As Integer, j As Integer, fs As Object
Dim lnAvrg As Integer, x_dlim As String * 1
Dim LnSkip As Integer, DateSet As Boolean, FileToOpen As String
Dim SheetRef As String, SheetRow As Integer, SheetCol As Integer

Private Sub Cmd_ok_Click()
    lnAvrg = CInt(frm_1.txt_lnNum_1.Value)  ' Number of lines to average
    x_dlim = Left(frm_1.txt_dlim.Value, 1)          ' The delimiter character
    LnSkip = frm_1.txt_lnskip               ' Number of lines to skip at the beginning
    
    DateSet = False
    For j = 1 To 10
        If Not frm_1.Controls("Txt_" & j).Value = "" Then
            FileToOpen = frm_1.Controls("Txt_" & j).Value
            Set fs = CreateObject("Scripting.FileSystemObject")
            If fs.FileExists(FileToOpen) Then
                GetData
                SheetCol = SheetCol + 1
            Else
                MsgBox "File: '" & FileToOpen & "' not found.", vbOKOnly & vbExclamation
            End If
        End If
    Next j
End Sub

Sub GetData()
    Dim LineCount As Integer, LnSpool As String, SingleLine As String, x() As String
    i = FreeFile
    Open FileToOpen For Binary Access Read As #i
        If LOF(i) > 0 Then
            'This loop skips a user determin number of headers
            If LnSkip > 0 Then
                LineCount = 0
                Do
                    Line Input #1, LnSpool
                    LineCount = LineCount + 1
                Loop Until EOF(i) Or LineCount = LnSkip
            End If
            While Not EOF(i)
                DataTotal = 0
                LineCount = 0
                Do
                    Line Input #i, SingleLine
                    x = Split(SingleLine, x_dlim, , vbTextCompare)
                ' Write the date from the first line to the spreadsheet
                    If Not DateSet Then
                        ThisWorkbook.Sheets(SheetRef).Cells(SheetRow, SheetCol) = x(1)
                        SheetRow = SheetRow + 1
                        DateSet = True
                    End If
                    ' Summarize the data value for the defined lines to be averaged
                    DataTotal = DataTotal + CSng(x(2))    ' The Csng function is used to ensure the preservation of the decimal number, in this case as single
                    LineCount = LineCount + 1
                Loop Until EOF(i) Or LineCount = lnAvrg
 
                ' Write the read data to the spreadsheet
                ' using the sheet, column and row reference obtained
                ThisWorkbook.Sheets(SheetRef).Cells(SheetRow, SheetCol) = DataTotal / LineCount
                SheetRow = SheetRow + 1
            Wend
        End If
    Close #i
End Sub

Private Sub Cmd_ref_select_Click()
    Dim InputCell As Excel.Range
    On Error Resume Next
    frm_1.Hide
    ' Select the reference cell by using the inputbox method
    Set InputCell = _
        Application.InputBox(Prompt:="Select first input cell", _
        Title:="Cell reference", Type:=8)
    SheetRow = InputCell.Row             ' Row reference number for use when writing to the spreadsheet
    SheetCol = InputCell.Column          ' Column reference number for use when writing to the spreadsheet
    SheetRef = InputCell.Parent.Name    ' Sheet name for writing to the spreadsheet
    ' Fill the textbox with the address of the selected reference cell
    frm_1.Txt_CellRef.Text = InputCell.Parent.Name & "!" & InputCell.Address
    frm_1.Show
End Sub
 
Upvote 0
Looks good. I cleaned it up a little bit for you. This code should all go into the form code section. I simplified your file looping (this made your code a little shorter), and put the file handling into a true sub rather than a goto, return. I didn't make your whole form to test this, so no guarantees, but have a look. I also added a file exists check for some error handling. I know there could be more depending on how much someone is going to try and break this.

The code looks nice, I'l give it a try later today.

I'm working on an other small feature, I can't get working.
I've created an other form with a textbox, which I want to use to make a preview of some of the first line of file no 1.

I can concatenate the number of lines into one string and putting it into the textbox, but I can't get it to feed a new line for each line read from the file.

Here is the code I've tried to make work

Code:
Private Sub Cmd_preview_Click()
    i = FreeFile
    
    Open File_1 For Binary Access Read As #i
        If LOF(i) > 0 Then
            DataPreview = ""
            LineCount = 0
            Do
                Line Input #i, PreviewLine
                DataPreview = DataPreview & PreviewLine & Chr(13)
                LineCount = LineCount + 1
            Loop Until EOF(i) Or LineCount = 10
        End If
    Close #i
    
    Frm_2.Txt_preview.Text = DataPreview
    
    Frm_2.Show
    
End Sub

I wate to use this to se the beginning of the file to determin the number of head lines, the delimiter character and subsequent column no of the data.

Best regards
Sebastian
 
Upvote 0
You need to add a carriage return and a line feed between each set of data. It is chr(13) & chr(10). Or you can just use the built in constant vbCrLf.

Code:
DataPreview = DataPreview & PreviewLine & vbCrLf
 
Upvote 0

Forum statistics

Threads
1,224,587
Messages
6,179,733
Members
452,939
Latest member
WCrawford

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top