Timestamp Continuity Macro Problems

rush2112

New Member
Joined
Jun 10, 2014
Messages
5
Hello all,

I am looking for some help either A) troubleshooting an existing macro or B) finding another (better?) method for solving the same problem.

Because my write-up has proven to be rather lengthy to explain, here is the "TLDR" version: I have a macro that is supposed to fill in gaps in a data set that has a time sequence. It has some strange bugs and isn't functioning properly. It is included at the bottom of the post. Any idea what is wrong? Should I be doing this a different way? Note: This problem has persisted between two computers, one using Windows 7 and Excel 2010, and another using Windows 7 and Excel 2013.

For those of you who are interested in the full explanation and everything I know so far, here goes:

I am working with large data sets (hundreds of thousands of lines+) that include a time stamp for each line. Each time stamp occurs after 1 second. Due to what I believe is hardware limitations with our data collectors, there is a missing value every so often (~90 line increments). To have a continuous time series I have been working with a macro that inserts a row for each missing time value and inserts the proper time stamp. Note: This problem has persisted between two computers, one using Windows 7 and Excel 2010, and another using Windows 7 and Excel 2013.


For example:


5/29/2014 18:19:1111111
5/29/2014 18:19:1222222
5/29/2014 18:19:1333333
5/29/2014 18:19:1444444
5/29/2014 18:19:1555555
5/29/2014 18:19:2166666
5/29/2014 18:19:2277777
5/29/2014 18:19:2388888
5/29/2014 18:19:2499999
5/29/2014 18:19:251010101010

<tbody>
</tbody>

<tbody>
</tbody>

Between line 5 and line 6 there are 5 missing data points (and therefore a discontinuous time series.) After running my macro, this same data looks like this:

5/29/2014 18:19:1111111
5/29/2014 18:19:1222222
5/29/2014 18:19:1333333
5/29/2014 18:19:1444444
5/29/2014 18:19:1555555
5/29/2014 18:19:1666666
5/29/2014 18:19:17
5/29/2014 18:19:18
5/29/2014 18:19:19
5/29/2014 18:19:20
5/29/2014 18:19:21
5/29/2014 18:19:2277777
5/29/2014 18:19:2388888
5/29/2014 18:19:2499999
5/29/2014 18:19:251010101010

<tbody>
</tbody>

<tbody>
</tbody>

This is working as intended.


Now here is the problem:
While running this macro on large data sets (100,000-750,000 lines) I encountered some large gaps in my data that this macro did NOT fill in with time stamps. To illustrate I've included an example below.

The example below shows gaps in time between the highlighted time stamps prior to running the macro to fill them in.

02/02/04 18:58:10
02/02/04 18:58:11
02/02/04 18:58:12
02/02/04 18:58:13
02/02/04 18:58:14
02/02/04 18:58:15
02/02/04 19:08:08
02/02/04 19:08:09
02/02/04 19:08:10
02/02/04 19:08:11
02/02/04 19:08:12
02/02/04 19:08:13
02/02/04 19:08:14
02/02/04 19:08:15

<tbody>
</tbody>
436
436
436
2
3
4
5
6
7
436
436
436
436
436

<tbody>
</tbody>

<tbody>
</tbody>

The example below shows some filled gaps in green but then there is a 7 minute gap between the last two red highlighted time stamps where the macro did not fill in missing time stamps.

02/02/04 18:58:11436
02/02/04 18:58:12436
02/02/04 18:58:132
02/02/04 18:58:143
02/02/04 18:58:154
02/02/04 18:58:165
02/02/04 18:58:17
02/02/04 18:58:18
02/02/04 18:58:19
02/02/04 18:58:20
02/02/04 18:58:21
02/02/04 18:58:22
02/02/04 18:58:23
02/02/04 18:58:24
02/02/04 18:58:25
02/02/04 18:58:26
02/02/04 18:58:27
02/02/04 18:58:28
02/02/04 18:58:29
02/02/04 18:58:30
02/02/04 18:58:31
02/02/04 18:58:32
02/02/04 18:58:33
02/02/04 18:58:34
02/02/04 18:58:35
02/02/04 18:58:36
02/02/04 18:58:37
02/02/04 18:58:38
02/02/04 18:58:39
02/02/04 18:58:40
02/02/04 18:58:41
02/02/04 18:58:42
02/02/04 18:58:43
02/02/04 18:58:44
02/02/04 18:58:45
02/02/04 18:58:46
02/02/04 18:58:47
02/02/04 18:58:48
02/02/04 18:58:49
02/02/04 18:58:50
02/02/04 18:58:51
02/02/04 18:58:52
02/02/04 18:58:53
02/02/04 18:58:54
02/02/04 18:58:55
02/02/04 18:58:56
02/02/04 18:58:57
02/02/04 18:58:58
02/02/04 18:58:59
02/02/04 18:59:00
02/02/04 18:59:01
02/02/04 18:59:02
02/02/04 18:59:03
02/02/04 18:59:04
02/02/04 18:59:05
02/02/04 18:59:06
02/02/04 18:59:07
02/02/04 18:59:08
02/02/04 18:59:096
02/02/04 19:08:107
02/02/04 19:08:11436
02/02/04 19:08:12436

<tbody>
</tbody>

<tbody>
</tbody>


Something interesting to note here: the time value 18:59:09 does not exist in the raw data, and so there should be no "6" next to it; the same goes for the time stamp 18:58:16, which should also not have a "5" next to it as it also doesn't exist in the raw data. These values next to the time stamps appear to be shifted from their original cells: "5" used to belong to 19:08:08 and "6" used to belong to 19:08:09. Also, 19:08:08 and 19:08:09 don't even exist in the data set after the macro has been run. So...we are quite lost at this point and any help would be greatly appreciated. I've included the macro below.




Sub addmissingseconds()
'This workbook has a very crude macro that inserts rows for missing seconds.
'The date and time must be in column A.
'The macro ignores the first 3 lines to allow for a header. We could make that two lines if better.
'The spreadsheet must have the date and time in column A.
'The macro will add the missing times in column A, and blank values in the other columns.
'With this macro open, you can open a spreadsheet of data and hit CTRL-SHIFT-I to insert the values.
'This spreadsheet should not be fully trusted. Check results and report back to Lonny.
Dim i As Long, x As Long
'step from end to the third row
For i = Range("a" & Rows.Count).End(xlUp).Row To 3 Step -1
x = Val(Format$(Cells(i, 1).Value - Cells(i - 1, 1).Value, "s"))
If x > 1 Then
Rows(i + 1).Resize(x - 1).Insert
With Cells(i, 1).Resize(x + 1)
.Formula = "=r[-1]c+""0:00:01"""
.Value = .Value
'this should add in the new row with just column A. Use Resize(, 3) for B and C
With .Offset(, 1).Resize(, 1)
'.Formula = "=r[-1]c"
'.Value = .Value
End With
End With
End If
Next
End Sub





Thank you very much for any help you might have to offer!
-Josiah
 

Excel Facts

Will the fill handle fill 1, 2, 3?
Yes! Type 1 in a cell. Hold down Ctrl while you drag the fill handle.
At a minimum, make the following changes, highlighted in red.
Rich (BB code):
Sub addmissingseconds()
    'This workbook has a very crude macro that inserts rows for missing seconds.
    'The date and time must be in column A.
    'The macro ignores the first 3 lines to allow for a header. We could make that two lines if better.
    'The spreadsheet must have the date and time in column A.
    'The macro will add the missing times in column A, and blank values in the other columns.
    'With this macro open, you can open a spreadsheet of data and hit CTRL-SHIFT-I to insert the values.
    'This spreadsheet should not be fully trusted. Check results and report back to Lonny.
    Dim i As Long, x As Long
    'step from end to the third row
    For i = Range("a" & Rows.Count).End(xlUp).Row To 3 Step -1
        x = 86400 * (Cells(i, 1).Value - Cells(i - 1, 1).Value)
        If x > 1 Then
            Rows(i).Resize(x - 1, 2).Insert Shift:=xlDown
            With Cells(i, 1).Resize(x + 1)
                .Formula = "=r[-1]c+""0:00:01"""
                .Value = .Value
            End With
        End If
    Next
End Sub

The major mistake was the use of Format(...,"s"). It returns the difference in seconds modulo 60 -- in your example, 53 instead of 593.

The problem with the alignment of data in column B was the fact that you inserted rows only in column A. Resize(...,2).Insert also inserts rows in column B. Then we need Shift:=xlDown to override the default.
 
Upvote 0
Errata....
The problem with the alignment of data in column B was the fact that you inserted rows only in column A. Resize(...,2).Insert also inserts rows in column B. Then we need Shift:=xlDown to override the default.

That is an incorrect description of the second problem. I had forgotten what changes I made to make it work.

At a minimum, make the following changes, highlighted in red.
Rich (BB code):
Sub addmissingseconds()
    'This workbook has a very crude macro that inserts rows for missing seconds.
    'The date and time must be in column A.
    'The macro ignores the first 3 lines to allow for a header. We could make that two lines if better.
    'The spreadsheet must have the date and time in column A.
    'The macro will add the missing times in column A, and blank values in the other columns.
    'With this macro open, you can open a spreadsheet of data and hit CTRL-SHIFT-I to insert the values.
    'This spreadsheet should not be fully trusted. Check results and report back to Lonny.
    Dim i As Long, x As Long
    'step from end to the third row
    For i = Range("a" & Rows.Count).End(xlUp).Row To 3 Step -1
        x = 86400 * (Cells(i, 1).Value - Cells(i - 1, 1).Value)
        If x > 1 Then
            Rows(i).Resize(x - 1).Insert
            With Cells(i, 1).Resize(x)
                .Formula = "=r[-1]c+""0:00:01"""
                .Value = .Value
            End With
        End If
    Next
End Sub

The major mistake was the use of Format(...,"s"). It returns the difference in seconds modulo 60 -- in your example, 53 instead of 593.

The misalignment was due to Rows(i+1). It moved 19:08:09 6 down instead of 19:08:08 5. Then it overwrote 19:08:08 with 18:58:16, 1 sec after 18:58:15 4. And it overwrote 19:08:09 with 18:59:09 due to the modulo 60 defect and Resize(x+1).
 
Upvote 0
Improvements....
At a minimum, make the following changes, highlighted in red.
Rich (BB code):
        x = 86400 * (Cells(i, 1).Value - Cells(i - 1, 1).Value)
        If x > 1 Then
            Rows(i).Resize(x - 1).Insert
            With Cells(i, 1).Resize(x)
                .Formula = "=r[-1]c+""0:00:01"""
                .Value = .Value
            End With
        End If

Although there is nothing wrong with Cells(i,1).Resize(x), Cells(i,1).Resize(x-1) is better.

In your example, x = 593. So you want to insert 592 rows, x-1. And you only need to enter time in the 592 new rows.

-----

BTW, I usually recommend that time calculations be explicitly rounded to the accuracy required -- to the second, in your case. That ensures that any dependent formulas, lookups in particular, behave as you expect.

The issue is: most non-integer arithmetic incur infinitesimal and often invisible differences that eventually become significant and sometimes visible. This is due to the fact that numbers are represented using 64-bit binary floating-point internally. Thus, for example, IF(10.1 - 10 = 0.1, TRUE) returns FALSE(!) because 10.1 - 10 does not have the same internal representation as the constant 0.1.

So you might consider the following modified code.
Rich (BB code):
Sub addmissingseconds()
    'This workbook has a very crude macro that inserts rows for missing seconds.
    'The date and time must be in column A.
    'The macro ignores the first 3 lines to allow for a header. We could make that two lines if better.
    'The spreadsheet must have the date and time in column A.
    'The macro will add the missing times in column A, and blank values in the other columns.
    'With this macro open, you can open a spreadsheet of data and hit CTRL-SHIFT-I to insert the values.
    'This spreadsheet should not be fully trusted. Check results and report back to Lonny.
    Dim i As Long, x As Long
    'step from end to the third row
    For i = Range("a" & Rows.Count).End(xlUp).Row To 3 Step -1
        x = 86400 * (Cells(i, 1).Value - Cells(i - 1, 1).Value)
        If x > 1 Then
            Rows(i).Resize(x - 1).Insert
            With Cells(i, 1).Resize(x - 1)
                .Formula = "=ROUND(r[-1]c*86400+1,0)/86400"
                .Value = .Value
            End With
        End If
    Next
End Sub
 
Upvote 0
Joeu2004, thank you so much. I appreciate your thorough response. We made the changes you recommended and it is working perfectly now. You rock!
 
Upvote 0
Hello again,

I've run into a problem while using this macro. It has been behaving as expected on small data sets (~3500 lines) but when trying to run it on one of our large data sets (more than 750,000 lines) I am encountering the following error:


Run-timeerror6Overflow_zpsaf74c6e7.png



We tried this on two different computers and monitored our CPU's performance and RAM, and neither was particularly taxed while running this macro, so we assume it must be something on the Excel side. Any ideas?




Thanks so much,
-Josiah
 
Last edited:
Upvote 0
I've run into a problem while using this macro. It has been behaving as expected on small data sets (~3500 lines) but when trying to run it on one of our large data sets (more than 750,000 lines) I am encountering the following error:
[....image of Runtime Error 6: Overflow elided....]

Sorry I did not see this follow-up posting sooner.

I cannot duplicate your problem. I tried 800,000 lines of data with 248,574 gaps of 1 sec to create a total of 1,048,575 lines.

Be sure that you did not use type Integer instead of type Long for some variables you might have added to my original implementation.

If you continue to have a problem, at a minimum, I need to see your VBA code and an indication of what line was executing when the Overflow error occurs. You should see that if you click on Debug in the error dialog box.

Then you can press F5 to get the error dialog box again, and click on End to abort the macro.

Ideally, upload an example Excel file to a file-sharing website, and post the "shared" URL of the file in a response here. Be sure the file contains the original data, not the data modified by the macro.

-----

Initially, I thought perhaps the range of time in your data exceeded 1,048,574 sec. That would cause the macro to try to insert rows beyond row 1,048,576, the last available row in Excel 2007 and later. But that results in a different error in Excel 2010.

Nevertheless, to guard against that mistake, I suggest that you use the implementation below.

As a side-benefit, on my ancient computer, it reduces the time to process 40,000 lines of data with 12,500 1-sec gaps from 40 sec to 0.5 sec (!). It takes less than 8 sec to process 800,000 lines of data; the number and size of gaps do not make much difference.

The implementation does consume significantly more memory -- up to about 16 Mbytes more, I believe. But that is "nothing" compared to the amount of memory Excel consumes for such large workbooks. And most modern computers have 1+ Gbyte to spare.

Note: The expression Cells(maxRows, "a").End(xlUp).Row does not work as intended if your original data is composed of 1,048,575 lines. If that is a possibility, let me know, and I will fix that.

Also note that all implementations of the macro, including your own, assume the original data starts in A2. If that is not the case, tell me what the macro should expect.

Code:
Option Explicit

Sub addMissingSeconds()
    'This workbook has a very crude macro that inserts rows for missing seconds.
    'The date and time must be in column A.
    'The macro ignores the first 3 lines to allow for a header. We could make that two lines if better.
    'The spreadsheet must have the date and time in column A.
    'The macro will add the missing times in column A, and blank values in the other columns.
    'With this macro open, you can open a spreadsheet of data and hit CTRL-SHIFT-I to insert the values.
    'This spreadsheet should not be fully trusted. Check results and report back to Lonny.
    
    Dim i As Long, r As Long, x As Long, t As Double
    Dim maxRows As Long, lastRow As Long, nRows As Double
    Dim v As Variant, nV As Long
    Dim st As Single, et As Single
    
    ' copy original data into VBA array
    st = Timer
    maxRows = Rows.Count
    lastRow = Cells(maxRows, "a").End(xlUp).Row
    v = Range("a2", Cells(lastRow, "b"))
    nV = UBound(v, 1)
    nRows = WorksheetFunction.Round(86400 * (v(nV, 1) - v(1, 1)) + 1, 0)
    If nRows > maxRows - 1 Then
        MsgBox "too many rows needed:" & _
            vbNewLine & Format(nRows, "#,##0") & " > " & Format(maxRows - 1, "#,##0")
        Exit Sub
    End If
    
    ' generate modified data in VBA array
    ReDim res(2 To nRows + 1, 1 To 2)
    res(2, 1) = v(1, 1)
    res(2, 2) = v(1, 2)
    r = 2
    For i = 2 To nV
        r = r + 1
        x = 86400 * (v(i, 1) - v(i - 1, 1))
        If x > 1 Then
            t = WorksheetFunction.Round(86400 * v(i - 1, 1), 0)
            For r = r To r + x - 2
                t = t + 1
                res(r, 1) = t / 86400
            Next    ' r = r+x after for-loop
        End If
        res(r, 1) = v(i, 1)
        res(r, 2) = v(i, 2)
    Next
    
    ' overwrite original data with modified data in worksheet
    Range("a2:a" & r).NumberFormat = Range("a2").NumberFormat
    Range("b2:b" & r).NumberFormat = Range("b2").NumberFormat
    Range("a2:b" & r).Value2 = res
    et = Timer
    MsgBox Format(CDbl(et - st), "0.000000") & " sec"
End Sub
 
Last edited:
Upvote 0
joeu2004,

Thank you for your reply; sorry that mine is coming after so long.

I have tried to replicate the error that I posted about earlier, but have not been able to reproduce it since that day. I had not tried restarting my computer at the time that I wrote that reply as far as I know, so perhaps restarting corrected that error.

The macro that I have been running is exactly as you had written it; I copied and pasted and tested it for functionality and accuracy and it looked great and so I continued using it until I ran into the error that I’ve covered above.

On another computer (running Excel 2013, Windows 7) the macro is buggy past a certain number of lines; we have been working around this by splitting up the CSVs and running the macro on smaller files and then concatenating them back together. I feel like we are pushing the limits of what Excel is capable of, and soon we will definitely be exceeding its capability to handle large data sets; every day is 86,400 seconds and these will be running for months at a time between data collection, so we will end up with some massive data files that exceed the number of rows Excel can handle. Are you familiar with any other programs (or languages) that might be better suited to this?
In the meantime, Excel is an excellent tool for analyzing smaller data sets (no issues under half a million rows) and we will continue using it extensively, so this time series macro we have been working on is still very relevant.

I tested your latest iteration and it works very fast (3.8 seconds for ~600,000 lines) which is great, but it is not handling more than two columns (time stamp + one adjacent.) Our data has 11 columns in addition to the time stamp. I’m including a link to a representative data set so that you have a better idea of what we are working with:

https://www.dropbox.com/s/ep3pyfvfnpiyko9/35ca0 testing macro.zip

Please note that there are gaps in data that are larger than 1 second, and that sometimes the on-board clock (RTC) starts with the wrong time (which is corrected once it connects to the internet.) This means that sometimes we will have some data at the start that reads 1-4-2000 and then there will be a jump to a current time. If the macro is set to insert all of the seconds between 2000 and now it will exceed Excel’s capacity no matter the data set size. Perhaps this macro could be set to work on a set of highlighted data points so that it isn’t trying to fill in those 14 year gaps, or perhaps the macro issues a warning and stops if the difference in adjacent time stamps is greater than 12 hours? I am not exactly sure how to implement this.

Again, thank you so much for your contributions!

P.s. Here is the exact code we are using, just to eliminate that variable.

Code:
Sub idontwanttomissasecondwithyou()    'This workbook has a very crude macro that inserts rows for missing seconds.
    'The date and time must be in column A.
    'The macro ignores the first 3 lines to allow for a header. We could make that two lines if better.
    'The spreadsheet must have the date and time in column A.
    'The macro will add the missing times in column A, and blank values in the other columns.
    'With this macro open, you can open a spreadsheet of data and hit CTRL-SHIFT-I to insert the values.
    'This spreadsheet should not be fully trusted. Check results and report back to Lonny.
    Dim i As Long, x As Long
    'step from end to the third row
    For i = Range("a" & Rows.Count).End(xlUp).Row To 3 Step -1
        x = 86400 * (Cells(i, 1).Value - Cells(i - 1, 1).Value)
        If x > 1 Then
            Rows(i).Resize(x - 1).Insert
            With Cells(i, 1).Resize(x - 1)
                .Formula = "=ROUND(r[-1]c*86400+1,0)/86400"
                .Value = .Value
            End With
        End If
    Next
End Sub
 
Upvote 0

Forum statistics

Threads
1,214,874
Messages
6,122,034
Members
449,061
Latest member
TheRealJoaquin

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top