Loop Referencing Sheet Gets Progressively Slower

SomeKindaGreekGod

I've been wrestling with a very intriguing problem for a long time now, and I'm hoping someone here can offer some insight and, ideally, a solution.

Here's the deal: I have a simple function that pulls in a single cell value, one per row, stores it in a property of a custom type, and then does a little more processing based on the value.

The custom type is GR; the property is CGID.

Code:
With Worksheets("Sheet1")
    GR.CGID = UCase$(Trim$(.Range("CGID")(rw).Text))
End With
It runs fine with no noticeable performance change up to around 10,000 records, but beyond that it gets progressively slower and slower until, at about the 20,000-record mark, it's processing at roughly 1/10 of the speed it started at! (100 recs/sec instead of 1,000+)

For testing purposes, all the cell values are the same, and are a small string value.

I know that hitting the sheet from VBA is time-intensive in itself, so of course it makes sense that two calls to the sheet would take about twice as long as one. But the fact that the processing speed decreases as the loop runs makes me think there's some kind of garbage-collection or memory-allocation problem that I hope can be solved.

So, to all the gurus out there: any ideas? Any help is much appreciated! I'm not necessarily looking for ways to speed up the code; I really just want it to run at the same speed for all 20,000 records.
 

Are you dynamically redimensioning arrays?
 
Upvote 0
I perform one ReDim in another function that's called after this loop of code goes through all the records. So I'm not redimensioning as I'm looping or anything like that.

Here's the basic structure of the loop, which is really most of my HasDuplicateCGIDs function:

Code:
For rw = startRow To lastRow
    'Display updated progress on status bar
    If rw Mod 100 = 0 Then
        Application.StatusBar = "Checking row: " & rw
        DoEvents
    End If

    'Load new CGID into user-defined type GR
    With Worksheets(GRANT_SHEET)
        GR.CGID = UCase$(Trim$(.Range("CGID")(rw).Text))

        'Check if CGID is the same as the one before or after
        If GR.CGID <> "" And (GR.CGID = Trim$(.Range("CGID")(rw - 1)) Or GR.CGID = Trim$(.Range("CGID")(rw + 1))) Then
            HasDuplicateCGIDs = True
            'Color sheet cells appropriately
            .Range("CGID")(rw).Interior.ColorIndex = G_DuplicateColorIndex
            .Cells(rw, 1).Interior.ColorIndex = G_ErrorRowColorIndex
        End If
    End With
Next rw

As I mentioned, I'm not really concerned with making it execute faster per se; I just want to figure out why the record throughput drops the more records have been processed (e.g. records 0-10,000 done in 5 seconds, records 10,001-20,000 taking 30 seconds).
 
Upvote 0
It's not obvious to me why the code is progressively slower.

There are several things I'd code differently for speed, though.
 
Upvote 0
Generally, reading a range into a Variant array and then looping through that is faster.

If you can assume your data is sorted, you don't need to check both the next and the previous rows. I'd also experiment with collecting the duplicate and error cells into ranges and coloring them all at once; I assume (but am not 100% sure) that this is faster.

Also, I'd generally use a Dictionary data structure for this, just for simplicity. A sketch of the array idea follows.
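
Here's a minimal sketch of the Variant-array plus color-at-the-end idea, reusing names from your code (GRANT_SHEET, the CGID named range, G_DuplicateColorIndex, HasDuplicateCGIDs). It assumes CGID is a single-column, multi-cell named range holding plain text values, and it isn't tested against your workbook; the error-row coloring could be collected into a second range the same way:

Code:
Dim vals As Variant, dupRange As Range, rw As Long

With Worksheets(GRANT_SHEET)
    vals = .Range("CGID").Value              'one sheet read instead of one per row

    For rw = 2 To UBound(vals, 1) - 1        'skip first/last so rw-1 and rw+1 stay in bounds
        If Trim$(vals(rw, 1)) <> "" And _
           (Trim$(vals(rw, 1)) = Trim$(vals(rw - 1, 1)) Or _
            Trim$(vals(rw, 1)) = Trim$(vals(rw + 1, 1))) Then
            HasDuplicateCGIDs = True
            'Collect duplicate cells, then color them in one shot afterwards
            If dupRange Is Nothing Then
                Set dupRange = .Range("CGID")(rw)
            Else
                Set dupRange = Union(dupRange, .Range("CGID")(rw))
            End If
        End If
    Next rw

    If Not dupRange Is Nothing Then
        dupRange.Interior.ColorIndex = G_DuplicateColorIndex
    End If
End With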
 
Upvote 0
Ah, sorry, I didn't read your post closely enough!

I reproduced what you're finding and ended up with something like this (where each time is the time for the previous 1,000 rows):

Code:
Row: 1000 Time: 0.257084280252457
Row: 2000 Time: 0.361119290231727
Row: 3000 Time: 0.504711213288829
Row: 4000 Time: 0.6463229265064
Row: 5000 Time: 0.808559607481584
Row: 6000 Time: 0.957949962932616
Row: 7000 Time: 1.105074737221
Row: 8000 Time: 1.27128727256786
Row: 9000 Time: 1.42888027674053
Row: 10000 Time: 1.58616968069691
Row: 11000 Time: 1.74296677368693
Row: 12000 Time: 1.88724139519036
Row: 13000 Time: 2.03809548425488
Row: 14000 Time: 2.23599612526596
Row: 15000 Time: 2.37832385115325
Row: 16000 Time: 2.57113851059694
Row: 17000 Time: 2.72125004709233
Row: 18000 Time: 2.91509221144952
Row: 19000 Time: 3.07861718453933
Row: 20000 Time: 3.22325979976449

which is what you described. I'm posting this because I'm interested as well and haven't yet found a reason.
 
Upvote 0
Okay, so for whatever reason it looks like the .Text property gets slower with increasing row number, especially when it's combined with UCase$ and Trim$.

Try using CStr(cell reference here) instead.

I'm not sure why this is, but it seems to be the problem.
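
To make that concrete, here's the line from the loop above rewritten that way (just a sketch; note that CStr reads the underlying .Value rather than the displayed text, so any custom formatting is lost):

Code:
'Before: .Text returns the formatted, displayed string (slows down at high row numbers)
GR.CGID = UCase$(Trim$(.Range("CGID")(rw).Text))

'After: read the underlying value and convert it to a String
GR.CGID = UCase$(Trim$(CStr(.Range("CGID")(rw).Value)))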

This leads to a much faster and roughly constant time. My test result was:

Code:
Row: 1000 Time: 0.049277072888799
Row: 2000 Time: 4.63252313202247E-02
Row: 3000 Time: 0.043224697583355
Row: 4000 Time: 4.44812119239941E-02
Row: 5000 Time: 4.52326342929155E-02
Row: 6000 Time: 4.64406091487035E-02
Row: 7000 Time: 4.41752373008057E-02
Row: 8000 Time: 5.19126034341753E-02
Row: 9000 Time: 4.73301773890853E-02
Row: 10000 Time: 4.84032885869965E-02
Row: 11000 Time: 5.17947112675756E-02
Row: 12000 Time: 4.78661394445226E-02
Row: 13000 Time: 4.66817710548639E-02
Row: 14000 Time: 0.044233764288947
Row: 15000 Time: 4.32288181036711E-02
Row: 16000 Time: 4.54409010708332E-02
Row: 17000 Time: 4.40782278310508E-02
Row: 18000 Time: 4.34815038461238E-02
Row: 19000 Time: 4.32459292933345E-02
Row: 20000 Time: 0.045845910673961
Row: 21000 Time: 4.79116060305387E-02
Row: 22000 Time: 4.98144316952676E-02
Row: 23000 Time: 4.28315609460697E-02
Row: 24000 Time: 4.79032949078828E-02
Row: 25000 Time: 4.52167803887278E-02
Row: 26000 Time: 4.59004566073418E-02
Row: 27000 Time: 4.45919802878052E-02
Row: 28000 Time: 4.35316499788314E-02
Row: 29000 Time: 4.59932757075876E-02
Row: 30000 Time: 4.42426340887323E-02
 
Upvote 0
Additionally (and sorry for posting so much; I feel it's probably poor etiquette or something like that):

It's of some interest to note that for the line of code:
Code:
S = UCase$(Trim$(ActiveSheet.Cells(V(j), 1).Text))

where S is a string and V(j) is just an array of row numbers, the following times were observed for 100 loops:
Code:
Time/100 loops at row 1: 8.78652173560113E-03
Time/100 loops at row 10: 7.29869294445962E-03
Time/100 loops at row 100: 8.09495337307453E-03
Time/100 loops at row 1000: 1.71527387574315E-02
Time/100 loops at row 10000: 9.10504687344655E-02
Time/100 loops at row 20000: 0.137205985607579
Time/100 loops at row 30000: 0.169853640603833
Time/100 loops at row 40000: 0.170526281930506
Time/100 loops at row 50000: 0.167729837121442
Time/100 loops at row 60000: 0.16629299253691

which is all very exciting, as there seems to be a dramatic increase and then a stabilization. (This is with column 1 empty.) The UCase$ and Trim$ don't impact the time significantly either, so why does anyone use the .Text property?
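
A sketch of the kind of loop that produces numbers like these, using VBA's Timer function (seconds since midnight, with sub-second resolution); the row number and loop count here are illustrative:

Code:
Dim S As String, t As Single, i As Long, testRow As Long

testRow = 10000                  'row being probed (illustrative)
t = Timer
For i = 1 To 100
    S = UCase$(Trim$(ActiveSheet.Cells(testRow, 1).Text))
Next i
Debug.Print "Time/100 loops at row " & testRow & ": " & (Timer - t)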
 
Upvote 0
Thanks for all the great trials Chirp! Glad you were able to replicate.

First off, I've unfortunately found no way around using the .Text property, because I need it in case the user entered something with custom formatting (say, a format that pads with zeroes). Since the default is .Value, that would sometimes result in a different value being stored, and my users are rarely savvy enough with Excel to know the difference between Text and Value.
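
To illustrate the kind of case I mean (with an assumed custom format of "00000"):

Code:
'A cell holding the number 123 with custom number format "00000":
With Worksheets("Sheet1").Range("A1")
    .NumberFormat = "00000"
    .Value = 123
    Debug.Print .Value    'prints 123   -- the underlying value
    Debug.Print .Text     'prints 00123 -- what the user actually sees
End With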

That said, I don't need .Text for this loop here, but I do need it for a later one, so thanks for at least giving me the idea to speed up this loop.


So a couple questions:
First, how do you get such precise time intervals? Just subtract an earlier saved time from Now()?

Second, since you'd have to increment j anyway, why use an array of row numbers rather than just Cells(j, i)?

And finally, what did you mean earlier about not having to check the row before and the row after if the data is sorted? I did sort it, and that's the only reason it makes sense to check the rows before and after to find and highlight duplicates in a column.

Again, thanks for all your time! This is great stuff, I hope we can find a solution!
 
Upvote 0
If anyone has any ideas for how I can either avoid using .Text or keep it from hogging memory (which I assume is the issue), I'd love to hear them! The requirement is just that for some cells I need what the cell looks like rather than its underlying value.
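
For instance, would rebuilding the display string from the value and the cell's NumberFormat get me close? Just a sketch of the idea; I know VBA's Format$ won't match Excel's display for every format (e.g. "General" or formats with color sections):

Code:
'Sketch: approximate .Text by applying the cell's number format to its value
Dim c As Range
Set c = Worksheets(GRANT_SHEET).Range("CGID")(rw)
GR.CGID = UCase$(Trim$(Format$(c.Value, c.NumberFormat)))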
 
Upvote 0
