Reduce processing time for splitting huge file into 25 smaller files

Factotum

Board Regular
Joined
May 14, 2015
Messages
118
I've got a massive file with as many as 70,000 rows of data and four pivot tables. The initial file will probably be around 35-40 MB. I need to split this report into 25 different files based on a department identifier. Given the size of the file, there will be considerable time waiting for Excel to keep up.

I've done several projects like this on a smaller scale, so my question is: what is the fastest method to use to keep processing time to a minimum? The options I have used in the past include:


  1. Filter to one department, copy visible rows, paste to an existing template, save and close template, repeat
  2. Open original file, Save As, filter out department to keep, delete all remaining rows, clear filter, save and close, repeat

Both options have their drawbacks - it will take forever to copy and paste that many rows, but it also takes a long time to reopen such a big file 25 times.

Does anyone have any suggestions for a quicker method for splitting up a report like this? Thanks for any ideas!
 
@offthelip - I'm more blown away by this than I was the first time I recorded a macro and saw a glimpse of the potential behind VBA. :pray: Thank you! This is amazing! Unfortunately, I'm stuck at one point. I actually need to loop through 25 different departments. I've got the loop figured out...sort of.

The first time through the loop works perfectly. But the second time, it leaves several thousand blank rows (where the first department rows were) and then puts in the second department underneath the blank rows. The third time leaves blank rows where the first two departments were and then puts the third department under the blank rows, and so on to the 25th loop. Here is the code I'm using - any suggestions on how to get each loop to transfer the data starting in cell A2 of the template?


Code:
Sub Performance_Measures()

'Define last row and first column of data as a reference point
Dim nRows As Integer
nRows = Range("A1048576").End(xlUp).Row

'Define variables needed for advanced filter
Dim rngFilter As Range, rngUnique As Range
Dim c As Range


'Set the Dept column as the one to be filtered
Set rngFilter = ActiveSheet.Range("AY1:AY" & nRows)


'Filter the Dept column to show only unique values
rngFilter.AdvancedFilter Action:=xlFilterInPlace, Unique:=True

'Set a variable to the Unique values
Set rngUnique = Range("AY2:AY" & nRows).SpecialCells(xlCellTypeVisible)


Dim outarr As Variant
With Worksheets("ProjectData")
LastRow = .Cells(Rows.Count, "A").End(xlUp).Row
lastcol = .Cells(1, .Columns.Count).End(xlToLeft).Column
inarr = Range(.Cells(1, 1), .Cells(LastRow, lastcol))
End With
indi = 1


For Each UniqueArea In rngUnique
ReDim outarr(1 To LastRow, 1 To lastcol)
    For i = 1 To LastRow
     If inarr(i, 51) = UniqueArea Then  ' column 51 is column AY  ' copy row to outarr
       For J = 1 To lastcol
        outarr(indi, J) = inarr(i, J)
       Next J
       indi = indi + 1
     End If
    Next i


    Workbooks.Open Filename:="C:\Users\TEMPLATE.xlsx"


    With Workbooks("TEMPLATE.xlsx").Worksheets("ProjectData")
    Range(.Cells(2, 1), .Cells(indi, lastcol)) = outarr
    End With
    
Set outarr = Nothing


'Save As and close

Next UniqueArea


End Sub
 
Upvote 0

Excel Facts

Round to nearest half hour?
Use =MROUND(A2,"0:30") to round to nearest half hour. Use =CEILING(A2,"0:30") to round to next half hour.
Fatso

Don't filter in place to get the unique items from column AY, filter and copy to another location, even to a blank temporary sheet.

Once you have the unique items put them into an array and then loop through that array to extract the data for each department to it's own array and place it in the appropriate sheet.
 
Upvote 0
Another option, if you're interested
Code:
Sub Splitdata()

   Dim Ws As Worksheet
   Dim Dic As Object
   Dim Cl As Range
   
   Set Dic = CreateObject("scripting.dictionary")
   Dic.comparemode = vbTextCompare
   Set Ws = ActiveSheet
Application.ScreenUpdating = False
   If Ws.AutoFilterMode Then Ws.AutoFilterMode = False
   For Each Cl In Ws.Range("A2", Ws.Range("A" & Rows.Count).End(xlUp))
      If Not Dic.exists(Cl.Value) Then
         Workbooks.Add (1)
         Dic.Add Cl.Value, Nothing
         Ws.Range("a1").AutoFilter 1, Cl.Value
         Ws.AutoFilter.Range.SpecialCells(xlVisible).Copy Range("A1")
         ActiveWorkbook.SaveAs "C:\MrExcel\Sales1\" & Cl.Value & ".xlsm", 52
         ActiveWorkbook.Close False
      End If
   Next Cl
   Ws.AutoFilterMode = False
End Sub
 
Upvote 0
Thanks for the suggestion Fluff. I was researching this late last night and couldn't decide whether to use an Array, Collection, or Dictionary - I'm still not sure the difference, but I'll get there eventually. I'll give it a try in the next day or two and report back.

Norie - your suggestions are also very much appreciated. I'll give it a go on yours as well, but I think the problem is with the Array and not with how I store the unique values to loop through. My code does succeed in looping through all the unique values, it's just a problem of where it dumps the rows into the new workbook.

Thanks again everyone! This forum has saved my skin too many times to count!
 
Upvote 0
Fatso

You are populating inarr when the data is filtered.

The following code will separate the data from the sheet 'ProjectData' into sheets for each unique value in column AY.

Obviously you'll want to change things to use your template but that shouldn't be too difficult.
Code:
Sub Performance_Measures()
Dim wsData As Worksheet
Dim wsNew As Worksheet
Dim wsTemp As Worksheet
Dim nRows As Integer
Dim rngFilter As Range
Dim arrUnique()
Dim inarr()
Dim headerarr()
Dim outarr()
Dim UniqueArea As Variant
Dim cnt As Long
Dim I As Long
Dim J As Long
Dim lastrow As Long
Dim lastcol As Long

    Set wsData = Sheets("ProjectData")

    With wsData
        lastrow = .Cells(Rows.Count, "A").End(xlUp).Row
        lastcol = .Cells(1, .Columns.Count).End(xlToLeft).Column
        inarr = .Range(.Cells(2, 1), .Cells(lastrow, lastcol)).Value
        headerarr = .Range(.Cells(1, 1), .Cells(1, lastcol)).Value
        Set rngFilter = .Range("AY1:AY" & lastrow)
    End With

    Set wsTemp = Sheets.Add

    With wsTemp
        rngFilter.AdvancedFilter Action:=xlFilterCopy, CopyToRange:=.Range("A1"), Unique:=True
        arrUnique = .Range("A2", .Range("A" & Rows.Count).End(xlUp)).Value
        Application.DisplayAlerts = False
        .Delete
        Application.DisplayAlerts = True
    End With

    For Each UniqueArea In arrUnique
        ReDim outarr(1 To lastrow, 1 To lastcol)
        For I = 1 To UBound(inarr, 1)
            If inarr(I, 51) = UniqueArea Then  ' column 51 is column AY  ' copy row to outarr
                cnt = cnt + 1
                For J = 1 To UBound(inarr, 2)
                    outarr(cnt, J) = inarr(I, J)
                Next J

            End If
        Next I

        outarr = Application.Transpose(outarr)
        ReDim Preserve outarr(1 To UBound(outarr, 1), 1 To cnt)

        If cnt > 0 Then
            Set wsNew = wsData.Parent.Sheets.Add(After:=wsData.Parent.Sheets(wsData.Parent.Sheets.Count))
            wsNew.Name = UniqueArea
            wsNew.Range("A1").Resize(, UBound(headerarr, 2)).Value = headerarr
            wsNew.Range("A2").Resize(UBound(outarr, 2), UBound(outarr, 1)).Value = Application.Transpose(outarr)
        End If

        Erase outarr

        cnt = 0

    Next UniqueArea

End Sub
 
Upvote 0

The first time through the loop works perfectly. But the second time, it leaves several thousand blank rows (where the first department rows were) and then puts in the second department underneath the blank rows. The third time leaves blank rows where the first two departments were and then puts the third department under the blank rows, and so on to the 25th loop. Here is the code I'm using - any suggestions on how to get each loop to transfer the data starting in cell A2 of the template?
all you need to do is reset the output array index at the end of every loop
put this line just before the Next Uniquearea
Code:
Indi=2
 
Upvote 0
@pvr928 - thank you so much for getting me started down this train of thought. I have much to learn, but I'm excited about it. To the rest of you, thank you so much for helping me with code that I still don't fully understand. I learn best by seeing something in action and then reverse engineering it - now I have three solid examples of working code to learn from and apply to future projects.

I was able to modify all three suggested macros to fit my project and they all work nicely. In case any of you are interested, here is how long each macro took to run, all other things being equal:

  • @offthelip - 0:50.15 (and thanks for showing me the timer - that's a neat little trick!)
  • @Fluff - 2:33.72
  • @Norie - 0:57.85

Regardless which one I end up using, this will be a massive improvement! You've just saved my team at least 10 hours of extremely monotonous work per month. Thank you again!
 
Upvote 0
I am delighted to see that I won the race!! you can't beat variant arrays for speed!!
Glad to be of help
 
Upvote 0

Forum statistics

Threads
1,223,098
Messages
6,170,106
Members
452,302
Latest member
TaMere

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top