SUMIFS Coding challenge has been issued!

johnnyL

Well-known Member
Joined
Nov 7, 2011
Messages
4,546
Office Version
  1. 2007
Platform
  1. Windows
I was going through some excel files today to do some harddrive clean up and I stumbled across an older excel file from last year that I made in response to a question from a member here.
The member asked if I knew of a faster way to have excel calculate 100k cells of SUMIFS formulas.

The formula in question is: =SUMIFS($D$2:$D$100001,$A$2:$A$100001,A2,$B$2:$B$100001,B2,$C$2:$C$100001,C2)
Column Data to be used:
Column A (Criteria 1) contains random single letters in the range A2:A100,001
Column B (Criteria 2) contains a string that is the Row() & Column() repeated after row 21 in the range B2:B100,001
Column C (Criteria 3) contains randomly generated numbers from 1 to 100 in the range C2:C100,001
Column D (Range to sum) contains randomly generated numbers from 1 to 10 in the range D2:D100,001


If you were to fill-down the formula from E2 to E100,001 it takes excel several minutes to calculate the results. My old laptop took over 12 1/2 minutes to complete it with the following code when I applied it to column F:
VBA Code:
    Application.ScreenUpdating = False
    Range("F2:F" & Range("A" & Rows.Count).End(xlUp).Row).Formula = "=SUMIFS(R2C4:R100001C4,R2C1:R100001C1,RC[-5],R2C2:R100001C2,RC[-4],R2C3:R100001C3,RC[-3])"
    Application.ScreenUpdating = True

The member that mentioned the challenge to me said that a solution had been offered by Tim Williams over at StackOverflow that spit out the results in under 1 second:
VBA Code:
Option Explicit
Sub SumCountAvg()
Dim arr, ws, dict, arrOut, arrValues, v, tmp, t, keyCols
Dim n As Long, i As Long, valueCol As Long, destCol As Long
Dim str As String, sep As String, rng As Range

 keyCols = Array(1, 2, 3) [I]'column(s) for the dictionary key[/I]
 valueCol = 4 [I]'column with values (for sum)[/I]
 destCol = 5 [I]'destination for calculated values[/I]
 t = Timer

 Set ws = ActiveSheet
 Set rng = ws.Range("A1").CurrentRegion
 n = rng.Rows.Count - 1
 Set rng = rng.Offset(1, 0).Resize(n) [I]'exclude headers[/I]

 [I]'build the formula to create the dictionary key[/I]
 For i = 0 To UBound(keyCols)
 str = str & sep & rng.Columns(keyCols(i)).Address
 sep = "&""|""&"
 Next i

 arr = ws.Evaluate(str) [I]'get an array of keys by evaluating the formula[/I]
 arrValues = rng.Columns(valueCol).Value [I]'values to be summed[/I]
 ReDim arrOut(1 To n, 1 To 1) [I]'output array[/I]

 Set dict = CreateObject("scripting.dictionary")
 [I]'first loop over the array counts the keys[/I]
 For i = 1 To n
 v = arr(i, 1)
 If Not dict.exists(v) Then dict(v) = Array(0, 0) [I]'count, sum[/I]
 tmp = dict(v) [I]'can't modify an array stored in a dictionary - pull it out first[/I]
 tmp(0) = tmp(0) + 1 [I]'increment count[/I]
 tmp(1) = tmp(1) + arrValues(i, 1) [I]'increment sum[/I]
 dict(v) = tmp [I]'return the modified array[/I]
 Next i

 [I]'second loop populates the output array from the dictionary[/I]
 For i = 1 To n
 arrOut(i, 1) = dict(arr(i, 1))(1) [I] [/I] [I]'sumifs[/I]
 'arrOut(i, 1) = dict(arr(i, 1))(0) [I] [/I] [I] [/I] [I]'cou[/I]n[I]tifs[/I]
 'arrOut(i, 1) = dict(arr(i, 1))(1) / dict(arr(i, 1))(0) [I]'averageifs[/I]
 Next i

 [I]'populate the results[/I]
 rng.Columns(destCol).Value = arrOut
 MsgBox "Checked " & n & " rows in " & Timer - t & " seconds"[/B]
 [B]End Sub

I was just wondering if anyone here wanted to take a stab at the challenge to see if they could come up with any other solutions that may be faster/as fast/almost as fast.

I will offer up some code to set up a test file that you can play with if you want to take a stab at it.
VBA Code:
Sub GenerateRandomInfoColumnsA_Thru_D()
'
    Dim HeaderArray As Variant
'
    Application.ScreenUpdating = False                                                              ' Turn ScreenUpdating off
    Application.Calculation = xlManual                                                              ' Turn Calculation mode to manual
'
    HeaderArray = Array("Criteria 1", "Criteria 2", "Criteria 3", "Value Range To Sum", "Results")  ' Create Array of Headers
'
    Sheets("Sheet1").UsedRange.Clear                                                                ' Clear the sheet to be used
'
    Sheets("Sheet1").Range("A1").Resize(, UBound(HeaderArray) + 1) = HeaderArray                    ' Display Header to Sheet
'
'   ColumnARandomLetters
    With Sheets("Sheet1").Range("A2:A100001")                                                       ' With Range("A2:A100001") ...
        .Formula = "=CHAR(RANDBETWEEN(97,122))"                                                     '   Generate lowercase letters a - z
        .Value = .Value                                                                             '   Remove formulas, leave values
    End With
'
'   Column_B_RowNumberColumnNumber
    With Sheets("Sheet1").Range("B2:B100001")                                                       ' With Range("B2:B100001") ...
        .Formula = "=Row() & ""2"""                                                                 '   Save Row# & '2' to cell
        .Value = .Value                                                                             '   Remove formulas, leave values
    End With
'
'   ColumnCRandomNumbers1to100
    With Sheets("Sheet1").Range("C2:C100001")                                                       ' With Range("C2:C100001") ...
        .Formula = "=RANDBETWEEN(1,100)"                                                            '   Generate random whole number between 1 - 100
        .Value = .Value                                                                             '   Remove formulas, leave values
    End With
'
'   ColumnDRandomNumbers1to10
    With Sheets("Sheet1").Range("D2:D100001")                                                       ' With Range("D2:D100001") ...
        .Formula = "=RANDBETWEEN(1,10)"                                                             '   Generate random whole number between 1 - 10
        .Value = .Value                                                                             '   Remove formulas, leave values
    End With
'
'   Set Columns B Thru D to text
    Sheets("Sheet1").Columns("B:D").NumberFormat = "@"                                              '   Set Columns B:D to text
'
    Sheets("Sheet1").Columns.AutoFit
'
    Application.Calculation = xlAutomatic                                                           ' Turn Calculation back on
    Application.ScreenUpdating = True                                                               ' Turn ScreenUpdating back on
End Sub
 
Edited version

VBA Code:
Option Explicit
Dim i As Long, ar2, ar3, ar4, ar5, ar6

Sub A_Test_Speed_Sumifs_2()
    Application.ScreenUpdating = False
    Dim t As Double: t = Timer
    
    Dim ar1(1 To 100000, 1 To 1)
    For i = 1 To 100000
        ar1(i, 1) = i
    Next i
    
    ar2 = Range("A2:C100001")
    
    ReDim ar3(1 To 100000, 1 To 1)
    For i = 1 To 100000
        ar3(i, 1) = ar2(i, 1) & " | " & ar2(i, 2) & " | " & ar2(i, 3)
    Next i
    
    ar4 = Range("D2:D100001")
    
    ReDim ar5(1 To 100000, 1 To 3)
    For i = 1 To 100000
        ar5(i, 1) = ar1(i, 1)
        ar5(i, 2) = ar3(i, 1)
        ar5(i, 3) = ar4(i, 1)
    Next i
    
    With Range("F2:H100001")
        .Value = ar5
        .Sort Key1:=Range("G2"), order1:=xlAscending
    End With
    
    ReDim ar6(1 To 100001, 1 To 4)
    ar6 = Range("F1:I100001")
    
    For i = 2 To 100001
        If ar6(i, 2) = ar6(i - 1, 2) Then ar6(i, 4) = ar6(i, 3) + ar6(i - 1, 4) Else ar6(i, 4) = ar6(i, 3)
    Next i
    
    For i = 100000 To 1 Step -1
        If ar6(i, 2) <> ar6(i + 1, 2) Then ar6(i, 4) = ar6(i, 4) Else ar6(i, 4) = ar6(i + 1, 4)
    Next i
    
    With Range("F2:I100002")
        .Value = ar6
        .Sort Key1:=Range("F2"), order1:=xlAscending
    End With
    Range("I2:I100001").Copy Range("E2")
    
    Range("F:I").EntireColumn.ClearContents
    
    Application.ScreenUpdating = True
    MsgBox Timer - t & " secs."
End Sub
 
Upvote 0

Excel Facts

Which came first: VisiCalc or Lotus 1-2-3?
Dan Bricklin and Bob Frankston debuted VisiCalc in 1979 as a Visible Calculator. Lotus 1-2-3 debuted in the early 1980's, from Mitch Kapor.
OK @johnnyL , it doesn’t look like much interest in this

I think I see one potential reason why. The code posted for Tim had excess characters that make it not so friendly to copy/paste to excel.

My apologies, Tim's code example should have been posted as the following:
VBA Code:
Sub SumCountAvg()
Dim arr, ws, dict, arrOut, arrValues, v, tmp, t, keyCols
Dim n As Long, i As Long, valueCol As Long, destCol As Long
Dim str As String, sep As String, rng As Range
'
 keyCols = Array(1, 2, 3) 'column(s) for the dictionary key
 valueCol = 4 'column with values (for sum)
 destCol = 5 'destination for calculated values
 t = Timer

 Set ws = ActiveSheet
 Set rng = ws.Range("A1").CurrentRegion
 n = rng.Rows.Count - 1
 Set rng = rng.Offset(1, 0).Resize(n) 'exclude headers

 'build the formula to create the dictionary key
 For i = 0 To UBound(keyCols)
 str = str & sep & rng.Columns(keyCols(i)).Address
 sep = "&""|""&"
 Next i

 arr = ws.Evaluate(str) 'get an array of keys by evaluating the formula
 arrValues = rng.Columns(valueCol).Value 'values to be summed
 ReDim arrOut(1 To n, 1 To 1) 'output array

 Set dict = CreateObject("scripting.dictionary")
 'first loop over the array counts the keys
 For i = 1 To n
 v = arr(i, 1)
 If Not dict.exists(v) Then dict(v) = Array(0, 0) 'count, sum
 tmp = dict(v) 'can't modify an array stored in a dictionary - pull it out first
 tmp(0) = tmp(0) + 1 'increment count
 tmp(1) = tmp(1) + arrValues(i, 1) 'increment sum
 dict(v) = tmp 'return the modified array
 Next i

 'second loop populates the output array from the dictionary
 For i = 1 To n
 arrOut(i, 1) = dict(arr(i, 1))(1) 'sumifs
 'arrOut(i, 1) = dict(arr(i, 1))(0) 'countifs
 'arrOut(i, 1) = dict(arr(i, 1))(1) / dict(arr(i, 1))(0) 'averageifs
 Next i

'populate the results
 rng.Columns(destCol).Value = arrOut
 MsgBox "Checked " & n & " rows in " & Timer - t & " seconds"
End Sub
 
Upvote 0
Thanks for looking at it @bebo021999.
Did you check the timing of Tim's code before you started your attempt? How close was the timing of the code you came up with?
it took 0.9s - 1.1s for 100k rows
and 12s for 1 milion rows
I believe that's fastest solution.
 
Upvote 0
I think this is the main part that make hid code different and fast
VBA Code:
For i = 0 To UBound(keyCols)
 str = str & sep & rng.Columns(keyCols(i)).Address
 sep = "&""|""&"
 Next i
 arr = ws.Evaluate(str)
to combine 3 columns (i=1 to 3), without loop, to read address, then evaluate to get value
 
Upvote 0
I thought I’d put in my 2 cents’ worth. It was pointless replicating Tim’s method of double looping through a dictionary, so I thought I’d have a go at a non-dictionary method.

And yes, I know it isn’t as fast as Tim’s method – I did it purely for fun. I averaged Tim’s method – using the data generated by your code – at ~1.2 seconds. Using the same data, my non-dictionary method I averaged at ~2.2 seconds. Not great, but considerably better than the 12 ½ minutes you quoted in post #1.

I took a look at your code and tried to condense some of it. See if it works faster for you. I doubt it will get down to the current record time, but I did delete a few loops and arrays so hopefully it will work a bit faster for you.

VBA Code:
Sub A_Test_Speed_Sumifs_2_Mod2()
'
    Dim t As Double: t = Timer
'
    Dim ArrayRow                            As Long
    Dim DataArray                           As Variant
    Dim FinalDataArray                      As Variant
    Dim NewDataArray(1 To 100000, 1 To 3)   As Variant
'
    Application.ScreenUpdating = False                                                  ' Turn ScreenUpdating off
'
    DataArray = Range("A2:D100001")                                                     ' Save Data from A2:D100001 into DataArray
'
    For ArrayRow = 1 To 100000                                                          ' Loop through rows
        NewDataArray(ArrayRow, 1) = ArrayRow                                            '   Row #
        NewDataArray(ArrayRow, 2) = DataArray(ArrayRow, 1) & " | " & _
                DataArray(ArrayRow, 2) & " | " & DataArray(ArrayRow, 3)                 '   Criteria columns Concat string
        NewDataArray(ArrayRow, 3) = DataArray(ArrayRow, 4)                              '   Sum Range values
    Next                                                                                ' Loop back
'
    With Range("F2:H100001")
        .Value = NewDataArray                                                           '   Write NewDataArray to sheet F2:H100001
        .Sort Key1:=Range("G2"), order1:=xlAscending                                    '   Sort range by Criteria Concat strings
    End With
'
    FinalDataArray = Range("F1:I100001")                                                ' Save Data from F1:I100001 into FinalDataArray
'
    For ArrayRow = 2 To 100001                                                          ' Loop through rows of FinalDataArray
        If FinalDataArray(ArrayRow, 2) = FinalDataArray(ArrayRow - 1, 2) Then           '   If duplicate Criteria Concat strings found then ...
            FinalDataArray(ArrayRow, 4) = FinalDataArray(ArrayRow, 3) + _
                    FinalDataArray(ArrayRow - 1, 4)                                     '       Add sum Ranges together & save to FinalDataArray(ArrayRow, 4)
        Else                                                                            '   Else ...
            FinalDataArray(ArrayRow, 4) = FinalDataArray(ArrayRow, 3)                   '       Save single sum value to FinalDataArray(ArrayRow, 4)
        End If
    Next                                                                                ' Loop back
'
    For ArrayRow = 100000 To 1 Step -1                                                  ' Loop backwards through rows of FinalDataArray
        If FinalDataArray(ArrayRow, 2) = FinalDataArray(ArrayRow + 1, 2) Then           '   If Criteria Concat string = next Criteria Concat string then ...
            FinalDataArray(ArrayRow, 4) = FinalDataArray(ArrayRow + 1, 4)               '       Save next sum count to this sum count
        End If
    Next                                                                                ' Loop back
'
    With Range("F2:I100001")
        .Value = FinalDataArray                                                         '   Write FinalDataArray to sheet F:I
        .Sort Key1:=Range("F2"), order1:=xlAscending                                    '   Sort Row #s
    End With
'
    Range("I2:I100001").Copy Range("E2")                                                ' Copy sum range values to Column E of sheet ... 0.07734375
'
    Columns("F:I").Delete                                                               ' Delete helper columns
'
    Application.ScreenUpdating = True                                                   ' Turn ScreenUpdating back on
'
    Debug.Print Timer - t & " secs."                                                    ' Display elapsed time
    MsgBox Timer - t & " secs."                                                         ' Display elapsed time
End Sub
 
Upvote 0
I took a look at your code and tried to condense some of it. See if it works faster for you. I doubt it will get down to the current record time, but I did delete a few loops and arrays so hopefully it will work a bit faster for you.

VBA Code:
Sub A_Test_Speed_Sumifs_2_Mod2()
'
    Dim t As Double: t = Timer
'
    Dim ArrayRow                            As Long
    Dim DataArray                           As Variant
    Dim FinalDataArray                      As Variant
    Dim NewDataArray(1 To 100000, 1 To 3)   As Variant
'
    Application.ScreenUpdating = False                                                  ' Turn ScreenUpdating off
'
    DataArray = Range("A2:D100001")                                                     ' Save Data from A2:D100001 into DataArray
'
    For ArrayRow = 1 To 100000                                                          ' Loop through rows
        NewDataArray(ArrayRow, 1) = ArrayRow                                            '   Row #
        NewDataArray(ArrayRow, 2) = DataArray(ArrayRow, 1) & " | " & _
                DataArray(ArrayRow, 2) & " | " & DataArray(ArrayRow, 3)                 '   Criteria columns Concat string
        NewDataArray(ArrayRow, 3) = DataArray(ArrayRow, 4)                              '   Sum Range values
    Next                                                                                ' Loop back
'
    With Range("F2:H100001")
        .Value = NewDataArray                                                           '   Write NewDataArray to sheet F2:H100001
        .Sort Key1:=Range("G2"), order1:=xlAscending                                    '   Sort range by Criteria Concat strings
    End With
'
    FinalDataArray = Range("F1:I100001")                                                ' Save Data from F1:I100001 into FinalDataArray
'
    For ArrayRow = 2 To 100001                                                          ' Loop through rows of FinalDataArray
        If FinalDataArray(ArrayRow, 2) = FinalDataArray(ArrayRow - 1, 2) Then           '   If duplicate Criteria Concat strings found then ...
            FinalDataArray(ArrayRow, 4) = FinalDataArray(ArrayRow, 3) + _
                    FinalDataArray(ArrayRow - 1, 4)                                     '       Add sum Ranges together & save to FinalDataArray(ArrayRow, 4)
        Else                                                                            '   Else ...
            FinalDataArray(ArrayRow, 4) = FinalDataArray(ArrayRow, 3)                   '       Save single sum value to FinalDataArray(ArrayRow, 4)
        End If
    Next                                                                                ' Loop back
'
    For ArrayRow = 100000 To 1 Step -1                                                  ' Loop backwards through rows of FinalDataArray
        If FinalDataArray(ArrayRow, 2) = FinalDataArray(ArrayRow + 1, 2) Then           '   If Criteria Concat string = next Criteria Concat string then ...
            FinalDataArray(ArrayRow, 4) = FinalDataArray(ArrayRow + 1, 4)               '       Save next sum count to this sum count
        End If
    Next                                                                                ' Loop back
'
    With Range("F2:I100001")
        .Value = FinalDataArray                                                         '   Write FinalDataArray to sheet F:I
        .Sort Key1:=Range("F2"), order1:=xlAscending                                    '   Sort Row #s
    End With
'
    Range("I2:I100001").Copy Range("E2")                                                ' Copy sum range values to Column E of sheet ... 0.07734375
'
    Columns("F:I").Delete                                                               ' Delete helper columns
'
    Application.ScreenUpdating = True                                                   ' Turn ScreenUpdating back on
'
    Debug.Print Timer - t & " secs."                                                    ' Display elapsed time
    MsgBox Timer - t & " secs."                                                         ' Display elapsed time
End Sub
Over multiple runs with the same data, it averaged on my laptop at ~2.1 seconds v ~2.2 seconds - so a saving of ~0.1 seconds. Tim's version still reigns supreme, as I don't see anyone else putting their hat in the ring ;)
Thanks Johnny
 
Upvote 0

Forum statistics

Threads
1,214,935
Messages
6,122,337
Members
449,078
Latest member
skydd

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top