SUMIFS Coding challenge has been issued!

johnnyL

Well-known Member
Joined
Nov 7, 2011
Messages
4,546
Office Version
  1. 2007
Platform
  1. Windows
I was going through some excel files today to do some harddrive clean up and I stumbled across an older excel file from last year that I made in response to a question from a member here.
The member asked if I knew of a faster way to have excel calculate 100k cells of SUMIFS formulas.

The formula in question is: =SUMIFS($D$2:$D$100001,$A$2:$A$100001,A2,$B$2:$B$100001,B2,$C$2:$C$100001,C2)
Column Data to be used:
Column A (Criteria 1) contains random single letters in the range A2:A100,001
Column B (Criteria 2) contains a string that is the Row() & Column() repeated after row 21 in the range B2:B100,001
Column C (Criteria 3) contains randomly generated numbers from 1 to 100 in the range C2:C100,001
Column D (Range to sum) contains randomly generated numbers from 1 to 10 in the range D2:D100,001


If you were to fill-down the formula from E2 to E100,001 it takes excel several minutes to calculate the results. My old laptop took over 12 1/2 minutes to complete it with the following code when I applied it to column F:
VBA Code:
    Application.ScreenUpdating = False
    Range("F2:F" & Range("A" & Rows.Count).End(xlUp).Row).Formula = "=SUMIFS(R2C4:R100001C4,R2C1:R100001C1,RC[-5],R2C2:R100001C2,RC[-4],R2C3:R100001C3,RC[-3])"
    Application.ScreenUpdating = True

The member that mentioned the challenge to me said that a solution had been offered by Tim Williams over at StackOverflow that spit out the results in under 1 second:
VBA Code:
Option Explicit
Sub SumCountAvg()
Dim arr, ws, dict, arrOut, arrValues, v, tmp, t, keyCols
Dim n As Long, i As Long, valueCol As Long, destCol As Long
Dim str As String, sep As String, rng As Range

 keyCols = Array(1, 2, 3) [I]'column(s) for the dictionary key[/I]
 valueCol = 4 [I]'column with values (for sum)[/I]
 destCol = 5 [I]'destination for calculated values[/I]
 t = Timer

 Set ws = ActiveSheet
 Set rng = ws.Range("A1").CurrentRegion
 n = rng.Rows.Count - 1
 Set rng = rng.Offset(1, 0).Resize(n) [I]'exclude headers[/I]

 [I]'build the formula to create the dictionary key[/I]
 For i = 0 To UBound(keyCols)
 str = str & sep & rng.Columns(keyCols(i)).Address
 sep = "&""|""&"
 Next i

 arr = ws.Evaluate(str) [I]'get an array of keys by evaluating the formula[/I]
 arrValues = rng.Columns(valueCol).Value [I]'values to be summed[/I]
 ReDim arrOut(1 To n, 1 To 1) [I]'output array[/I]

 Set dict = CreateObject("scripting.dictionary")
 [I]'first loop over the array counts the keys[/I]
 For i = 1 To n
 v = arr(i, 1)
 If Not dict.exists(v) Then dict(v) = Array(0, 0) [I]'count, sum[/I]
 tmp = dict(v) [I]'can't modify an array stored in a dictionary - pull it out first[/I]
 tmp(0) = tmp(0) + 1 [I]'increment count[/I]
 tmp(1) = tmp(1) + arrValues(i, 1) [I]'increment sum[/I]
 dict(v) = tmp [I]'return the modified array[/I]
 Next i

 [I]'second loop populates the output array from the dictionary[/I]
 For i = 1 To n
 arrOut(i, 1) = dict(arr(i, 1))(1) [I] [/I] [I]'sumifs[/I]
 'arrOut(i, 1) = dict(arr(i, 1))(0) [I] [/I] [I] [/I] [I]'cou[/I]n[I]tifs[/I]
 'arrOut(i, 1) = dict(arr(i, 1))(1) / dict(arr(i, 1))(0) [I]'averageifs[/I]
 Next i

 [I]'populate the results[/I]
 rng.Columns(destCol).Value = arrOut
 MsgBox "Checked " & n & " rows in " & Timer - t & " seconds"[/B]
 [B]End Sub

I was just wondering if anyone here wanted to take a stab at the challenge to see if they could come up with any other solutions that may be faster/as fast/almost as fast.

I will offer up some code to set up a test file that you can play with if you want to take a stab at it.
VBA Code:
Sub GenerateRandomInfoColumnsA_Thru_D()
'
    Dim HeaderArray As Variant
'
    Application.ScreenUpdating = False                                                              ' Turn ScreenUpdating off
    Application.Calculation = xlManual                                                              ' Turn Calculation mode to manual
'
    HeaderArray = Array("Criteria 1", "Criteria 2", "Criteria 3", "Value Range To Sum", "Results")  ' Create Array of Headers
'
    Sheets("Sheet1").UsedRange.Clear                                                                ' Clear the sheet to be used
'
    Sheets("Sheet1").Range("A1").Resize(, UBound(HeaderArray) + 1) = HeaderArray                    ' Display Header to Sheet
'
'   ColumnARandomLetters
    With Sheets("Sheet1").Range("A2:A100001")                                                       ' With Range("A2:A100001") ...
        .Formula = "=CHAR(RANDBETWEEN(97,122))"                                                     '   Generate lowercase letters a - z
        .Value = .Value                                                                             '   Remove formulas, leave values
    End With
'
'   Column_B_RowNumberColumnNumber
    With Sheets("Sheet1").Range("B2:B100001")                                                       ' With Range("B2:B100001") ...
        .Formula = "=Row() & ""2"""                                                                 '   Save Row# & '2' to cell
        .Value = .Value                                                                             '   Remove formulas, leave values
    End With
'
'   ColumnCRandomNumbers1to100
    With Sheets("Sheet1").Range("C2:C100001")                                                       ' With Range("C2:C100001") ...
        .Formula = "=RANDBETWEEN(1,100)"                                                            '   Generate random whole number between 1 - 100
        .Value = .Value                                                                             '   Remove formulas, leave values
    End With
'
'   ColumnDRandomNumbers1to10
    With Sheets("Sheet1").Range("D2:D100001")                                                       ' With Range("D2:D100001") ...
        .Formula = "=RANDBETWEEN(1,10)"                                                             '   Generate random whole number between 1 - 10
        .Value = .Value                                                                             '   Remove formulas, leave values
    End With
'
'   Set Columns B Thru D to text
    Sheets("Sheet1").Columns("B:D").NumberFormat = "@"                                              '   Set Columns B:D to text
'
    Sheets("Sheet1").Columns.AutoFit
'
    Application.Calculation = xlAutomatic                                                           ' Turn Calculation back on
    Application.ScreenUpdating = True                                                               ' Turn ScreenUpdating back on
End Sub
 

Excel Facts

Waterfall charts in Excel?
Office 365 customers have access to Waterfall charts since late 2016. They were added to Excel 2019.
Apologies. :( I noticed an error in the setup of the data code. I failed to notice one of the stipulations.
Column B (Criteria 2) contains a string that is the Row() & Column() repeated after row 21 in the range B2:B100,001

After noticing that, I decided to redo the code that makes the proper correction & shortens up the code needed to set up the data needed for this challenge.
VBA Code:
Sub GenerateRandomInfoColumnsA_Thru_D_CorrectedAndShortened()                                       ' This takes only a couple of seconds to generate
'
    Dim HeaderArray As Variant
'
    Application.ScreenUpdating = False                                                              ' Turn ScreenUpdating off
    Application.Calculation = xlManual                                                              ' Turn Calculation mode to manual
'
    HeaderArray = Array("Criteria 1", "Criteria 2", "Criteria 3", "Value Range To Sum", "Results")  ' Create Array of Headers
'
    With Sheets("Sheet1")
        .UsedRange.Clear                                                                            '   Clear the sheet to be used
        .Range("A1").Resize(, UBound(HeaderArray) + 1) = HeaderArray                                '   Display Header to Sheet
        .Range("A2:A100001").Formula = "=CHAR(RANDBETWEEN(97,122))"                                 '   Generate random lowercase letters a - z in Column A
        .Range("B2:B100001").Formula = "=MOD(ROW()-2,20)+2 & ""2"""                                 '   Save Row#'s 2-21 & '2' to cell
        .Range("C2:C100001").Formula = "=RANDBETWEEN(1,100)"                                        '   Generate random whole number between 1 - 100
        .Range("D2:D100001").Formula = "=RANDBETWEEN(1,10)"                                         '   Generate random whole number between 1 - 10
        .Range("A2:D100001").Value = .Range("A2:D100001").Value                                     '   Remove formulas, leave values
'
        .Columns("B:D").NumberFormat = "@"                                                          '   Set Columns B:D to text
        .Columns.AutoFit                                                                            '   Autofit all used columns on the sheet
    End With
'
    Application.Calculation = xlAutomatic                                                           ' Turn Calculation back on
    Application.ScreenUpdating = True                                                               ' Turn ScreenUpdating back on
End Sub

This code is all 'no loop' code so it should run in just a couple of seconds to establish the data needed for the challenge.

Let me know if there are any questions. Game on!
 
Upvote 0
Are you asking where the SUMIFS formula should be placed?

If you are asking what the SUMIFS formula is, that was in post #1:
The formula in question is: =SUMIFS($D$2:$D$100001,$A$2:$A$100001,A2,$B$2:$B$100001,B2,$C$2:$C$100001,C2)
 
Upvote 0
OK, I got your idea.
I think the solution from Tim is the best, for the SUMIFS.
Although there were 2 loops from 1 to 100000, but all deal with array and dictionary, not on physical sheet.
I am testing with 1 milion rows, try to dig in his code to see if it could be better.
 
Upvote 0
I tried but nothing better found

good news: have learnt a lot from your sharing.
 
Upvote 0
Thanks for looking at it @bebo021999.

Just an FYI. Anyone that accepts this challenge should run the code from Tim Williams first in order to get a time frame to shoot for. It was reported that Tim's code ran in under a second, but in my testing environment it was closer to 4 seconds.

Timing is relative to the actual environment it is tested in.

@bebo021999 Did you check the timing of Tim's code before you started your attempt? How close was the timing of the code you came up with?
 
Upvote 0
OK @johnnyL , it doesn’t look like much interest in this, so I thought I’d put in my 2 cents’ worth. It was pointless replicating Tim’s method of double looping through a dictionary, so I thought I’d have a go at a non-dictionary method.

And yes, I know it isn’t as fast as Tim’s method – I did it purely for fun. I averaged Tim’s method – using the data generated by your code – at ~1.2 seconds. Using the same data, my non-dictionary method I averaged at ~2.2 seconds. Not great, but considerably better than the 12 ½ minutes you quoted in post #1.

At least I had a go…:)

Needs an edit!

VBA Code:
Option Explicit
Dim i As Long, ar2, ar3, ar4, ar5, ar6

Sub A_Test_Speed_Sumifs()
    Application.ScreenUpdating = False
    Dim t As Double: t = Timer
   
    Dim ar1(1 To 100000, 1 To 1)
    For i = 1 To 100000
        ar1(i, 1) = i
    Next i
   
    ar2 = Range("A2:C100001")
   
    ReDim ar3(1 To 100000, 1 To 1)
    For i = 1 To 100000
        ar3(i, 1) = ar2(i, 1) & " | " & ar2(i, 2) & " | " & ar2(i, 3)
    Next i
   
    ar4 = Range("D2:D100001")
   
    ReDim ar5(1 To 100000, 1 To 3)
    For i = 1 To 100000
        ar5(i, 1) = ar1(i, 1)
        ar5(i, 2) = ar3(i, 1)
        ar5(i, 3) = ar4(i, 1)
    Next i
   
    With Range("F2:H100001")
        .Value = ar5
        .Sort Key1:=Range("G2"), order1:=xlAscending
    End With
   
    ReDim ar6(1 To 100001, 1 To 4)
    ar6 = Range("F1:I100001")
   
    For i = 2 To 100001
        If ar6(i, 2) = ar6(i - 1, 2) Then ar6(i, 4) = ar6(i, 3) + ar6(i - 1, 4) Else ar6(i, 4) = ar6(i, 3)
    Next i
   
    For i = 100000 To 1 Step -1
        If ar6(i, 2) <> ar6(i + 1, 2) Then ar6(i, 4) = ar6(i, 4) Else ar6(i, 4) = ar6(i + 1, 4)
    Next i
   
    With Range("F2:I100001")
        .Value = ar6
        .Sort Key1:=Range("F2"), order1:=xlAscending
    End With
    Range("I2:I100001").Copy Range("E2")
   
    Range("F:I").EntireColumn.ClearContents
   
    Application.ScreenUpdating = True
    MsgBox Timer - t & " secs."
End Sub
 
Upvote 0

Forum statistics

Threads
1,215,273
Messages
6,123,985
Members
449,137
Latest member
abdahsankhan

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top