Performance of Formulas for Unique Count

Thanks Thanks:  0
Likes Likes:  0
Results 1 to 4 of 4

Thread: Performance of Formulas for Unique Count

  1. #1
    MrExcel MVP Aladin Akyurek's Avatar
    Join Date
    Feb 2002
    Location
    The Hague, NL
    Posts
    81,487
    Post Thanks / Like
    Mentioned
    10 Post(s)
    Tagged
    0 Thread(s)

    Default Performance of Formulas for Unique Count

     
    What follows is triggered by a recent thread on unique count:

    http://www.mrexcel.com/board2/viewtopic.php?t=295052

    ******** ******************** ************************************************************************>
    Microsoft Excel - Performance of Formulas for Unique Count.xls___Running: 11.0 : OS = Windows XP
    File Edit View Insert Options Tools Data Window Help About
    =

    A
    B
    C
    D
    E
    F
    1
    {=COUNTDIFF(UPPER(A2:A13225),TRUE,"")}
    2
    Calc Time(Millisec) %BookMicroSecs
    3
    Re CalcFull CalcVolatileOvHead/Formula
    4
    0.057.10.1%5.257,093.6
    5
    0.048.90.1%None48,939.7
    6
    0.048.80.1%None48,771.8
    7
    MEDIAN0.048.90.1%515.9%48,939.7
    8
    9
    {=SUM(IF(A2:A13225<>"",1/COUNTIF(A2:A13225,A2:A13225)))}
    10
    Calc Time(Millisec) %BookMicroSecs
    11
    Re CalcFull CalcVolatileOvHead/Formula
    12
    0.08,537.20.0%75.98,537,211.3
    13
    0.08,524.50.0%None8,524,544.3
    14
    0.08,417.50.0%15.68,417,450.5
    15
    MEDIAN0.08,524.50.0%45.78,524,544.3
    16
    17
    =SUMPRODUCT((A2:A13225<>"")/COUNTIF(A2:A13225,A2:A13225&""))
    18
    Calc Time(Millisec) %BookMicroSecs
    19
    Re CalcFull CalcVolatileOvHead/Formula
    20
    0.08,390.30.0%None8,390,289.5
    21
    0.08,394.90.0%1.38,394,944.9
    22
    0.08,500.70.0%14.88,500,686.0
    23
    MEDIAN0.08,394.90.0%8.18,394,944.9
    24
    25
    {=SUM(IF(COUNTIF(A2:A13225,A2:A13225)=0,"",
    1/COUNTIF(A2:A13225,A2:A13225)))}
    26
    SheetsCalc Time(Millisec) %BookMicroSecs
    27
    CountRe CalcFull CalcVolatileOvHead/Formula
    28
    30.016,890.60.0%85.616,890,555.1
    29
    30.016,938.90.0%None16,938,900.1
    30
    30.016,937.90.0%None16,937,873.7
    31
    MEDIAN0.016,937.90.0%85.616,937,873.7
    32
    33
    {=SUM(IF(FREQUENCY(IF(A2:A13225<>"",
    MATCH("~"&A2:A13225,A2:A13225&"",0)),
    ROW(A2:A13225)-ROW(A2)+1),1))}
    34
    Calc Time(Millisec) %BookMicroSecs
    35
    Re CalcFull CalcVolatileOvHead/Formula
    36
    0.0677.30.0%0.1677,286.6
    37
    0.0677.60.0%0.8677,571.3
    38
    0.0676.80.0%None676,776.0
    39
    MEDIAN0.0677.30.0%0.4677,286.6
    FastXL

    [HtmlMaker light Ver1.11] To see the formula in the cells just click on the cells hyperlink or click the Name box
    PLEASE DO NOT QUOTE THIS TABLE IMAGE ON SAME PAGE! OTHEWISE, ERROR OF JavaScript OCCUR.


    As the above timing results with FastExcel based on three runs show, the formula with COUNTDIFF is the winner, the one with FREQUENCY/MATCH (a recent addition to the set of formulas for unique/distinct count) the second best. The formula which is apparently posted on "JWalk" site, that is,

    {=SUM(IF(COUNTIF(A2:A13225,A2:A13225)=0,"",1/COUNTIF(A2:A13225,A2:A13225)))}

    is the worst performer.

    Note that COUNTDIFF is case-senstive. That is why it includes UPPER to simulate the others.

    Note also that the CountDiff formula will yield correct a correct result in face of special meaning chars like <, *, ? that could figure in strings. The formulas with CountIf won't. The formula with Frequency/Match while less sensitive to such chars, it's not immune against trailing "*".

  2. #2
    MrExcel MVP
    Join Date
    Mar 2004
    Location
    Canada
    Posts
    17,771
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default

    That's great! It's nice to have some hard data comparing the efficiency rate between the formulas. And, I had not realized that COUNTDIFF is case-sensitive. Thanks Aladin!

  3. #3
    Board Regular
    Join Date
    Jul 2012
    Posts
    161
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Performance of Formulas for Unique Count

    This is cool!
    Are there any performance differences between an ArrayFormula and just a Formula (or any different restrictions)?

  4. #4
    MrExcel MVP Aladin Akyurek's Avatar
    Join Date
    Feb 2002
    Location
    The Hague, NL
    Posts
    81,487
    Post Thanks / Like
    Mentioned
    10 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Performance of Formulas for Unique Count

      
    Quote Originally Posted by memar_one View Post
    This is cool!
    Are there any performance differences between an ArrayFormula and just a Formula (or any different restrictions)?
    Formulas processing range objects (ex. SumIf) are generally (but not always) faster than their counterparts processing array objects (ex. SumProduct or Sum\If)... Another performance affecting issue arises from the use of volatile functions like Indirect, Offset, etc.

    See also: www.decisionmodels.com/
    Assuming too much and qualifying too much are two faces of the same problem.

User Tag List

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

 

 
DMCA.com