Results 1 to 2 of 2

Thread: Which histogram formula is correct?
Thanks Thanks: 0 Likes Likes: 0

  1. #1
    Board Regular
    Join Date
    Apr 2014
    Location
    Mitten State
    Posts
    223
    Post Thanks / Like
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)

    Default Which histogram formula is correct?

    Poking around on the web I find various formulas for creating histograms. I'm getting different results for each formula and I want to know which one is accurate.

    I have a table of purchase orders and I want to get the distribution of how many POs are in various dollar ranges (e.g. how many are for between $100 and $500). In the Orders table each purchase order can have multiple lines, which add up to the total value of the purchase order. So I need to sum [Extended Merch Amt] for all the lines with the same Orders[PO No.] value to get the full PO value. My base measures are:
    Code:
    PO Spend:=SUM ( Orders[Extended Merch Amt] )
    Code:
    PO Count:=DISTINCTCOUNT ( Orders[PO No.] )
    The competing histogram measures are:
    Code:
    PO Count Distribution:=CALCULATE (    [PO Count],
        FILTER (
            Orders,
            AND (
                [PO Spend] >= MIN ( tblDollarRanges[Min] ),
                [PO Spend] < MAX ( tblDollarRanges[Max] )
            )
        )
    )
    Code:
    PO Distribution Count:=CALCULATE (    [PO Count],
        FILTER (
            VALUES ( Orders[PO No.] ),
            COUNTROWS (
                FILTER (
                    tblDollarRanges,
                    [PO Spend] >= tblDollarRanges[Min]
                        && [PO Spend] < tblDollarRanges[Max]
                )
            )
        )
    )
    And my results are:
    Label PO Count Distribution PO Distribution Count
    Up to $10 89,426 7,557
    $10 to $50 133,536 34,782
    $50 to $100 99,604 32,989
    $100 to $500 144,745 112,692
    $500 to $1,000 50,103 50,861
    $1,000 to $5,000 55,332 65,079
    $5,000 to $10,000 8,851 12,483
    $10,000 to $50,000 6,533 9,261
    $50,000 to $100,000 943 1,100
    $100,000 to $500,000 828 1,021
    $500,000 to $1 million 107 125
    $1 million to $5 million 116 121
    $5 million to $10 million 22 22
    $10 million to $100 million 9 9
    $100 million + 1 1
    Grand Total 328,103 328,103

    The rows for PO Count Distribution add up to more than the Grand Total so I suspect the PO Distribution Count is the accurate one since I'm doing the VALUES(Orders[PO No.]) but I like the simplicity of PO Count Distribution and am wondering if there's a way to make it work better. Any insight as to why they work as they do, and if there's a simpler solution? Thanks.
    Last edited by macfuller; Jun 14th, 2019 at 04:53 PM.

  2. #2
    MrExcel MVP
    Join Date
    Dec 2014
    Location
    Sydney, Australia
    Posts
    1,186
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Which histogram formula is correct?

    The first one is counting the line level items, not the aggregation of all lines in a PO. Have you tried taking the first one, and replace “orders” with “values(orders[PO no.])”
    Matt Allington is a professional Power BI Consultant, Power BI Trainer and Microsoft MVP based in Sydney Australia. Matt is author of the Power BI book Supercharge Power BI and the Power Pivot Book Supercharge Excel.
    What is Power BI?

Some videos you may like

User Tag List

Tags for this Thread

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •