Which histogram formula is correct?

macfuller · Jun 14, 2019

Poking around on the web I find various formulas for creating histograms. I'm getting different results for each formula and I want to know which one is accurate.

I have a table of purchase orders and I want to get the distribution of how many POs are in various dollar ranges (e.g. how many are for between $100 and $500). In the Orders table each purchase order can have multiple lines, which add up to the total value of the purchase order. So I need to sum [Extended Merch Amt] for all the lines with the same Orders[PO No.] value to get the full PO value. My base measures are:

Code:

PO Spend:=SUM ( Orders[Extended Merch Amt] )

Code:

PO Count:=DISTINCTCOUNT ( Orders[PO No.] )

The competing histogram measures are:

Code:

PO Count Distribution:=CALCULATE (    [PO Count],
    FILTER (
        Orders,
        AND (
            [PO Spend] >= MIN ( tblDollarRanges[Min] ),
            [PO Spend] < MAX ( tblDollarRanges[Max] )
        )
    )
)

Code:

PO Distribution Count:=CALCULATE (    [PO Count],
    FILTER (
        VALUES ( Orders[PO No.] ),
        COUNTROWS (
            FILTER (
                tblDollarRanges,
                [PO Spend] >= tblDollarRanges[Min]
                    && [PO Spend] < tblDollarRanges[Max]
            )
        )
    )
)

And my results are:

Label	PO Count Distribution	PO Distribution Count
Up to $10	89,426	7,557
$10 to $50	133,536	34,782
$50 to $100	99,604	32,989
$100 to $500	144,745	112,692
$500 to $1,000	50,103	50,861
$1,000 to $5,000	55,332	65,079
$5,000 to $10,000	8,851	12,483
$10,000 to $50,000	6,533	9,261
$50,000 to $100,000	943	1,100
$100,000 to $500,000	828	1,021
$500,000 to $1 million	107	125
$1 million to $5 million	116	121
$5 million to $10 million	22	22
$10 million to $100 million	9	9
$100 million +	1	1
Grand Total	328,103	328,103

The rows for PO Count Distribution add up to more than the Grand Total so I suspect the PO Distribution Count is the accurate one since I'm doing the VALUES(Orders[PO No.]) but I like the simplicity of PO Count Distribution and am wondering if there's a way to make it work better. Any insight as to why they work as they do, and if there's a simpler solution? Thanks.

Matt Allington · Jun 14, 2019

The first one is counting the line level items, not the aggregation of all lines in a PO. Have you tried taking the first one, and replace “orders” with “values(orders[PO no.])”

Which histogram formula is correct?

macfuller

Active Member

Excel Facts

Matt Allington

MrExcel MVP

Similar threads

Forum statistics

Share this page

Which histogram formula is correct?

macfuller

Active Member

Excel Facts

Matt Allington

MrExcel MVP

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock