Which histogram formula is correct?

macfuller

Active Member
Joined
Apr 30, 2014
Messages
251
Poking around on the web I find various formulas for creating histograms. I'm getting different results for each formula and I want to know which one is accurate.

I have a table of purchase orders and I want to get the distribution of how many POs are in various dollar ranges (e.g. how many are for between $100 and $500). In the Orders table each purchase order can have multiple lines, which add up to the total value of the purchase order. So I need to sum [Extended Merch Amt] for all the lines with the same Orders[PO No.] value to get the full PO value. My base measures are:
Code:
PO Spend:=SUM ( Orders[Extended Merch Amt] )
Code:
PO Count:=DISTINCTCOUNT ( Orders[PO No.] )
The competing histogram measures are:
Code:
PO Count Distribution:=CALCULATE (
    [PO Count],
    FILTER (
        Orders,
        AND (
            [PO Spend] >= MIN ( tblDollarRanges[Min] ),
            [PO Spend] < MAX ( tblDollarRanges[Max] )
        )
    )
)
Code:
PO Distribution Count:=CALCULATE (
    [PO Count],
    FILTER (
        VALUES ( Orders[PO No.] ),
        COUNTROWS (
            FILTER (
                tblDollarRanges,
                [PO Spend] >= tblDollarRanges[Min]
                    && [PO Spend] < tblDollarRanges[Max]
            )
        )
    )
)
And my results are:
Label | PO Count Distribution | PO Distribution Count
Up to $10 | 89,426 | 7,557
$10 to $50 | 133,536 | 34,782
$50 to $100 | 99,604 | 32,989
$100 to $500 | 144,745 | 112,692
$500 to $1,000 | 50,103 | 50,861
$1,000 to $5,000 | 55,332 | 65,079
$5,000 to $10,000 | 8,851 | 12,483
$10,000 to $50,000 | 6,533 | 9,261
$50,000 to $100,000 | 943 | 1,100
$100,000 to $500,000 | 828 | 1,021
$500,000 to $1 million | 107 | 125
$1 million to $5 million | 116 | 121
$5 million to $10 million | 22 | 22
$10 million to $100 million | 9 | 9
$100 million + | 1 | 1
Grand Total | 328,103 | 328,103


The rows for PO Count Distribution add up to more than the Grand Total, so I suspect PO Distribution Count is the accurate one, since it iterates over VALUES(Orders[PO No.]). Still, I like the simplicity of PO Count Distribution and am wondering if there's a way to make it work correctly. Any insight into why the two behave as they do, and is there a simpler solution? Thanks.
 

Matt Allington

MrExcel MVP
Joined
Dec 18, 2014
Messages
1,189
The first one is counting the line-level items, not the aggregation of all lines in a PO. Have you tried taking the first one and replacing Orders with VALUES ( Orders[PO No.] )?
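
A minimal sketch of that suggestion, assuming the same Orders and tblDollarRanges tables and the [PO Count] and [PO Spend] measures from the first post. The measure name PO Count Distribution By PO is made up here for illustration, and the result has not been checked against the figures above. The likely reason the original overcounts is that FILTER over Orders walks the table one order line at a time, so [PO Spend] is compared to the range per line, and a PO whose lines fall into different ranges gets counted in each of them. Iterating the distinct PO numbers instead lets [PO Spend] return the total for the whole PO.

```dax
PO Count Distribution By PO:=CALCULATE (
    [PO Count],
    FILTER (
        // Iterate distinct PO numbers rather than individual order lines,
        // so [PO Spend] is evaluated once per PO (sum of all its lines).
        VALUES ( Orders[PO No.] ),
        AND (
            // MIN / MAX read the dollar range selected on the current pivot row.
            [PO Spend] >= MIN ( tblDollarRanges[Min] ),
            [PO Spend] < MAX ( tblDollarRanges[Max] )
        )
    )
)
```

On the Grand Total row no single range is selected, so MIN and MAX span from the lowest Min to the highest Max and every PO inside that overall span should be counted once.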
 
