How to calculate the median for a rule based subset of a list (and for different groups of entries in that list)?

German0815

New Member
Joined
May 29, 2020
Messages
3
Office Version
  1. 365
Platform
  1. Windows
Hi to all of you!

I am trying to calculate the median for a rule based subset of a list and do that for different groups of entries in that list. Let me explain in detail:

I have a list of hundreds or thousands of entries with these 4 key columns:

ContractSubmarketSize in sqmRent in € per sqm
1District A50022
2District C1,25037
3District C75030
4District B1,00018
5District A4,00012
............

Now what I want to do is calculate the median rent of the most expensive 3% of contracts for every submarket.

The 3% are calculated by size. Let's say I have 100 contracts in submarket A with a total of 145,000 sqm. Then, I calculate 0.03*145,000 = 4,350 sqm. I then sort the contracts from most expensive to least expensive and look at how many contracts make up 4,350 sqm or more. Let's say the 5 most expensive contracts cross the threshold of 4350 sqm. I then calculate the median rent for those 5 contracts.

But let's say in submarket B the one single contract with the highest rent already makes up more than the 3% of all contracts (or the two most expensive contracts). Then, I want to calculate the median of the three most expensive contracts (which is automatically the second most expensive contract, I know :D ).


Now, imagine having to calculate this for 100 submarkets. This is why I was trying to automate that. My problem is, I don't know how to do this sorting and counting thing and then calculating a median for this specific number of contracts or if this is even possible in Excel at all. Does someone here know how to do this?

Kind regards,
German0815
 

Excel Facts

Square and cube roots
The =SQRT(25) is a square root. For a cube root, use =125^(1/3). For a fourth root, use =625^(1/4).
Hi,

I see you have Office 365 so chances are you have Get&Transform on your Ribbon. You might want to give it a try.
  1. Select your data range and press CTRL+ T to convert the range to a table. Call the table "tContracts"
  2. On the G&T tab choose from Data/Range, confirm you have headers
  3. This opens PowerQuery
  4. Group By on Contract, and use All Rows as aggregate. Name this field "All".
  5. Add a custom column, name it "median", in which you type this formula
    = List.Median ( List.FirstN ( Table.Column ( Table.Sort([All] , {"Size in sqm" , Order.Descending } ) , "Rent in € per sqm" ) , 3 ) )
  6. Save and load to Excel as table
Query looks like this (note I give explicit names to the applied steps):
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="tContracts"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Size in sqm", type number}, {"Rent in € per sqm", Int64.Type}}),
    BufferContracts = Table.Buffer(#"Changed Type"),
    GroupOnContract = Table.Group(BufferContracts, {"Submarket"}, {{"All", each _, type table [Contract=number, Submarket=text, Size in sqm=number, #"Rent in € per sqm"=number]}}),
    MedianTop3 = Table.AddColumn(GroupOnContract, "Median", each
                         List.Median(
                            List.FirstN(
                               Table.Column(
                                  Table.Sort([All],{"Size in sqm", Order.Descending}),
                                "Rent in € per sqm"),
                            3)
                         )
                      )   
in
    MedianTop3

This code can be copied in the advanced editor in the PQ UI.
 
Upvote 0
It can also be done with a formula:

Book1
ABCDEFGHI
1ContractSubmarketSize in sqmRent in € per sqmSubmarketMedian of top x%x
21District A50022District A123%
32District C1,25037District B18
43District C75030
54District B1,00018
65District A4,00012
76District B1,00032
Sheet12
Cell Formulas
RangeFormula
H2:H3H2=IF(COUNTIFS(B:B,G2,C:C,">="&SUMIFS(C:C,B:B,G2)*I$2)>=3,MEDIAN(IF((B$2:B$100=G2)*(C$2:C$100>=SUMIFS(C:C,B:B,G2)*I$2),D$2:D$100)),LARGE(IF(B$2:B$100=G2,D$2:D$100),2))
Press CTRL+SHIFT+ENTER to enter array formulas surrounded with curly braces.


This is a very limited sample, but it should work on larger ranges.

Depending on what your update channel is for Excel 365, there are some new functions that would make that a bit simpler. I don't have them yet or I'd show that too.
 
Last edited:
Upvote 0
Hi,

I see you have Office 365 so chances are you have Get&Transform on your Ribbon. You might want to give it a try.
  1. Select your data range and press CTRL+ T to convert the range to a table. Call the table "tContracts"
  2. On the G&T tab choose from Data/Range, confirm you have headers
  3. This opens PowerQuery
  4. Group By on Contract, and use All Rows as aggregate. Name this field "All".
  5. Add a custom column, name it "median", in which you type this formula
    = List.Median ( List.FirstN ( Table.Column ( Table.Sort([All] , {"Size in sqm" , Order.Descending } ) , "Rent in € per sqm" ) , 3 ) )
  6. Save and load to Excel as table
Query looks like this (note I give explicit names to the applied steps):
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="tContracts"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Size in sqm", type number}, {"Rent in € per sqm", Int64.Type}}),
    BufferContracts = Table.Buffer(#"Changed Type"),
    GroupOnContract = Table.Group(BufferContracts, {"Submarket"}, {{"All", each _, type table [Contract=number, Submarket=text, Size in sqm=number, #"Rent in € per sqm"=number]}}),
    MedianTop3 = Table.AddColumn(GroupOnContract, "Median", each
                         List.Median(
                            List.FirstN(
                               Table.Column(
                                  Table.Sort([All],{"Size in sqm", Order.Descending}),
                                "Rent in € per sqm"),
                            3)
                         )
                      )  
in
    MedianTop3

This code can be copied in the advanced editor in the PQ UI.
Hi GraH,

thank you for your very quick reply and help. :)

I was able to execute this, but sth is wrong with the results. I think you sorted by Size, although the sorting has to be by Rent per sqm, since I want to calculate the median of the contracts with the highest Rent per sqm.

I was also wondering whether this is always calculating the median of the first three or if it calculates the median of the top x, if the 3% rule applies. (?) Based on how I understand the code, it does not. This would still be very useful though, since often the 3% rule is irrelevant and then your solution would work.

I didn't think of Power Query at all, although I have worked with it before. Very interesting approach, since the queries can be used for dynamic data from sql databases as well by "hitting refresh".

I will also try the approach of Eric W.
 
Upvote 0
It can also be done with a formula:

Book1
ABCDEFGHI
1ContractSubmarketSize in sqmRent in € per sqmSubmarketMedian of top x%x
21District A50022District A123%
32District C1,25037District B18
43District C75030
54District B1,00018
65District A4,00012
76District B1,00032
Sheet12
Cell Formulas
RangeFormula
H2:H3H2=IF(COUNTIFS(B:B,G2,C:C,">="&SUMIFS(C:C,B:B,G2)*I$2)>=3,MEDIAN(IF((B$2:B$100=G2)*(C$2:C$100>=SUMIFS(C:C,B:B,G2)*I$2),D$2:D$100)),LARGE(IF(B$2:B$100=G2,D$2:D$100),2))
Press CTRL+SHIFT+ENTER to enter array formulas surrounded with curly braces.


This is a very limited sample, but it should work on larger ranges.

Depending on what your update channel is for Excel 365, there are some new functions that would make that a bit simpler. I don't have them yet or I'd show that too.
Hi Eric,

if I use this formula, the results are not correct, unfortunately. I am trying to understand why, but I couldn't understand the formula completely. It does give values that are either part of the existing rents or an average of two of them, so it does calculate a median. But it is not the one I need (I have calculated it as an array formula, so that shouldn't be the problem).
 
Upvote 0

Forum statistics

Threads
1,214,919
Messages
6,122,259
Members
449,075
Latest member
staticfluids

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top