Calculating 95th percentile on grouped data

dimsum1

New Member
Joined
Aug 30, 2014
Messages
6
Does anyone have a function that will calculate the 95th percentile (or any percentile for that matter) on grouped data? I see that there are numerous methods to calculate the percentiles on the entire table but I haven't found any that work correctly on grouped data. - Thanks
table1
LocationTypesampnameanaresult
Loc1WetSample1ana112
Loc1WetSample1ana23
Loc1WetSample1ana315
Loc1Wetsample2ana144
Loc1Wetsample2ana223
Loc1Wetsample2ana325
Loc1Wetsample2ana412
Loc1Drysample3ana144
Loc1Drysample3ana255
Loc1Drysample3ana376
Loc1Drysample4ana18
Loc1Drysample4ana257
Loc1Drysample4ana346
Loc1Drysample4ana499
Loc2Wetsample5ana1124
Loc2Wetsample5ana266
Loc2Wetsample5ana36
Loc2Wetsample6ana1789
Loc2Wetsample6ana288
Loc2Wetsample6ana3767
Loc2Wetsample6ana4565

<tbody>
</tbody>
grouping on location, type and ana the return would look as follows:
LocationTypeana95th
Loc1Wetana142.4
Loc1Wetana222
Loc1Wetana324.5
Loc1Wetana412
Loc1Dryana142.2
Loc1Dryana256.9
Loc1Dryana374.5
Loc1Dryana499
Loc2Wetana1755.75
Loc2Wetana286.9
Loc2Wetana3728.95
Loc2Wetana4565

<tbody>
</tbody>
 
Last edited:

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
Confused, you want the 95% of each LOCATION? Wouldnt that be
LOC1 = 13 records
LOC2 = 6 records
?
It doesnt seem to be RESULT value.
 
Upvote 0

Excel 2010
ABCDEF
1LocationTypeanasampnameresult95th
2Loc1Wetana1Sample11242.4
3Loc1Wetana2Sample1322
4Loc1Wetana3Sample11524.5
5Loc1Wetana1sample24442.4
6Loc1Wetana2sample22322
7Loc1Wetana3sample22524.5
8Loc1Wetana4sample21212
9Loc1Dryana1sample34442.2
10Loc1Dryana2sample35556.9
11Loc1Dryana3sample37674.5
12Loc1Dryana1sample4842.2
13Loc1Dryana2sample45756.9
14Loc1Dryana3sample44674.5
15Loc1Dryana4sample49999
16Loc2Wetana1sample5124755.75
17Loc2Wetana2sample56686.9
18Loc2Wetana3sample56728.95
19Loc2Wetana1sample6789755.75
20Loc2Wetana2sample68886.9
21Loc2Wetana3sample6767728.95
22Loc2Wetana4sample6565565
Sheet1 (3)
Cell Formulas
RangeFormula
F2{=PERCENTILE((IF($A$2:$A$22=A2,IF($B$2:$B$22=B2,IF($C$2:$C$22=C2,$E$2:$E$22)))),0.95)}
Press CTRL+SHIFT+ENTER to enter array formulas.



Then apply the advanced filter, unique records only, on just the first three columns (I moved one over because it seems they have to be together):


Excel 2010
ABCDEF
1LocationTypeanasampnameresult95th
2Loc1Wetana1Sample11242.4
3Loc1Wetana2Sample1322
4Loc1Wetana3Sample11524.5
8Loc1Wetana4sample21212
9Loc1Dryana1sample34442.2
10Loc1Dryana2sample35556.9
11Loc1Dryana3sample37674.5
15Loc1Dryana4sample49999
16Loc2Wetana1sample5124755.75
17Loc2Wetana2sample56686.9
18Loc2Wetana3sample56728.95
22Loc2Wetana4sample6565565
Sheet1 (3)
Cell Formulas
RangeFormula
F2{=PERCENTILE((IF($A$2:$A$22=A2,IF($B$2:$B$22=B2,IF($C$2:$C$22=C2,$E$2:$E$22)))),0.95)}
Press CTRL+SHIFT+ENTER to enter array formulas.



and hide the result/sampname column....
 
Last edited:
Upvote 0
Oh wait, you want Access, hang on...............

Hi, hey this is a fantastic alternative than trying to get access to hammer this out. Many thanks!!!!! I think it may be easier to pull the data from access, work it out this way, then just suck the stats back into access as a new table.
Thanks again. Of course if you have a access module that does it that would be great as well but this is still a HUGE help. Thanks again.
 
Upvote 0
Of course if you have a access module that does it that would be great as well

I was a little hasty in implying so, it appeared much easier before some research:

From Percentile - Wikipedia, the free encyclopedia (scroll down to Excel's Algorithm, Access as you said has no percentile function):

the formula is in 2 parts:

1. (number of terms - 1)*percentile = integer part + decimal part

2. (integer part + 1)th term + decimal part*((integer part + 2)th term - (integer part + 1)th term)

So for your first set, {12,44}:

1. (2-1)*.95 = 0 + .95

2. 12 + .95(44 - 12) = 12 + 30.4 = 42.4

Since Index, like percentile, is another Excel function that Access doesn't have, you'll probably need VBA to determine the nth position within the group arrays (I don't know if SQL alone can do this, at least in Access JET).

Short answer: I'm glad you didn't say this has to be done in Access
 
Upvote 0
I was a little hasty in implying so, it appeared much easier before some research:

From Percentile - Wikipedia, the free encyclopedia (scroll down to Excel's Algorithm, Access as you said has no percentile function):

the formula is in 2 parts:

1. (number of terms - 1)*percentile = integer part + decimal part

2. (integer part + 1)th term + decimal part*((integer part + 2)th term - (integer part + 1)th term)

So for your first set, {12,44}:

1. (2-1)*.95 = 0 + .95

2. 12 + .95(44 - 12) = 12 + 30.4 = 42.4

Since Index, like percentile, is another Excel function that Access doesn't have, you'll probably need VBA to determine the nth position within the group arrays (I don't know if SQL alone can do this, at least in Access JET).

Short answer: I'm glad you didn't say this has to be done in Access

Yes I went through this math last night and quickly realized why 99% of the access functions posted for this produce erroneous data. I think using access to do this is like using a rock to pound in a nail when you have a nice hammer sitting right next to it.
I found one add in that worked, but it was a trial version and I could envision it churning away on a large data set for hours when I need the data in 10 minutes.
Again, your solution enabled us to leave on time and get some sleep tonight rather than hand grouping thousands of records and performing stats on the individual groups. I cant thank you enough! It validated at 100% too! Pats sheetspread on back and says thanks!
 
Upvote 0
Confused, you want the 95% of each LOCATION? Wouldnt that be
LOC1 = 13 records
LOC2 = 6 records
?
It doesnt seem to be RESULT value.

No that is where the grouping becomes necessary to doing the statistics properly in one shot. To calc the stats correctly these three groups would have four different stat calcs One for each ana.
Loc 1 wet = 4 ana
Loc 1 Dry = 4 ana
Loc 2 Wet = 4 ana

Thanks
 
Upvote 0

Forum statistics

Threads
1,214,975
Messages
6,122,538
Members
449,088
Latest member
RandomExceller01

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top