Ranks for grouped data within a population


Posted by Gary Fogelbach on November 10, 2001 1:59 PM

I am having some difficulty determining the Rank for groups of data within a database. I have a 3 column database. First column is job, second is Avg Salary, third is respondent. There are 5 Avg Salaries reported for each job by 5 different respondents to a survey. There are 6 different jobs. This is a simplified version of a huge database. I need to determine the rank of each respondent for each job. For example, for Job #1, the average salary of Respondent A may Rank First, for Job #2 Respondent A may rank 5th, etc. I tried using a CSE formula with RANK combined with an IF function, but Excel wouldn't accept the formula. I would be very grateful for any suggestions. Thanks so much.

Posted by Aladin Akyurek on November 10, 2001 2:19 PM

Gary,

Care to post 10 rows of your data along with the expected results?

Aladin

=======

Posted by Gary F on November 10, 2001 2:45 PM

Here are 10 Rows, Aladin. Rank is the expected result. Thanks for your help.

Job# Avg Resp Rank
1101 43812 A 4
1101 49656 B 1
1101 40737 C 5
1101 46555 D 2
1101 45312 E 3
1102 20646 A 5
1102 22172 B 4
1102 24123 C 2
1102 23188 D 3
1102 27615 E 1

Posted by Mark W. on November 10, 2001 2:47 PM

You really didn't specify the basis for the ranking.
Would the lowest salary in each job qualify be
ranked 1st? I had to assume something so that
was my assumption. Using the following data
set (based on your description) in cells A1:C31...

{"Job","Avg Salary","Responent"
;1,71835,"A"
;1,82682,"B"
;1,76912,"C"
;1,58490,"D"
;1,93316,"E"
;2,67710,"A"
;2,94963,"B"
;2,83441,"C"
;2,80485,"D"
;2,57172,"E"
;3,52355,"A"
;3,96509,"B"
;3,67471,"C"
;3,93955,"D"
;3,89184,"E"
;4,80027,"A"
;4,56587,"B"
;4,52107,"C"
;4,55348,"D"
;4,84386,"E"
;5,68113,"A"
;5,96537,"B"
;5,97722,"C"
;5,86961,"D"
;5,55017,"E"
;6,89544,"A"
;6,70401,"B"
;6,69269,"C"
;6,52326,"D"
;6,61038,"E"}

I used the array formula...

{=SUM((IF(A2=Job,'Avg Salary')<B2)+0)+1}

...to get...

{"Job","Avg Salary","Responent","Rank"
;1,71835,"A",2;1,82682,"B",4
;1,76912,"C",3;1,58490,"D",1
;1,93316,"E",5;2,67710,"A",2
;2,94963,"B",5;2,83441,"C",4
;2,80485,"D",3;2,57172,"E",1
;3,52355,"A",1;3,96509,"B",5
;3,67471,"C",2;3,93955,"D",4
;3,89184,"E",3;4,80027,"A",4
;4,56587,"B",3;4,52107,"C",1
;4,55348,"D",2;4,84386,"E",5
;5,68113,"A",2;5,96537,"B",4
;5,97722,"C",5;5,86961,"D",3
;5,55017,"E",1;6,89544,"A",5
;6,70401,"B",4;6,69269,"C",3
;6,52326,"D",1;6,61038,"E",2}

Posted by Mark W. on November 10, 2001 2:49 PM

Repost to Fix some Typo's

: I am having some difficulty determining the Rank for groups of data within a database. I have a 3 column database. First column is job, second is Avg Salary, third is respondent. There are 5 Avg Salaries reported for each job by 5 different respondents to a survey. There are 6 different jobs. This is a simplified version of a huge database. I need to determine the rank of each respondent for each job. For example, for Job #1, the average salary of Respondent A may Rank First, for Job #2 Respondent A may rank 5th, etc. I tried using a CSE formula with RANK combined with an IF function, but Excel wouldn't accept the formula. I would be very grateful for any suggestions. Thanks so much.

Posted by Mark W. on November 10, 2001 2:53 PM

Okay, look like 1st goes to the highest... Use this...

{=SUM((IF(A2='Job#','Avg')+0 > B2)+0)+1}

...where your data resides in columns A:C. 1101 43812 A 4 1101 49656 B 1 1101 40737 C 5 1101 46555 D 2 1101 45312 E 3 1102 20646 A 5 1102 22172 B 4 1102 24123 C 2 1102 23188 D 3 1102 27615 E 1

: Gary,

Posted by Mark W. on November 10, 2001 2:55 PM

I should mention...

...that 'Job#' and 'Avg' are named ranges.
You can use the cell ranges themselves if
you wish. : 1101 43812 A 4 : 1101 49656 B 1 : 1101 40737 C 5 : 1101 46555 D 2 : 1101 45312 E 3 : 1102 20646 A 5 : 1102 22172 B 4 : 1102 24123 C 2 : 1102 23188 D 3 : 1102 27615 E 1 :

Posted by Mark W. on November 10, 2001 2:58 PM

One my typo fix...

:

Posted by Gary F. on November 10, 2001 3:14 PM

Re: I should mention...

Thanks for the quick response, Mark! I'll give it a shot. ...that 'Job#' and 'Avg' are named ranges. You can use the cell ranges themselves if you wish. : {=SUM((IF(A2='Job#','Avg')+0 > B2)+0)+1} : ...where your data resides in columns A:C.

Posted by Aladin Akyurek on November 10, 2001 3:37 PM

Gary --

Here is a different approach. I'll assume your sample data to be in A1:C11 including labels.

In D2 enter: =SUMPRODUCT(MATCH(B2, LARGE(INDIRECT(ADDRESS(MATCH(A2,$A:$A,0),2) & ":" & ADDRESS(MATCH(A2,$A:$A,0)+4,2)), {1,2,3,4,5}),0))

Copy this down as far as needed.

If any questions, let me know.

Aladin

========== Job# Avg Resp Rank 1101 43812 A 4 1101 49656 B 1 1101 40737 C 5 1101 46555 D 2 1101 45312 E 3 1102 20646 A 5 1102 22172 B 4 1102 24123 C 2 1102 23188 D 3 1102 27615 E 1

Posted by Gary F on November 10, 2001 3:39 PM

Re: I should mention...

Hey Mark: Would you mind explaining a bit about how that formula works, please? I can't seem to get it to return the rank for each job. Thanks. ...that 'Job#' and 'Avg' are named ranges. You can use the cell ranges themselves if you wish. : {=SUM((IF(A2='Job#','Avg')+0 > B2)+0)+1} : ...where your data resides in columns A:C.


Posted by Gary on November 10, 2001 3:51 PM

Using your data set in cells A1:C11...

1. Enter the array formula...
{=SUM((IF(A2=$A$2:$A$11,$B$2:$B$11)+0 > B2)+0)+1}
into cell D2 as an array formula.

Note: This is an array formula which must be
entered using the Control+Shift+Enter key
combination. The outermost braces, {}, are not
entered by you -- they're supplied by Excel in
recognition of a properly entered array formula.

2. Copy the formula in cell D2 down to cell D11

This should produce the results you want!!!

Hey Mark: Would you mind explaining a bit about how that formula works, please? I can't seem to get it to return the rank for each job. Thanks. : ...that 'Job#' and 'Avg' are named ranges. : You can use the cell ranges themselves if : you wish.


Posted by Mark W. on November 10, 2001 3:51 PM

Re: I should mention...

Using your data set in cells A1:C11...

1. Enter the array formula...
{=SUM((IF(A2=$A$2:$A$11,$B$2:$B$11)+0 > B2)+0)+1}
into cell D2 as an array formula.

Note: This is an array formula which must be
entered using the Control+Shift+Enter key
combination. The outermost braces, {}, are not
entered by you -- they're supplied by Excel in
recognition of a properly entered array formula.

2. Copy the formula in cell D2 down to cell D11

This should produce the results you want!!!

Hey Mark: Would you mind explaining a bit about how that formula works, please? I can't seem to get it to return the rank for each job. Thanks. : ...that 'Job#' and 'Avg' are named ranges. : You can use the cell ranges themselves if : you wish.


Posted by Mark W. on November 10, 2001 3:54 PM

Re: I should mention...

Using your data set in cells A1:C11...

1. Enter the array formula...
{=SUM((IF(A2=$A$2:$A$11,$B$2:$B$11)+0 > B2)+0)+1}
into cell D2 as an array formula.

Note: This is an array formula which must be
entered using the Control+Shift+Enter key
combination. The outermost braces, {}, are not
entered by you -- they're supplied by Excel in
recognition of a properly entered array formula.

2. Copy the formula in cell D2 down to cell D11

This should produce the results you want!!!

Hey Mark: Would you mind explaining a bit about how that formula works, please? I can't seem to get it to return the rank for each job. Thanks. : ...that 'Job#' and 'Avg' are named ranges. : You can use the cell ranges themselves if : you wish.


Posted by Mark W. on November 10, 2001 4:02 PM

Aladin, this is my posting -- not Gary's! Sorry for the confusion. (nt)

1. Enter the array formula... {=SUM((IF(A2=$A$2:$A$11,$B$2:$B$11)+0 > B2)+0)+1} into cell D2 as an array formula. Note: This is an array formula which must be entered using the Control+Shift+Enter key combination. The outermost braces, {}, are not entered by you -- they're supplied by Excel in recognition of a properly entered array formula. 2. Copy the formula in cell D2 down to cell D11 This should produce the results you want!!!


Posted by Mark W. on November 10, 2001 4:02 PM

Aladin, this is my posting -- not Gary's! Sorry for the confusion. (nt)

1. Enter the array formula... {=SUM((IF(A2=$A$2:$A$11,$B$2:$B$11)+0 > B2)+0)+1} into cell D2 as an array formula. Note: This is an array formula which must be entered using the Control+Shift+Enter key combination. The outermost braces, {}, are not entered by you -- they're supplied by Excel in recognition of a properly entered array formula. 2. Copy the formula in cell D2 down to cell D11 This should produce the results you want!!!


Posted by Mark W. on November 10, 2001 4:23 PM

This approach fails if the list isn't sorted by
by Job# or if there are not 5 respondents for
each Job#. Gary -- Here is a different approach. I'll assume your sample data to be in A1:C11 including labels. In D2 enter: =SUMPRODUCT(MATCH(B2, LARGE(INDIRECT(ADDRESS(MATCH(A2,$A:$A,0),2) & ":" & ADDRESS(MATCH(A2,$A:$A,0)+4,2)), {1,2,3,4,5}),0)) Copy this down as far as needed. If any questions, let me know. === : Job# Avg Resp Rank : 1101 43812 A 4 : 1101 49656 B 1 : 1101 40737 C 5 : 1101 46555 D 2 : 1101 45312 E 3 : 1102 20646 A 5 : 1102 22172 B 4 : 1102 24123 C 2 : 1102 23188 D 3 : 1102 27615 E 1 :

Posted by Aladin Akyurek on November 10, 2001 4:27 PM

That's right. Omitted the caveat, because I expect the data to be sorted (as survey data usually are) by Job#. If not, it should be sorted by Job#.

========= This approach fails if the list isn't sorted by by Job# or if there are not 5 respondents for each Job#. : Gary -- : Here is a different approach. I'll assume your sample data to be in A1:C11 including labels. : In D2 enter: =SUMPRODUCT(MATCH(B2, LARGE(INDIRECT(ADDRESS(MATCH(A2,$A:$A,0),2) & ":" & ADDRESS(MATCH(A2,$A:$A,0)+4,2)), {1,2,3,4,5}),0)) : Copy this down as far as needed. : If any questions, let me know. ===



Posted by Aladin Akyurek on November 11, 2001 7:50 AM

Gary & Mark: Another proposal with RANK

I'll assume Gary's sample data in A1:C11 including labels (Job#, etc), that is,

{"Job#","Avg","Resp";1101, 43812,"A";1101, 49656,"B";1101, 40737,"C";1101, 46555,"D";1101, 45312,"E";1102, 20646,"A";1102, 22172,"B";1102, 24123,"C";1102,23188, "D";1102,27615, "E"}

Make a list of unique Job# in column D from D2 on. This can be done e.g., by means of Advanced Filter.

Make a list of Respondents in the top row, that is, from G1 to K1.

In E2 enter: =ADDRESS(MATCH(D2,$A:$A,0),2) & ":" & ADDRESS(MATCH(D2,$A:$A,0)+4,2)

Copy down this as far as needed.

In F2 enter: =ADDRESS(MATCH(D2,$A:$A,0),3) & ":" & ADDRESS(MATCH(D2,$A:$A,0)+4,3)

Copy down this as far as needed.

These 2 formulas require that the data have been sorted on Job#.

Finally:

In G2 enter: =RANK(INDEX(INDIRECT($E2), MATCH(G$1, INDIRECT($F2),0)), INDIRECT($E2))

Copy this across then down.

The result section (D1:K3) will show the following:

{"","","","A","B","C","D","E";
1101,"$B$2:$B$6","$C$2:$C$6",4,1,5,2,3;
1102,"$B$7:$B$11","$C$7:$C$11",5,4,2,3,1}

In fact a summary table that pulls everything together.

The array formula (by Mark) is better than the SUMPRODUCT formula (by me) on two counts: shorter and I believe more efficient. (Although the array formula doesn't require sorting, results based on unsorted data would be a bit harder to process.)

The above system, however, should be a bit more efficient in terms of very large sets of data and the resulting "summary" table is more accessible to Excel and to us.

Aladin

=========