Finding nth occurrence using SMALL & SUMPRODUCT

rossa

Board Regular
Joined
Nov 7, 2013
Messages
109
As the title says I need to do a nth value/occurrence look up. I was able to find a maximum value by using MAX in conjunction with SUMPRODUCT and thought I could apply the same logic by using SMALL along with SUMPRODUCT.
My test sheet is set up as follows:
A3:A17 contains letters which are A, B or C, the range I need to look up is in H3:I17 (first five cells in H contain A, next five B and final five C; I contains random numbers I wish to return).
Cell A3 contains A and as it is the first occurrence, in cell B3 I wish to return the value in cell I3 (A7 contains the second occurrence of A and I would want the value from cell I4 in B4).


My formula is as follows “SUMPRODUCT(SMALL(--($H$1:$H$27=A3)*ROW($I$1:$I$27),2))”
My logic is that the part “--($H$1:$H$27=A3)*ROW($I$1:$I$27)“ will return the row numbers that “A” appears in and I should then be able to use the SMALL function to return what occurrence I want by replacing the 2 with a COUNTIF. However the SMALL function does not seem to be working for me and is just returning 0.
Any help is appreciated.
 

Excel Facts

Select all contiguous cells
Pressing Ctrl+* (asterisk) will select the "current region" - all contiguous cells in all directions.
Control+shift+enter, not just enter, and copy down:

=SMALL(IF(ConditionRange1=condition1,IF(ConditionRange2=condition2,NumRange)),ROWS($1:1))
 
Upvote 0
The problem with your formula is that every row that does not match "A" generates a 0, so you'll have 10 0s, plus 5 row numbers where the A appears. So you'll need to add 10 to your COUNTIF to get the row you really want. Or you could do something like this:

ABHIJ
3A1A1
4B6A2
5C11A3
6C12A4
7A2A5
8B6
9B7
10B8
11B9
12B10
13C11
14C12
15C13
16C14
17C15

<tbody>
</tbody>
Sheet7

Worksheet Formulas
CellFormula
B3=INDEX($I$3:$I$17,AGGREGATE(15,6,ROW($I$3:$I$17)/($H$3:$H$17=A3)-ROW($I$3)+1,COUNTIF($A$3:$A3,A3)))

<tbody>
</tbody>

<tbody>
</tbody>
 
Upvote 0
Try the following array formula (to be entered using Ctrl+Shift+Enter, not just Enter):

=SMALL(IF($H$1:$H$27=A3,ROW($H$1:$H$27)),2)
 
Upvote 0
Hi,

This is a good example of why conditional statements are best construed via recursive IF statements than via multiplication, the reason being that the former results in an array containing Boolean FALSEs, which are ignored by many functions (including SMALL), whereas the latter results in an array containing zeroes (which, contrary to being ignored, are processed as valid numerical entries).

To clarify:

($H$1:$H$27=A3)*ROW($I$1:$I$27)

might resolve to (assuming some random entries in H1:H27):

({FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE})*ROW($I$1:$I$27)

but then, in the next step of the evaluation, the Booleans in the above are coerced (via the operation of multiplication) into their numerical equivalents (TRUE=1, FALSE=0), so that the above is equivalent to:

{0;0;1;1;1;1;1;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}*ROW($I$1:$I$27)

i.e.:

{0;0;3;4;5;6;7;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}

and of course those zeroes will be validly considered by SMALL.

This is not the case with the alternative:

IF($H$1:$H$27=A3,ROW($I$1:$I$27))

(which will now necessitate CSE), since this resolves to:

IF({FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE},ROW($I$1:$I$27))

i.e.:

{FALSE;FALSE;3;4;5;6;7;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE}

and this time the zeroes have been replaced with Boolean FALSEs, which SMALL and others will happily ignore.

Regards
 
Upvote 0
XOR LX presented an excellent explanation of why the original formula did not work. To expound a bit on my comment from post 3, based on your description, you know exactly how many As, Bs, and Cs there are in your H3:H17 range, 5 of each. So you know exactly how many zeros will be generated (10), and can just add the 10 to the k value in the SMALL function. If you don't know how many zeros there will be, if you don't know exactly what values are in that column, you can find out by using

COUNTIF($H$3:$H$17,"<>"&A3)

and add that.

I assumed you wanted to avoid CSE since you were using SUMPRODUCT, which is why I suggested the AGGREGATE function. That also has array processing, and can ignore errors (much as the array formulas ignore Boolean values). But the formula is a bit opaque to being easily understood. But the formulas suggested by the other posters are a bit shorter, and easier to understand, even if they do use CSE.
 
Upvote 0
Thank you all, I was hoping to avoid using CSE.
XOR LX, I'm sorry but I'm not understanding the bit in bold below. If what you're saying is correct, then shouldn't my SMALL function work?


Hi,

This is a good example of why conditional statements are best construed via recursive IF statements than via multiplication, the reason being that the former results in an array containing Boolean FALSEs, which are ignored by many functions (including SMALL), whereas the latter results in an array containing zeroes (which, contrary to being ignored, are processed as valid numerical entries).

To clarify:

($H$1:$H$27=A3)*ROW($I$1:$I$27)

might resolve to (assuming some random entries in H1:H27):

({FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE})*ROW($I$1:$I$27)

but then, in the next step of the evaluation, the Booleans in the above are coerced (via the operation of multiplication) into their numerical equivalents (TRUE=1, FALSE=0), so that the above is equivalent to:

{0;0;1;1;1;1;1;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}*ROW($I$1:$I$27)

i.e.:

{0;0;3;4;5;6;7;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}

and of course those zeroes will be validly considered by SMALL.

This is not the case with the alternative:

IF($H$1:$H$27=A3,ROW($I$1:$I$27))

(which will now necessitate CSE), since this resolves to:

IF({FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE},ROW($I$1:$I$27))

i.e.:

{FALSE;FALSE;3;4;5;6;7;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE}

and this time the zeroes have been replaced with Boolean FALSEs, which SMALL and others will happily ignore.

Regards
 
Upvote 0
Your array ends up looking like the fourth blue line from that post:

{0;0;3;4;5;6;7;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}

You may wish to use the Evaluate Formula tool on the Formulas tab to see that is so. A common misconception about SMALL is that it ignores duplicates, which is not so. Applying SMALL(array,2) to that array does not return 3. It returns 0. SMALL(array,3) returns 0. SMALL(array,4) returns 0. All the way to SMALL(array,22) returns 0. Then SMALL(array,23) finally returns 3. This is because there are 22 zeros in that array.

There have been 3 ways mentioned so far in this thread to get around that. One way is to find out how many zeros are in the array (22), then use SMALL(array,2+22) to get the second smallest non-zero value.

Next was to use a CSE formula, which makes the array look like the last blue line in that post:

{FALSE;FALSE;3;4;5;6;7;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE}

and SMALL ignores the non-numeric values, to wit, all the FALSEs.

Finally, my formula from post 3 replaces the zeros or FALSEs with errors, and uses AGGREGATE which can ignore errors.

Make any more sense?
 
Upvote 0
...My test sheet is set up as follows:
A3:A17 contains letters which are A, B or C, the range I need to look up is in H3:I17 (first five cells in H contain A, next five B and final five C; I contains random numbers I wish to return).
Cell A3 contains A and as it is the first occurrence, in cell B3 I wish to return the value in cell I3 (A7 contains the second occurrence of A and I would want the value from cell I4 in B7)...
For the problem as described, you don't need SMALL, SUMPRODUCT, or CSE. See if the following formula works for you:

=INDEX($I$3:$I$17,MATCH(A3,$H$3:$H$17,0)-1+COUNTIF($A$3:A3,A3))
 
Last edited:
Upvote 0
Your array ends up looking like the fourth blue line from that post:

{0;0;3;4;5;6;7;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}

You may wish to use the Evaluate Formula tool on the Formulas tab to see that is so. A common misconception about SMALL is that it ignores duplicates, which is not so. Applying SMALL(array,2) to that array does not return 3. It returns 0. SMALL(array,3) returns 0. SMALL(array,4) returns 0. All the way to SMALL(array,22) returns 0. Then SMALL(array,23) finally returns 3. This is because there are 22 zeros in that array.

There have been 3 ways mentioned so far in this thread to get around that. One way is to find out how many zeros are in the array (22), then use SMALL(array,2+22) to get the second smallest non-zero value.

Next was to use a CSE formula, which makes the array look like the last blue line in that post:

{FALSE;FALSE;3;4;5;6;7;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE}

and SMALL ignores the non-numeric values, to wit, all the FALSEs.

Finally, my formula from post 3 replaces the zeros or FALSEs with errors, and uses AGGREGATE which can ignore errors.

Make any more sense?

Yes, got it now. Thaks.
 
Upvote 0

Forum statistics

Threads
1,214,643
Messages
6,120,702
Members
448,980
Latest member
CarlosWin

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top