Help with simplification of IF/OR formula

dimexi

Board Regular
Joined
May 21, 2012
Messages
108
Hi All, I was wondering if I could get some help/advice.

I have two tables illustrated below:

Table 1

X values Y Values

1 100

2 200

3 300

4 400

5 500

6 600

7 700



Table 2

Entered X value Linearly interpolated Y

2.56

3.72

4.69

5.82



Now, table 2 is a user input, and my problem is getting a simplistic formula that can look up the entered X value in table 1, if it is not exact, linearly interpolate ONLY between 2 points and return the answer. For example, for 2.56, I would like the formula to go to table 1 and linearly interpolate between 2 and 3 and give the Y value.

I know I can achieve this with nested If/Or statements, that is, I could use =IF(OR(G11>B3,G11<B4,G11=B3,G11=B4),FORECAST(G11,C3:C4,B3:B4),"no") and in place of “no”, I would nest another If and OR to look at the next two pairs. The problem is there is a limit of 7 nested Ifs.

Can anyone help? I suspect the answer is an array formula, but I am not the best at those.



Hope it is clear, and Thank you
 

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
@dimexi, thanks for the feedback. Of the formula offerings, I think Jack's approach is the preferred one too. I'm glad you got something that works well for your application.

You expressed interest in the AGGREGATE function, so I'll offer some explanation of its use here because it might be handy to use for other purposes in the future.

Given two known pairs of (x,y) data...let's say (x1,y1) and (x3,y3), you'd like an interpolated value of y2 for some input of x2, where x1<=x2<=x3. The traditional point-slope formula can be derived by considering that all three points would lie on the same line, therefore any line segments connecting two of the points would have equal slopes, leading to an expression like this:
(y3-y1)/(x3-x1) = (y2-y1)/(x2-x1)
Then solving for y2, we get ===> y2 = (y3-y1)(x2-x1)/(x3-x1) + y1
My suggestion uses this explicit formula. Jack's and your original idea both make use of the more compact built-in FORECAST function, which essentially reduces to this same formula internally if only two pairs of (x,y) points are specified.

A common approach to finding a value in a list is to use INDEX/MATCH, which is found in Jack's solution: INDEX($B$3:$C$9, MATCH(G11,$B$3:$B$9,1)
The MATCH function uses the input value in cell G11 (treated as x2 in this example) and the "1" option will find the position of the largest x value in B3:B9 that is less than or equal to it. My implementation of AGGREGATE does something similar. Consider AGGREGATE(14,6,(ROW($A$2:$A$8)-ROW($A$1))/($A$2:$A$8<=E2),1) as an example:
ROW($A$2:$A$8) establishes an array of row numbers where the x-y values are found. This is equivalent to {2;3;4;5;6;7;8}, and since the idea is to use this construction as a list index, the starting position of the list must be taken into account by subtracting the row number just above the list, so (ROW($A$2:$A$8)-ROW($A$1)) does nothing more than create an array of {1;2;3;4;5;6;7}. This "index" array is divided by the array ($A$2:$A$8<=E2), where the logical comparison will create an array consisting of TRUEs and FALSEs, depending on whether the known x values are less than or equal to the input x2 value in cell E2 (in this case). The division operation will coerce the TRUEs and FALSEs to be treated as 1's and 0's, and the result will be an array that might look like this: {1;2;3;4;#DIV/0!;#DIV/0!;#DIV/0!}, where FALSE's converted to 0's give a division-by-zero error. That's okay, since the AGGREGATE function has invoked the "6" option which means that errors will be ignored. The correct interpretation of this hypothetical resultant array is that the first four items in the list are less than or equal to x2. We want the largest item in this list, and I assumed that the values would be monotonically increasing, so the 4th x-value would be taken as x1 in the equation above, and this would establish the beginning of the x range to use for the interpolation. In order to specify that the largest array value should be returned, the 14 option is used in AGGREGATE, which will return the nth largest value in the array ("n" is given by the last argument...in this case a 1). So AGGREGATE(14,6,{array},1) returns the largest array value while ignoring any error terms in the array. In some applications, this feature is very handy---to intentionally generate errors in the array to exclude certain values and then rely on the "ignore errors" option to prevent the function from crashing with an error message. Also, in some applications, you might want the 2nd largest value, so the number of the last argument would be changed to a 2, and so on, as is done with the LARGE function. Since the array values are row numbers converted to act as list indices, they work well within the INDEX function to return the desired x and y values for the interpolation. In cases where the next list item should be returned, I've added a 1 to the index value given by AGGREGATE. The biggest issue with my approach is that each x and y value is extracted from the table using a formula resembling this INDEX/AGGREGATE construction, which adds a lot of redundancy and doesn't take advantage of the more direct route to specify the ranges of interest. Incidentally, another common use of AGGREGATE is to return the nth smallest value in an array, which is done using 15 as the first argument.

One thing that I would caution with any of these approaches is to confirm that your x values are in ascending order. If not, then some adjustment to the formulas would be necessary.
 
Upvote 0
Nothing wrong with the formula that JD has already provided for you, this is purely because I have a dislike for volatile functions (OFFSET) and try to avoid them where possible,

=FORECAST(G11,INDEX($C$3:$C$9,N(IF({1},MATCH(G11,$B$3:$B$9)+{0,1}))),INDEX($B$3:$B$9,N(IF({1},MATCH(G11,$B$3:$B$9)+{0,1}))))
 
Upvote 0
Solution
Jason, I like this suggestion as an alternative to the volatile OFFSET. I'm curious about why the IF function is wrapped with an N function. If I look into the formula's execution (either Evaluate Formula or select a portion and hit F9), I see that the IF/MATCH components generates the desired array, but without the N wrapper, that array doesn't play well with INDEX. The N function returns the same values in an array (which is expected), but the N-generated array does play well with INDEX. Do you know why this is the case, or am I missing something? I would appreciate any insight.
 
Upvote 0
In all honesty, Kirk, I have absolutely no idea how or why it works, only that it does.

It was a method suggested somewhere to shorten an otherwise long formula, something like =SUM(VLOOKUP(N(IF{1},A1:A10)),...)) as opposed to =VLOOKUP(A1,...)+VLOOKUP(A2,...)+... I seem to recall that there was quite a lengthy discussion on it but nothing definitive. The only thing that I know for sure is that N(IF({1},...)) or T(IF({1},...)) allow you to pass arrays to some function arguments where they would not normally be accepted.

The first argument of a lookup function or the row / column of index are the most common uses for this but there are others, probably some that are yet to be discovered. I've used it in a couple of other things previously, tried searching my past replies for IF({1} but the forum search appears to ignore symbols and numbers. If I do find any other uses I'll forward the links to you.

edit:- This is one of the other threads that I was looking for

 
Last edited:
Upvote 0
Agree with @jasonb75 about avoiding volatile functions, as with most Excel questions, trade off between ease and "getting it done" vs doing it "properly".

For this, the OFFSET just makes it easier to see what is being picked up and passed as arguments to the FORECAST function, based on the calc'd matched rows - as you can see lots of ways to derive an answer.
 
Upvote 0
trade off between ease and "getting it done" vs doing it "properly"
Two factors that unfortunately both rely on the accuracy and consistency of the examples provided in the original question. Going by post 1, I would consider FORECAST(G11,$C$3:$C$4,$B$3:$B$4) to be the 'correct' answer.

The post shows that both the known x's and y's are on an ascending linear scale, which (I believe, please correct me if I'm wrong) means that the correct results could be obtained from any 2 of the known pairs, with the rest of the known values and a large part of the formulas that we suggested being superfluous :unsure:
 
Upvote 0
@KRice thank you so much for your well explained implementation of aggregate + the methodology behind your formula, it will definitely be useful for me in the future.

@jasonb75 Thank you for the less volatile suggestion. The 'N function' will definitely raise a lot of questions ;). Funny though re: the "correct answer" you refer to above, that is the formula I use as it stands. However, it is not dynamic enough and I have to change for every row.

FYIs, the x and ys in the original list are always in ascending order as this is how the data is generated. I wouldnt even bother with a list that wasn't in order. Also, to answer your final point @jasonb75 the values I provided in post 1 are dummy values!! I used them so testing any formula would be easy i.e. I would know what works and what doesn't since we all know what the interpolated value should be

Thanks guys for keeping the conversation going, I have found it very useful.
 
Upvote 0
trade off between ease and "getting it done" vs doing it "properly"
...a profound metaphor for much more than just Excel!

Jason, thanks for the commentary on the N function. It is interesting how it seems to condition an array so that it will be accepted. I'll have to keep this in mind when I run into situations where I know an array exists, but I can't process it further. I appreciate the tip!
 
Upvote 0
Nothing wrong with the formula that JD has already provided for you, this is purely because I have a dislike for volatile functions (OFFSET) and try to avoid them where possible,

=FORECAST(G11,INDEX($C$3:$C$9,N(IF({1},MATCH(G11,$B$3:$B$9)+{0,1}))),INDEX($B$3:$B$9,N(IF({1},MATCH(G11,$B$3:$B$9)+{0,1}))))
Hi again Jason, I have really been trying to understand this formula, but I am struggling. I have evaluated it like @KRice suggested, and I can see that each of the Index/If/Match sections provide the desired array, but I am stuck! My objective is usually to learn a formula in such a way that I understand every element and can troubleshoot/implement it in other applications etc.
The bit I dont get is the IF statement. What is the purpose of {1} as the logical test? what does that tell excel to do? Also the bit at the end of the MATCH function ({0,1})... When I isolate the IF function from the first bit of the formula, it equates to 2...
Summary: If you can, do you mind talking me through the logic? I am probably missing something
 
Upvote 0

Forum statistics

Threads
1,214,431
Messages
6,119,458
Members
448,899
Latest member
maplemeadows

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top