Breaking down the polynomial

duration17

New Member
Joined
Sep 26, 2006
Messages
9
Hello!

I am using INDEX() &LINEST() to get the coefficients for a quadratic polynomial regression (using =INDEX(LINEST(y:y,x:x^{1,2}),1,3) kind of formulas). I would like to understand every step of the calculation. Does anyone know how to break down the calculation, using SLOPE() and INTERCEPT() ?

Many thanks,
 

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
Hi duration17
Welcome to the board

According to the smiley, you don't seem to like the x^{1,2}.

Instead of a quadratic polynomial regression think of it as an OLS with 2 explaining variables. In that case you would get 2 slopes and the independent term, right?
Well, x^2 is your second explaining variable.

Now for the x^{1,2}

The function LINEST has two main parameters, the first is the Y values, and the second is a range with the values of all the explaining variables.

Assuming Y in A2:A6, with 2 explaining variables the Xs values could be B2:C6. That's what is in the example I post. I can use

In E4:
=LINEST(A2:A6,B2:C6)

However, the third column has, in fact, the square values of the second column because we are trying to explain Y with X and X^2.

It seems a bit dumb to write a third column with values that can be calculated from the second column. That's why we can use

In E3
=LINEST(A2:A6,B2:B6^{1,2})

The B2:B6^{1,2} means raise each value in B5:B6 first to the power of 1 and then to the power of 2. This calculates the values of B and C and so we don't need the column C.
This way we can calculate the OLS without writing the third column.

I'm sure this is not a good explanation but I hope it contributes to your better understanding of this OLS calculation.

Kind regards
PGC
Book1
ABCDEFGH
1YXX^2
2611m2m1b
31224132
42039132
530416
642525
7
8
Sheet1
 
Upvote 0
Hi pgc01,
Thank yo very much for reply, it is very insightful.
Do you have any idea how LINEST() calculates the slope and the intercept with two columns (x & x^2)?
Many thanks,
 
Upvote 0
Hi again

First one remark: ask one friend that did economics for his/her book of econometrics. This is usually a course in the first year at the university and explains in detail all about the OLS and its associated statistics.

The calculation of the parameters of the OLS is a simple calculus problem of finding a minimum. You take the partial derivatives of the coefficients and set them to zero. In the case of 2 explanatory variables you get a system with 3 linear equations.

For the case of 2 explanatory variables, you are trying to find the coefficients for an equation

Z = aX + bY +c

that minimise the sum of the squares of the differences between the estimated values using the equation and the real value in the sample. For one point the difference is

a*x_i + b*y_i + c - z_i

The sum of the squares of the differences:

SUM ( a*x_i + b*y_i + c - z_i )^2

now to find the minimum you take the partial derivatives to a, b and c and set them to zero. For example the case of "a"

SUM 2 * x_i * ( a*x_i + b*y_i + c - z_i ) = 0

or

SUM(x_i^2) * a + SUM( x_i * y_i) * b + SUM(x_i) * c = SUM(x_i * z_i)

Remember that we have all the x_i, y_i and z_i in our sample and so this is a simple equation like

10 * a + 23 * y + 14 * c = 34

We do the same for the other 2 coefficients and we have a system of 3 linear equations. From there we take the values of a, b and c.

In the next web page you have this in the section "Multiple linear regression". I think it's worth that you read the whole page

http://en.wikipedia.org/wiki/Multiple_regression

Hope this helps you understanding the mechanics of the calculation od the OLS coefficients
PGC
 
Upvote 0
PGC,

Thank you very much for your kind reply, you are very knowledgeable!
I have a few friends who did some econometrics but they have forgotten everything :(
I am going through the article from wikipedia, and could already calculate a simple linear regression manuall in excel.
Just a small question: what do "z" and "i" stand for?
 
Upvote 0
Hi again

I did a (dumb) example where you would try to explain the size of the jeans given the height (cm) and the weight (Kg) of a man.

In my last post X and Y are the explanatory variables, in this case height and weight. Z was the variable you were trying to explain, in this case the size of the jeans. "i" was the order of the triplet in the data sample. In the case of this example, for instance, using x, y and z of my last post for i=4

x_4 = 180
y_4 = 85
z_4 = 35

and using LINEST(A2:A11,B2:C11) we find

Z = -0.16 * X + 0.22 * Y + 45.2

In column D I calculated the values estimated and if this was a real example it would not be that bad (of course we haven't looked at the stats).

Hope this carifies some of the last post.
PGC
Book1
ABCDEFGHI
1SizeHeightWeightEstsize
2321807031.8Linest
3331807532.90.22-0.1645.2
4341808034.0
5351808535.1
6361809036.2
7311857031.0
8321857532.1
9331858033.2
10341858534.3
11361859035.4
12
13
Sheet5
 
Upvote 0
PGC,

I understand much better thanks to your explanations, thank you very much. Just one question: how do you get the -0.16, 0.22 and 45.2 values?

Tusharm,
Thank you for the website, I am reading and it is very well done!
 
Upvote 0
Hi again

If you mean how would you do it by hand, without LINEST, it would require some work. In this case, however, since there are only 2 explanatory variables I can show you how, with the values in the example in my last post.

I had derived the first of the 3 equations, by calculating the partial derivative with respect to "a".

SUM(x_i^2) * a + SUM( x_i * y_i) * b + SUM(x_i) * c = SUM(x_i * z_i)

You have to do the same thing for the other 2 coefficients and you get the other 2 equations.

You have already seen them in the "Multiple linear regression" section in http://en.wikipedia.org/wiki/Multiple_regression

They are

SUM( x_i * y_i) * a + SUM(y_i^2) * b + SUM(y_i) * c = SUM(y_i * z_i)

SUM(x_i) * a + SUM( y_i) * b + n = SUM(z_i)

Now we have to calculate all this sums. I post the calculations. You can see that they are very simple (but boring).

Substituting the values:

333125 * a + 146000 * b + 1825 * c = 61310
146000 * a + 64500 * b + 800 * c = 26990
1825 * a + 800 * b + 10 * c = 336

Solving the system:
a = -0.16
b = 0.22
c = 45.2

As we already knew from LINEST.

I hope this clarifies the calculations.
PGC
 
Upvote 0

Forum statistics

Threads
1,214,631
Messages
6,120,645
Members
448,974
Latest member
DumbFinanceBro

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top