Curvilinear correlation

Jim G · Oct 2, 2002

When you perform linear regression in Excel, you have the option of adding a trendline to the chart. You can also add the linear equation and r square.

Reclaculation of the r square amount via the RSQ function yields the same number as displayed on the graph.

The same options are available for curvilinear regression (i.e. exponential, logrythmic, power, etc.).

However, the displayed r square value cannot be calculated via the RSQ function for the curvilinear regression. In fact, r square analysis is only applicable to linear regression.

My question is this: How is Excel calculating the r square value (on the trendline graph) for the curvilinear data and what does this r square value represent?

There are various methodologies for computing the equivalent to an r square value, i.e. correlation ratio or ETA coefficient. However, these are not an r square computation.

Damon Ostrander · Oct 3, 2002

Hi Jim,

The method of calculating the r^2 value is described in the Excel helps under the topic heading "Equations for calculating trendlines." What the description there doesn't mention is what SSE and SST mean. The SSE is the sum of the squared residuals between the line and the data values, and this is the value that the least squares methodology is minimizing. The SST value is the variance of the data values. As a result of this definition the r-value is a measure of how well the line was able to fit the data. A value of 1 indicates a perfect fit, meaning that all the data points lie exactly on the line. A value of 0 means a "perfectly awful" fit.

For the linear case the r value is the same as the Pearson product moment correlation coefficient, the value returned by the Excel RSQ function, but this is only true for the first-order (linear) case, not in the general (curvilinear) case. The RSQ r value is defined as

r = cov(x,y)/(sx sy)

where cov is the covariance, sx and sy are the standard deviations of x and y.

The nice thing about the 1 - SSE/SST definition of r is that this value will continue to provide a "goodness of fit" measure even in the curvilinear case so that you can tell if adding more terms to the equation, or using different types of fits produces a quantitively better fit. One caveat regarding this though: Excel uses a "transformed regression model" for types of fits other than the power series polynomial fit. As a result the least squares solution will not be the true minimum variance solution but will for example be the minimum of the squares of the logs of the residuals if a log fit is being done. This of course means also that the r-value will be skewed by the transformation as well, but the value should still lie between 0 and 1. But you cannot assume that an r-value of 0.9 for a log fit represents a better fit than an r-value of 0.8 for a power series fit.

I hope this helps.

IML · Oct 3, 2002

Damon,
Do you know of a good statical site for a novice who needs to brush up? Tom Urtis was kind enough to provide me one the other day, but the more the better.

PS - didn't realize you were a fellow mile-higher. Is this a new development?

Damon Ostrander · Oct 3, 2002

Hi IML,

I am not aware of specific statistical sites on the web, but would expect there are lots of them.

I have been in the Denver area for many years, but now live in Littleton. I play the piano at the North Woods Inn every Wednesday night 5:30-9:30 if you would like to come down some time (will be switching to Saturday nights in a few weeks).

I hope my answer to your regression question helped. Just try not to regress yourself. :wink:

Damon

IML · Oct 3, 2002

I work in cherry creek but live in Highlands Ranch. Take the dogs to chatfield right by the northwoods inn on a regular basis. Look for someone carrying a lap top murmering something about r squared's or coefficient correlations.

Jim G · Oct 14, 2002

Damon,

Many thanks for the info. I went back to the stat books while awaiting a reply. I was able to work through the necessary linear data transformations of the power, logarithmic, and exponential graphs.

Thank you again for responding! I found a copy of SPSS on the web and am delving into it as a more advanced statistical package.

Jim G

Curvilinear correlation

Jim G

New Member

Excel Facts

Damon Ostrander

MrExcel MVP

IML

MrExcel MVP

Damon Ostrander

MrExcel MVP

IML

MrExcel MVP

Jim G

New Member

Similar threads

Forum statistics

Share this page

Curvilinear correlation

Jim G

New Member

Excel Facts

Damon Ostrander

MrExcel MVP

IML

MrExcel MVP

Damon Ostrander

MrExcel MVP

IML

MrExcel MVP

Jim G

New Member

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock