# R-Squared/Correlation Question (w/ scatter plot)

#### Jennlax27iga0

##### New Member
Not sure if this is the correct forum, if not any suggestions on where to post would be welcome. I have graphed a scatter plot and trendline in Excel and have been trying to analyze the significance of the r-squared value. The graph covers 4 years so there are 4 points for each team graphed.

In looking at the graph I thought the group circled in red was impacting the r-squared value the most due to it's variance/distance from the trendline, however removing that group lowered the r-squared value. The two groups that actually had the greatest impact/ lowered the r-squared value the most are circled in green (the group with the third greatest impact is circled in purple). If someone can help give me some insight into why this is the case, I would greatly appreciate it (not sure if I'm not understanding the Excel calculation/formula, or the general concept). Thanks.

### Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
Not saying this is the best place to go for help, but it is a good start.

http://en.wikipedia.org/wiki/R-squared_value

A low R2 is saying there is too much variability in your model. The future outcomes are not predictable.

Let's say you have a restaurant. In order to figure out how much staff you need, you would use patrons or maybe even meals served as your workload factor because they are most likely the best predictor of how much staff is required.

The more patrons/meals, the higher the staff required >> predictable

The fewer patrons/meals, the fewer the staff required >> predictable

But instead of patrons or meals you build your model on number of windows in the restaurant. What does that have to do with how much staff is required?

It is not a good predictor.

Maybe the areas you have circled are what can be called outlliers. They fall outside of a certain number of standard deviations. Maybe for good reason, but they should not be part of the model. Could be they need their own model.

Another example, we had a graph like yours where there were two different sets of data points. Turned out, it was because they were two different aircraft types; therefore, they needed their own linear regression model.

Hope some of this helps...

Replies
1
Views
522
Replies
2
Views
5K
Replies
6
Views
21K
Replies
1
Views
2K
Replies
2
Views
691

1,212,059
Messages
6,105,659
Members
447,974
Latest member
misspancake

### We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.

### Which adblocker are you using?

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

### Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

### Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back