Regression

AudreyJ

New Member
Joined
Oct 4, 2006
Messages
7
Hi All,

Does anyone know how to predict a value from multiple dependant variables? I know how to run the data analysis tool to get the "residual output" but that's it. Also, I have variables which are drivers which are not numbers, they are characteristics like "region", like US, Japan or Europe. Does anyone know how I can account for these in my prediction?

In the "excel help", they say you can do the following:

For example, you can analyze how an athlete's performance is affected by such factors as age, height, and weight. You can apportion shares in the performance measure to each of these three factors, based on a set of performance data, and then use the results to predict the performance of a new, untested athlete.

I basically need to know how to do that.

Any help would be greatly appreciated:) I am very stressed here trying to figure this out!

- Audrey
 

Some videos you may like

Excel Facts

Format cells as time
Select range and press Ctrl+Shift+2 to format cells as time. (Shift 2 is the @ sign).

Oaktree

MrExcel MVP
Joined
Jun 20, 2002
Messages
7,984
Categorical variables are a bit different.

As an example, if gender is a variable, your data should include one column (of 1/0 for true/false) for EITHER male OR female. If you try to do both, Excel will give you an error message which says "Regression - LINEST() function returns error" because of the dependancy one variable has on another. In a generic sense, if you have n outcomes for your categorical variable, you should have n-1 columns (4 quarters means 3 variables, etc.) where the "nth" outcome is used as the base case.

Essentially, what your regression model will tell you is what impact the categorical variable has against the base value. In the gender example, if you have a column of 1's/0's telling you whether each subject is male, the regression output will tell you the impact being male has over/under the assumed base case of being female.
 

AudreyJ

New Member
Joined
Oct 4, 2006
Messages
7
Thanks NateO and Oaktree!

How would you handle a non-numerical driver such as geographical region which has more than 2 possibilities such as male vs. female?

- Audrey
 

AudreyJ

New Member
Joined
Oct 4, 2006
Messages
7

ADVERTISEMENT

P.S. Oaktree,

I understand your male/female example: use only 1 column to indicate if the characteristic is not male, or if it is male -- I think you were trying to illustrate multiple possibilities but I'm not sure I fully understand, my apologies, if you could explain a little further I would greatly appreciate it!

- Audrey
 

Oaktree

MrExcel MVP
Joined
Jun 20, 2002
Messages
7,984
Using your example of the U.S., Europe, and Japan, set up two columns: one with 1/0's corresponding to whether the record is from Europe and one with 1/0's corresponding to whether the record is from Japan.

The coefficient outputs from your regression model for your Europe and Japan variables will tell you the lift/penalty Europe and Japan get over/under the base of the U.S.
 

AudreyJ

New Member
Joined
Oct 4, 2006
Messages
7

ADVERTISEMENT

Thank you Oaktree, you are the best!

I used one set of multiple varibles, and the colums with 1s worked like a charm, then I included another set of mutiple variables (definitely used n-1), and for some reason I get the error you mentioned.

Do you have any idea why? I summed all of the columns, each one has at least 1 in it (had that problem before and fixed it), I'm definitely using n-1 columns.

If you have any insight, phalease let me know.

- Audrey
 

AudreyJ

New Member
Joined
Oct 4, 2006
Messages
7
Oaktree,

Regarding my last posting, please ignore it. I made a dumb mistake and corrected it :)

-Greta
 

AudreyJ

New Member
Joined
Oct 4, 2006
Messages
7
Hi there,

I'm still working on this regression analysis. Your help has been amazing! I have an additional question. Can anyone tell me why I might be getting "NUM!" as P-Values in my output? I'm not sure what that might mean. I deleted a few fields of data to increase my R Square and all of a sudden I'm getting "NUM!" as several of my P-Values.

- Greta
 

Watch MrExcel Video

Forum statistics

Threads
1,109,545
Messages
5,529,458
Members
409,879
Latest member
Aussie_Excel_Wanna_Be
Top