how to run regression on data including non-numeric data?

G

Guest

Guest
Input range contains non-numeric data such as "#N/A". How can I run regression without deleting these non-numeric data one by one?
 

Excel Facts

How to total the visible cells?
From the first blank cell below a filtered data set, press Alt+=. Instead of SUM, you will get SUBTOTAL(9,)
You could sort the #N/A errors to the bottom of your list and have the TREND() function ignore them. For example, suppose that A1:A5 contained {10;30;20;#N/A;40}. {=TREND(A1:A5)} entered into D1:D5 produces #VALUE! errors. Instead, enter the array formula...

{=TREND(A1:OFFSET(A1,SUM(ISNUMBER(A1:A5)+0)-1,))}

...into D1:D5. Next, enter the formula, =ROW()+A1*0, into B1 and fill down to B5. Now, sort on column B. Voila!! The #N/A errors are sorted to the bottom of the list and ignored by TREND().
This message was edited by Mark W. on 2002-02-25 15:36
 
Upvote 0
On 2002-02-25 15:30, Mark W. wrote:
You could sort the #N/A errors to the bottom of your list and have the TREND() function ignore them. For example, suppose that A1:A5 contained {10;30;20;#N/A;40}. {=TREND(A1:A5)} entered into D1:D5 produces #VALUE! errors. Instead, enter the array formula...

{=TREND(A1:OFFSET(A1,SUM(ISNUMBER(A1:A5)+0)-1,))}

...into D1:D5. Next, enter the formula, =ROW()+A1*0, into B1 and fill down to B5. Now, sort on column B. Voila!! The #N/A errors are sorted to the bottom of the list and ignored by TREND().
This message was edited by Mark W. on 2002-02-25 15:36

How about:

=TREND(IF(ISNUMBER(A1:A5),A1:A5))

array-entered in D1:D5?
 
Upvote 0
On 2002-02-25 15:39, Aladin Akyurek wrote:
On 2002-02-25 15:30, Mark W. wrote:
You could sort the #N/A errors to the bottom of your list and have the TREND() function ignore them. For example, suppose that A1:A5 contained {10;30;20;#N/A;40}. {=TREND(A1:A5)} entered into D1:D5 produces #VALUE! errors. Instead, enter the array formula...

{=TREND(A1:OFFSET(A1,SUM(ISNUMBER(A1:A5)+0)-1,))}

...into D1:D5. Next, enter the formula, =ROW()+A1*0, into B1 and fill down to B5. Now, sort on column B. Voila!! The #N/A errors are sorted to the bottom of the list and ignored by TREND().
This message was edited by Mark W. on 2002-02-25 15:36

How about:

=TREND(IF(ISNUMBER(A1:A5),A1:A5))

array-entered in D1:D5?

Aladin, using my example =TREND(IF(ISNUMBER(A1:A5),A1:A5,)) would produce {10;30;20;0;40}, and a TREND result of {14;17;20;23;26} rather than {13;21;29;37}.
This message was edited by Mark W. on 2002-02-25 15:50
 
Upvote 0
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Addendum:

By the way, although it boils down to the same thing, after sorting the data in A,in D1:D5 we can array-enter:

=TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1))

It just occurred to me: Does the sorting affect the results, statistically speaking?
This message was edited by Aladin Akyurek on 2002-02-25 16:14
 
Upvote 0
On 2002-02-25 16:02, Aladin Akyurek wrote:
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Didn't quite follow your "Even worse..." statement. :confused:

Duh, I get it now... you were referring to the TREND results. :)
This message was edited by Mark W. on 2002-02-25 16:12
 
Upvote 0
On 2002-02-25 16:02, Aladin Akyurek wrote:
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Addendum:

By the way, although it boils down to the same thing, after sorting the data in A,in D1:D5 we can array-enter:

=TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1))

It just occurred to me: Does the sorting affect the results, statistically speaking?
This message was edited by Aladin Akyurek on 2002-02-25 16:14

>Does the sorting affect the results, statistically speaking?

This approach is no different the deleting the #N/A errors 1 by 1. In effect #N/A in this context means "no data". Since the original order of the data is unaffected by my approach I don't see how this will have a statistical impact. =TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1)) works fine too.
This message was edited by Mark W. on 2002-02-25 16:22
 
Upvote 0
On 2002-02-25 16:19, Mark W. wrote:
On 2002-02-25 16:02, Aladin Akyurek wrote:
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Addendum:

By the way, although it boils down to the same thing, after sorting the data in A,in D1:D5 we can array-enter:

=TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1))

It just occurred to me: Does the sorting affect the results, statistically speaking?
This message was edited by Aladin Akyurek on 2002-02-25 16:14

>Does the sorting affect the results, statistically speaking?

This approach is no different the deleting the #N/A errors 1 by 1. In effect #N/A in this context means "no data". Since the original order of the data is unaffected by my approach I don't see how this will have a statistical impact. =TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1)) works fine too.
This message was edited by Mark W. on 2002-02-25 16:22

Just did a small check: TREND produces the same seq of vals on a series that is scrambled.

Another thing: Why not substitute the mean of the series for the #N/A's (for the missing values, so to speak) if these values do not exceed, say, 5% of the series?

If this is sound, an additional column can be created using

=IF(ISNUMBER(A1),A1,AVERAGE(A1:A5))

which allows to apply TREND to the whole set as one would do ordinarily.

What do you think?
 
Upvote 0
On 2002-02-25 16:31, Aladin Akyurek wrote:
On 2002-02-25 16:19, Mark W. wrote:
On 2002-02-25 16:02, Aladin Akyurek wrote:
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Addendum:

By the way, although it boils down to the same thing, after sorting the data in A,in D1:D5 we can array-enter:

=TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1))

It just occurred to me: Does the sorting affect the results, statistically speaking?
This message was edited by Aladin Akyurek on 2002-02-25 16:14

>Does the sorting affect the results, statistically speaking?

This approach is no different the deleting the #N/A errors 1 by 1. In effect #N/A in this context means "no data". Since the original order of the data is unaffected by my approach I don't see how this will have a statistical impact. =TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1)) works fine too.
This message was edited by Mark W. on 2002-02-25 16:22

Just did a small check: TREND produces the same seq of vals on a series that is scrambled.

Another thing: Why not substitute the mean of the series for the #N/A's (for the missing values, so to speak) if these values do not exceed, say, 5% of the series?

If this is sound, an additional column can be created using

=IF(ISNUMBER(A1),A1,AVERAGE(A1:A5))

which allows to apply TREND to the whole set as one would do ordinarily.

What do you think?

> Just did a small check: TREND produces the same seq of vals on a series that is scrambled.

That's not my finding. If after sorting the #N/A errors to bottom I substitute RAND() for ROW()+A1*0 I get different TREND() results.

>Why not substitute the mean of the series for the #N/A's

It's not the same. Try it! Bottom line: the order and magnitude of these values affect the regression line.
 
Upvote 0
That was me above. My logon must have timed out...

I also wanted to point out that AVERAGE(A1:A5) where A1:A5 contains an #N/A error will produce #N/A.
 
Upvote 0

Forum statistics

Threads
1,214,411
Messages
6,119,356
Members
448,888
Latest member
Arle8907

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top