how to run regression on data including non-numeric data?

Guest · Feb 25, 2002

Input range contains non-numeric data such as "#N/A". How can I run regression without deleting these non-numeric data one by one?

Mark W. · Feb 25, 2002

You could sort the #N/A errors to the bottom of your list and have the TREND() function ignore them. For example, suppose that A1:A5 contained {10;30;20;#N/A;40}. {=TREND(A1:A5)} entered into D1:D5 produces #VALUE! errors. Instead, enter the array formula...

{=TREND(A1:OFFSET(A1,SUM(ISNUMBER(A1:A5)+0)-1,))}

...into D1:D5. Next, enter the formula, =ROW()+A1*0, into B1 and fill down to B5. Now, sort on column B. Voila!! The #N/A errors are sorted to the bottom of the list and ignored by TREND().
This message was edited by Mark W. on 2002-02-25 15:36

Aladin Akyurek · Feb 25, 2002

On 2002-02-25 15:30, Mark W. wrote:
You could sort the #N/A errors to the bottom of your list and have the TREND() function ignore them. For example, suppose that A1:A5 contained {10;30;20;#N/A;40}. {=TREND(A1:A5)} entered into D1:D5 produces #VALUE! errors. Instead, enter the array formula...

{=TREND(A1:OFFSET(A1,SUM(ISNUMBER(A1:A5)+0)-1,))}

...into D1:D5. Next, enter the formula, =ROW()+A1*0, into B1 and fill down to B5. Now, sort on column B. Voila!! The #N/A errors are sorted to the bottom of the list and ignored by TREND().
This message was edited by Mark W. on 2002-02-25 15:36

How about:

=TREND(IF(ISNUMBER(A1:A5),A1:A5))

array-entered in D1:D5?

Mark W. · Feb 25, 2002

On 2002-02-25 15:39, Aladin Akyurek wrote:

On 2002-02-25 15:30, Mark W. wrote:
You could sort the #N/A errors to the bottom of your list and have the TREND() function ignore them. For example, suppose that A1:A5 contained {10;30;20;#N/A;40}. {=TREND(A1:A5)} entered into D1:D5 produces #VALUE! errors. Instead, enter the array formula...

{=TREND(A1:OFFSET(A1,SUM(ISNUMBER(A1:A5)+0)-1,))}

...into D1:D5. Next, enter the formula, =ROW()+A1*0, into B1 and fill down to B5. Now, sort on column B. Voila!! The #N/A errors are sorted to the bottom of the list and ignored by TREND().
This message was edited by Mark W. on 2002-02-25 15:36

Click to expand...

How about:

=TREND(IF(ISNUMBER(A1:A5),A1:A5))

array-entered in D1:D5?

Aladin, using my example =TREND(IF(ISNUMBER(A1:A5),A1:A5,)) would produce {10;30;20;0;40}, and a TREND result of {14;17;20;23;26} rather than {13;21;29;37}.
This message was edited by Mark W. on 2002-02-25 15:50

Aladin Akyurek · Feb 25, 2002

Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Addendum:

By the way, although it boils down to the same thing, after sorting the data in A,in D1:D5 we can array-enter:

=TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1))

It just occurred to me: Does the sorting affect the results, statistically speaking?
This message was edited by Aladin Akyurek on 2002-02-25 16:14

Mark W. · Feb 25, 2002

On 2002-02-25 16:02, Aladin Akyurek wrote:
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Didn't quite follow your "Even worse..." statement.

Duh, I get it now... you were referring to the TREND results.

This message was edited by Mark W. on 2002-02-25 16:12

Mark W. · Feb 25, 2002

On 2002-02-25 16:02, Aladin Akyurek wrote:
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Addendum:

By the way, although it boils down to the same thing, after sorting the data in A,in D1:D5 we can array-enter:

=TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1))

It just occurred to me: Does the sorting affect the results, statistically speaking?
This message was edited by Aladin Akyurek on 2002-02-25 16:14

>Does the sorting affect the results, statistically speaking?

This approach is no different the deleting the #N/A errors 1 by 1. In effect #N/A in this context means "no data". Since the original order of the data is unaffected by my approach I don't see how this will have a statistical impact. =TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1)) works fine too.
This message was edited by Mark W. on 2002-02-25 16:22

Aladin Akyurek · Feb 25, 2002

On 2002-02-25 16:19, Mark W. wrote:

On 2002-02-25 16:02, Aladin Akyurek wrote:
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Addendum:

By the way, although it boils down to the same thing, after sorting the data in A,in D1:D5 we can array-enter:

=TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1))

It just occurred to me: Does the sorting affect the results, statistically speaking?
This message was edited by Aladin Akyurek on 2002-02-25 16:14

Click to expand...

>Does the sorting affect the results, statistically speaking?

This approach is no different the deleting the #N/A errors 1 by 1. In effect #N/A in this context means "no data". Since the original order of the data is unaffected by my approach I don't see how this will have a statistical impact. =TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1)) works fine too.
This message was edited by Mark W. on 2002-02-25 16:22

Just did a small check: TREND produces the same seq of vals on a series that is scrambled.

Another thing: Why not substitute the mean of the series for the #N/A's (for the missing values, so to speak) if these values do not exceed, say, 5% of the series?

If this is sound, an additional column can be created using

=IF(ISNUMBER(A1),A1,AVERAGE(A1:A5))

which allows to apply TREND to the whole set as one would do ordinarily.

What do you think?

Guest · Feb 25, 2002

On 2002-02-25 16:31, Aladin Akyurek wrote:

On 2002-02-25 16:19, Mark W. wrote:

On 2002-02-25 16:02, Aladin Akyurek wrote:
Mark,

Even worse...

Showeling down #N/A's by sorting and determining the range with OFFSET to which to apply TREND is a good idea.

Addendum:

By the way, although it boils down to the same thing, after sorting the data in A,in D1:D5 we can array-enter:

=TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1))

It just occurred to me: Does the sorting affect the results, statistically speaking?
This message was edited by Aladin Akyurek on 2002-02-25 16:14

Click to expand...

>Does the sorting affect the results, statistically speaking?

This approach is no different the deleting the #N/A errors 1 by 1. In effect #N/A in this context means "no data". Since the original order of the data is unaffected by my approach I don't see how this will have a statistical impact. =TREND(OFFSET(A1,0,0,MATCH(9.99999999999999E+307,A:A),1)) works fine too.
This message was edited by Mark W. on 2002-02-25 16:22

Click to expand...

Just did a small check: TREND produces the same seq of vals on a series that is scrambled.

Another thing: Why not substitute the mean of the series for the #N/A's (for the missing values, so to speak) if these values do not exceed, say, 5% of the series?

If this is sound, an additional column can be created using

=IF(ISNUMBER(A1),A1,AVERAGE(A1:A5))

which allows to apply TREND to the whole set as one would do ordinarily.

What do you think?

> Just did a small check: TREND produces the same seq of vals on a series that is scrambled.

That's not my finding. If after sorting the #N/A errors to bottom I substitute RAND() for ROW()+A1*0 I get different TREND() results.

>Why not substitute the mean of the series for the #N/A's

It's not the same. Try it! Bottom line: the order and magnitude of these values affect the regression line.

Mark W. · Feb 25, 2002

That was me above. My logon must have timed out...

I also wanted to point out that AVERAGE(A1:A5) where A1:A5 contains an #N/A error will produce #N/A.

how to run regression on data including non-numeric data?

Guest

Guest

Excel Facts

Mark W.

MrExcel MVP

Aladin Akyurek

MrExcel MVP

Mark W.

MrExcel MVP

Aladin Akyurek

MrExcel MVP

Mark W.

MrExcel MVP

Mark W.

MrExcel MVP

Aladin Akyurek

MrExcel MVP

Guest

Guest

Mark W.

MrExcel MVP

Similar threads

Forum statistics

Share this page

how to run regression on data including non-numeric data?

Guest

Guest

Excel Facts

MrExcel MVP

MrExcel MVP

MrExcel MVP

MrExcel MVP

MrExcel MVP

MrExcel MVP

MrExcel MVP

Guest

Guest

MrExcel MVP

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock