Blank Cells and LINEST

Tony Collins

New Member
Joined
Jun 5, 2009
Messages
2
I have been trying to use LINEST for the analysis of a large sets of data but have just noticed that this function does not allow for empty cells in the data set. Apart from the tredious possibility of removing the blank cells (which is not really possible for the large quantity of data involved) does anybody have any suggestions?

I have tried using the following but have not had any success:
=LINEST(IF(A3:A54=0,"",A3:A54),IF(B3:B54=0,"",B3:B54),,TRUE)

=LINEST(IF(ISNUMBER(A3:A54),A3:A54,""),IF(ISNUMBER(B3:B54),B3:B54,""))

I do not want the formula to take the empty cells as being 0 but to exclude them from the calculation.

All help would be greatly appreciated.
 
Andrew's formula can be modified as follows...

Code:
=LINEST(N(OFFSET(A3:A54,SMALL(IF(B3:B54<>0,ROW(B3:B54)-ROW(B3)),
     ROW(INDIRECT("1:"&COUNT(B3:B54)))),,1)),N(OFFSET(B3:B54,
          SMALL(IF(B3:B54<>0,ROW(B3:B54)-ROW(B3)),ROW(INDIRECT("1:"&
               COUNT(B3:B54)))),,1)))

...confirmed with CONTROL+SHIFT+ENTER. If, however, the so-called empty cells actually contain formula blanks (""), the formula would need to be modified further.

Edit: Actually, I guess it should be the other way around...

Code:
=LINEST(N(OFFSET(A3:A54,SMALL(IF(A3:A54<>0,ROW(A3:A54)-ROW(A3)),
     ROW(INDIRECT("1:"&COUNT(A3:A54)))),,1)),N(OFFSET(B3:B54,
          SMALL(IF(A3:A54<>0,ROW(A3:A54)-ROW(A3)),ROW(INDIRECT("1:"&
               COUNT(A3:A54)))),,1)))

So I tried this solution and while it does return a non-error result it's not the same result you would get if you removed any rows with blank cells. If all blank cells are in the same rows for both the X and Y columns it works fine, but if there are blanks in either column that aren't in that row for the other column then the results are still skewed unfortunately. Unless I'm doing something wrong here...

It would be a huge help if someone could figure this out! I'm actually surprised the answer doesn't exist somewhere.... it seems incredibly useful to me to be able to have a huge set of data, point to the columns you want to regress and have it automatically ignore missing data. Especially if you are running a large number of regressions with a large amount of data.
 
Upvote 0

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
Hi.

I now believe there is a much simpler solution than those presented in this thread. If you could just post a very small example dataset together with expected results then I'll provide you with the solution.

Regards
 
Upvote 0
Maybe...

Code:
=LINEST(N(OFFSET(A3:A54,SMALL(IF(ISNUMBER(A3:A54),IF(ISNUMBER(B3:B13),ROW(A3:A54)-ROW(A3))),ROW(INDIRECT("1:"&SUM(IF(ISNUMBER(A3:A54),IF(ISNUMBER(B3:B13),1)))))),,1)),N(OFFSET(B3:B54,SMALL(IF(ISNUMBER(A3:A54),IF(ISNUMBER(B3:B13),ROW(A3:A54)-ROW(A3))),ROW(INDIRECT("1:"&SUM(IF(ISNUMBER(A3:A54),IF(ISNUMBER(B3:B13),1)))))),,1)))

Hope this helps!
 
Upvote 0
Try this array formula**, based on data in A1:A100 and B50:B75:

=LINEST(INDEX(A:A,N(IF(1,MODE.MULT(IF(ISNUMBER(A1:A100),{1,1}*ROW(A1:A100)))))),INDEX(B:B,N(IF(1,MODE.MULT(IF(ISNUMBER(B50:B75),{1,1}*ROW(B50:B75)))))))

It is not important that the two ranges being passed are of the same size, though of course it is important that the number of numbers in each is the same.

Regards


**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
 
Upvote 0
Hey guys,

Thanks for the fast replies! Those don't seem to be exactly what I'm looking for though, unless I'm using them incorrectly. What I'm describing would be for a multivariate regression, on a data set where the Y, X1, X2, etc could all have randomly available data in different rows. The formula would have to first figure out which rows were occupied for all of the selected columns, and then only run the function on those rows. For example, it would only use the green rows here:

EDIT: table formatting

YX1X2X3

4
3
2
3
1
0
4
3
2
8

1
3
5
6
5
8
9

1
2
4
5

<tbody>
</tbody>



 
Last edited:
Upvote 0
Granted, I know I could easily just delete the rows with empty cells, but for hundreds of regressions I end up making hundreds of sheets with different smaller samples of the same original data. Just feels like there should be a better way to do it.
 
Upvote 0
That's great. The UDF will likely be more efficient, at least more efficient than my formula. For what it's worth, though, I'll post a modified version of my mine. I'm not a statistician, but if I understand correctly, and if there were no blank cells, the formula would be as follows...

Code:
=LINEST(A2:A7,B2:D7,TRUE,TRUE)

And, if I continue to understand correctly, the expected results based on the sample data that includes blank cells should be...

0.666667-0.6666700.333333

<tbody>
</tbody>

If I understood correctly, select F2:I2, enter the following formula, and confirm with CONTROL+SHIFT+ENTER...

Code:
=LINEST(N(OFFSET(A2:A53,SMALL(IF(SUBTOTAL(3,OFFSET(A2:D53,ROW(A2:D53)-ROW(A2),,1))=COLUMNS(A2:D53),ROW(A2:A53)-ROW(A2)),ROW(INDIRECT("1:"&SUM(IF(SUBTOTAL(3,OFFSET(A2:D53,ROW(A2:D53)-ROW(A2),,1))=COLUMNS(A2:D53),1))))),,1)),N(OFFSET(B2:D53,SMALL(IF(SUBTOTAL(3,OFFSET(A2:D53,ROW(A2:D53)-ROW(A2),,1))=COLUMNS(A2:D53),ROW(A2:A53)-ROW(A2)),ROW(INDIRECT("1:"&SUM(IF(SUBTOTAL(3,OFFSET(A2:D53,ROW(A2:D53)-ROW(A2),,1))=COLUMNS(A2:D53),1))))),COLUMN(B2:D53)-COLUMN(B2),1,1)),TRUE,TRUE)

If I didn't understand correctly, please ignore. :)
 
Upvote 0
I have been trying to use LINEST for the analysis of a large sets of data but have just noticed that this function does not allow for empty cells in the data set. Apart from the tredious possibility of removing the blank cells (which is not really possible for the large quantity of data involved) does anybody have any suggestions?

I have tried using the following but have not had any success:
=LINEST(IF(A3:A54=0,"",A3:A54),IF(B3:B54=0,"",B3:B54),,TRUE)

=LINEST(IF(ISNUMBER(A3:A54),A3:A54,""),IF(ISNUMBER(B3:B54),B3:B54,""))

I do not want the formula to take the empty cells as being 0 but to exclude them from the calculation.

All help would be greatly appreciated.

If you are comfortable with VBA I would recommend these threads:
https://www.reddit.com/r/excel/comm...pt_to_ignore_blank/cnhbk5h/?context=8&depth=9
https://stackoverflow.com/questions...ba-call-using-array-instead-of-range#27138191


The first one uses a function to get move all the blank spaces to the bottom. The second shows how to call on LINEST inside VBA. I used them to create my one modified version of LINEST that can take a range with blank cells.


It has one problem though; the blank cells have to be the same for the x and y values.

My version (for finding the quadratic function):
Code:
Function LINESTMOD(ry As Range, rx As Range, konst As Boolean, stats As Boolean) As Variant
    Dim vectorX() As Double
    Dim vectorY() As Double
    Y = NOBLANKS(ry)
    X = NOBLANKS(rx)
    lb = LBound(Y)
    ub = lb
    If Y(ub) = "" Then
        LINESTMOD = CVErr(xlErrRef)
        Exit Function
    Else
        While Y(ub + 1) <> ""
            ub = ub + 1
        Wend
    End If
    'you need to define matrix otherwise it doesn't work
    ReDim vectorX(lb To ub, 0 To 0)
    ReDim vectorY(lb To ub, 0 To 0)
    
    For i = lb To ub
        vectorX(i, 0) = X(i)
        vectorY(i, 0) = Y(i)
    Next


    LINESTMOD = Application.LinEst(vectorY, Application.Power(vectorX, Array(1, 2)), konst, stats) 
End Function



The function "NOBLANKS" can be found in the fist link above and you can use it as you would use the normal LINEST function.
If you want to find the linear function you just write "vectorX" instead of "Application.Power(vectorX, Array(1, 2))"
 
Upvote 0

Forum statistics

Threads
1,214,622
Messages
6,120,580
Members
448,972
Latest member
Shantanu2024

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top