Hi.
I now believe there is a much simpler solution than those presented in this thread. If you could just post a very small example dataset together with expected results then I'll provide you with the solution.
Regards
So I tried this solution and while it does return a non-error result it's not the same result you would get if you removed any rows with blank cells. If all blank cells are in the same rows for both the X and Y columns it works fine, but if there are blanks in either column that aren't in that row for the other column then the results are still skewed unfortunately. Unless I'm doing something wrong here...
It would be a huge help if someone could figure this out! I'm actually surprised the answer doesn't exist somewhere.... it seems incredibly useful to me to be able to have a huge set of data, point to the columns you want to regress and have it automatically ignore missing data. Especially if you are running a large number of regressions with a large amount of data.
Hi.
I now believe there is a much simpler solution than those presented in this thread. If you could just post a very small example dataset together with expected results then I'll provide you with the solution.
Regards
Advanced Excel Techniques: http://excelxor.com/
Maybe...
Hope this helps!Code:=LINEST(N(OFFSET(A3:A54,SMALL(IF(ISNUMBER(A3:A54),IF(ISNUMBER(B3:B13),ROW(A3:A54)-ROW(A3))),ROW(INDIRECT("1:"&SUM(IF(ISNUMBER(A3:A54),IF(ISNUMBER(B3:B13),1)))))),,1)),N(OFFSET(B3:B54,SMALL(IF(ISNUMBER(A3:A54),IF(ISNUMBER(B3:B13),ROW(A3:A54)-ROW(A3))),ROW(INDIRECT("1:"&SUM(IF(ISNUMBER(A3:A54),IF(ISNUMBER(B3:B13),1)))))),,1)))
Try this array formula**, based on data in A1:A100 and B50:B75:
=LINEST(INDEX(A:A,N(IF(1,MODE.MULT(IF(ISNUMBER(A1:A100),{1,1}*ROW(A1:A100)))))),INDEX(B:B,N(IF(1,MODE.MULT(IF(ISNUMBER(B 50:B75),{1,1}*ROW(B50:B75)))))))
It is not important that the two ranges being passed are of the same size, though of course it is important that the number of numbers in each is the same.
Regards
**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
Advanced Excel Techniques: http://excelxor.com/
Hey guys,
Thanks for the fast replies! Those don't seem to be exactly what I'm looking for though, unless I'm using them incorrectly. What I'm describing would be for a multivariate regression, on a data set where the Y, X1, X2, etc could all have randomly available data in different rows. The formula would have to first figure out which rows were occupied for all of the selected columns, and then only run the function on those rows. For example, it would only use the green rows here:
EDIT: table formatting
Y X1 X2 X3
4
3
2
3
1
0
4
3
2
8
1
3
5
6
5
8
9
1
2
4
5
Last edited by dnegovan; Jul 28th, 2015 at 07:14 PM.
Granted, I know I could easily just delete the rows with empty cells, but for hundreds of regressions I end up making hundreds of sheets with different smaller samples of the same original data. Just feels like there should be a better way to do it.
UPDATE: found a solution! I hope it's ok to post this link here:
https://newtonexcelbach.wordpress.co...ata-with-gaps/
The spreadsheet called LinEst-Poly.xls linked in that blog post has a custom function built in Visual Basic called LINESTgap that does exactly this! I just copied the module over to my spreadsheet and I am rocking and rolling.
That's great. The UDF will likely be more efficient, at least more efficient than my formula. For what it's worth, though, I'll post a modified version of my mine. I'm not a statistician, but if I understand correctly, and if there were no blank cells, the formula would be as follows...
And, if I continue to understand correctly, the expected results based on the sample data that includes blank cells should be...Code:=LINEST(A2:A7,B2:D7,TRUE,TRUE)
0.666667 -0.66667 0 0.333333
If I understood correctly, select F2:I2, enter the following formula, and confirm with CONTROL+SHIFT+ENTER...
If I didn't understand correctly, please ignore.Code:=LINEST(N(OFFSET(A2:A53,SMALL(IF(SUBTOTAL(3,OFFSET(A2:D53,ROW(A2:D53)-ROW(A2),,1))=COLUMNS(A2:D53),ROW(A2:A53)-ROW(A2)),ROW(INDIRECT("1:"&SUM(IF(SUBTOTAL(3,OFFSET(A2:D53,ROW(A2:D53)-ROW(A2),,1))=COLUMNS(A2:D53),1))))),,1)),N(OFFSET(B2:D53,SMALL(IF(SUBTOTAL(3,OFFSET(A2:D53,ROW(A2:D53)-ROW(A2),,1))=COLUMNS(A2:D53),ROW(A2:A53)-ROW(A2)),ROW(INDIRECT("1:"&SUM(IF(SUBTOTAL(3,OFFSET(A2:D53,ROW(A2:D53)-ROW(A2),,1))=COLUMNS(A2:D53),1))))),COLUMN(B2:D53)-COLUMN(B2),1,1)),TRUE,TRUE)
I posted a solution to this very problem not so long ago:
http://www.mrexcel.com/forum/excel-questions/865910-create-non-contiguous-range.html
Post #9 is the one which is of interest to you here, I believe.
Regards
Advanced Excel Techniques: http://excelxor.com/
If you are comfortable with VBA I would recommend these threads:
https://www.reddit.com/r/excel/comme...text=8&depth=9
https://stackoverflow.com/questions/...range#27138191
The first one uses a function to get move all the blank spaces to the bottom. The second shows how to call on LINEST inside VBA. I used them to create my one modified version of LINEST that can take a range with blank cells.
It has one problem though; the blank cells have to be the same for the x and y values.
My version (for finding the quadratic function):
Code:Function LINESTMOD(ry As Range, rx As Range, konst As Boolean, stats As Boolean) As Variant Dim vectorX() As Double Dim vectorY() As Double Y = NOBLANKS(ry) X = NOBLANKS(rx) lb = LBound(Y) ub = lb If Y(ub) = "" Then LINESTMOD = CVErr(xlErrRef) Exit Function Else While Y(ub + 1) <> "" ub = ub + 1 Wend End If 'you need to define matrix otherwise it doesn't work ReDim vectorX(lb To ub, 0 To 0) ReDim vectorY(lb To ub, 0 To 0) For i = lb To ub vectorX(i, 0) = X(i) vectorY(i, 0) = Y(i) Next LINESTMOD = Application.LinEst(vectorY, Application.Power(vectorX, Array(1, 2)), konst, stats) End Function
The function "NOBLANKS" can be found in the fist link above and you can use it as you would use the normal LINEST function.
If you want to find the linear function you just write "vectorX" instead of "Application.Power(vectorX, Array(1, 2))"
Like this thread? Share it with others