Multiple OLS Regressions

realtylerdurdon

New Member
Joined
Jan 20, 2004
Messages
44
Hi Forum,

I have a long list of variables that looks like this one:

Year;IndustryCode;Sales;EBIT;...
...
...
...


I would like to do simple OLS regressions with LINEST() based on the combination of Year and IndustryCode. For each combination (e.g. Year 1999 and IndustryCode 23) I need a separate regression (all in all over 300 regressions). The number of records for these "blocks" varies and I need a trick to tell the LINEST() to choose the right arrays.

Any ideas?

Thanks a lot in advance!

Michael
 
Hi Michael

I'll be glad to help but I can only do it Sunday night or Monday.

Thanks alot. Monday is still fine, I am not in an extreme hurry.

When you use more explaining variables can I consider that the columns are sequential, meaning the first column Y, and the next the Xs, or will you need some different layout?

The layout you propose would work great. The macro would need to be capable of adjusting the number of Xs (e.g. 1 to 10 Xs). With comments in the code I can do this by hand.

Another layout problem, maybe more complex is the result layout. Now with more explaining variables and the stats how do you want to have the result? can you post an example?

It would be ideal if the results would be in a single line. If this is too hard to do, the results could be in the usual 5 rows x (n+1) columns array (n=number of variables + 1 for the potential constant b) and have the combination of Year and SIC as identifier in the columns to the left of this array (sorry, html maker is not working for me):

1999;23;ROW1COL1ofResultArray;ROW1COL2ofResultArray;...
1999;23;ROW2COL1ofResultArray;ROW2COL2ofResultArray;...
1999;23;ROW3COL1ofResultArray;ROW3COL2ofResultArray;...
1999;23;ROW4COL1ofResultArray;ROW4COL2ofResultArray;...
1999;23;ROW5COL1ofResultArray;ROW5COL2ofResultArray;...
1999;24;ROW1COL1ofResultArray;ROW1COL2ofResultArray;...
...


If you want to try to get the stats, just add 2 parameters True to the linest and the variable Lin will be loaded with all the values in a 2-dimentional array(5,2).

Tried that and got runtime error 9, subscript out of range. Sorry, my VBA skills are nearly nonexistent... ;)

Thanks alot!
Michael
 
Upvote 0

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
Hi Michael

This solution includes a variable number of explaining variables and returns also the auxilliary stats of the OLS.

Assuming the following setup:

- Columns A:M house your raw data, with the headers in Row 1.

- Column A:Year, column B: Code, Column C: Y values, columns D:M X values of the explaining variables for the regression, maximum of 10 explaining variables.

- The raw data is sorted first key Year, second key Code.

- In N2: The number of explaining variables you want to use. The code will use this number of explaining variables, in columns from D to the right.

- In columns O:P, the table with the unique combinations of Year and Code

The code will write the OLS values, including the stats, in the columns to the right of column P.

Remarks

1 - I changed the order of the values in the result.

The default order in LINEST for the coefficients is the reverse order of the explaining variables. Ex you have the variables X1, X2, X3 and the line parameters come in this order m3, m2, m1, b
I changed this and give you the line coefficients in the same order as the explaining variables, with the independent term at the end. So, for the variables X1, X2, X3 the order from left to right in the cells of the worksheeet is m1, m2, m3, b. I find this more logical. I hope you agree.

2 - Why is it a fixed number of columns for the explaining variables

I find it easier like this because when I did stats I remember we had to run lots of regressions with not only different numbers of explaining variables, but also different sets of explaining variables. With this setup you can import all the raw data into the spreadsheet and just by moving the columns around and changind the value in N2 you can run a lot of different regressions without importing data again.

3 - Each time the code starts the results of the previous OLS are erased.

Please try the code and let me know how it behaves.
PGC

P. S. The results part of the worksheet comes in the next post.

Code:
Sub LinestYrCd()
Dim rY As Range, rCY As Range, rFilt As Range, vLin
Dim iEV As Integer, i As Integer

Application.ScreenUpdating = False
ActiveSheet.AutoFilterMode = False

Columns("Q:AP").ClearContents
iEV = Range("N2").Value ' Number of explaining variables
With Range("q1")        ' Write the headers
    For i = 1 To iEV
        .Offset(, i - 1) = "m" & i
        .Offset(, iEV + i) = "se" & i
    Next
    .Offset(, iEV) = "b"
    .Offset(, 2 * iEV + 1).Resize(1, 7) = Array("seb", "r2", "sey", "F", "df", "ssreg", "sresid")
End With

Set rY = Range("o2", Range("o" & Rows.Count).End(xlUp))

For Each rCY In rY
    Range("A:M").AutoFilter Field:=1, Criteria1:=rCY.Value
    Range("A:M").AutoFilter Field:=2, Criteria1:=rCY.Offset(, 1).Value
    Set rFilt = Range("C2:M" & Range("C" & Rows.Count).End(xlUp).Row).SpecialCells(xlCellTypeVisible)

    vLin = WorksheetFunction.LinEst(rFilt.Columns(1), rFilt.Columns(2).Resize(, iEV), True, True)
    ' Writes the coefficients and their standard errors
    For i = 1 To iEV + 1
        rCY.Offset(, 2 + iEV - i + IIf(i = iEV + 1, i, 0)) = vLin(1, i)
        rCY.Offset(, 3 + 2 * iEV - i + IIf(i = iEV + 1, i, 0)) = vLin(2, i)
    Next
    'Writes the other stats
    For i = 1 To 3
        rCY.Offset(, 2 + 2 * i + 2 * iEV) = vLin(2 + i, 1)
        rCY.Offset(, 3 + 2 * i + 2 * iEV) = vLin(2 + i, 2)
    Next
    ActiveSheet.AutoFilterMode = False
Next
Application.ScreenUpdating = True
End Sub
ols.xls
ABCDEFGHIJKLMN
1YearCodeYX1X2X3X4X5X6X7X8X9X10NExplVars
2199925121231112
319992516525123
4199925181029143
5199925241566177
6199925261669198
7199925281772224
819992314145111
919992316345134
1019992319648136
1119992324955178
12199923271256211
1320002313133104
1420002318535136
15200023241541167
16
17
Sheet1
 
Upvote 0
Now the results:

Remark: as expected, since the combination 2000-23 has only 3 Y values and we are using 2 explaining variables, the result is exact.
ols.xls
OPQRSTUVWXYZAAABAC
1YearCodem1m2bse1se2sebr2seyFdfssregsresid
21999250.611570.0975219.5272730.2096820.0573341.1259820.9811031.11488877.879433193.60443.728926
31999230.8562020.312843-0.888040.1051960.0871543.7214020.9988090.265052838.82812117.85950.140505
42000234.5-6.5223000100060.666676.46E-27
5
6
Sheet1
 
Upvote 0
Hi pgc01,

thanks alot! All the judgements you made (i.e. changing output order, fixed amount of X columns) are perfect. Unfortunately when I run the macro I get "runtime error 1004: Application defined or object defined error". Just FYI: This happend to me on my Desktop (German settings) ans on my laptop (US settings).

The status quo after the error is:
- Applied Autofilter from colums A to M
- Applied filter criteria that doesn't show any values (seems to be wrong?)
- Wrote headers in line 1 starting in column Q

One more little thing: How can I change whether a constant is used in the LINEST regression or not? Could you insert a switch for this, like the number of Xs? That would be great!

Thanks alot for your valuable help!
Michael
 
Upvote 0
Hi Michael

You are right, the second line (Applied filter criteria that doesn't show any values ) in your status quo should never happen. This would mean that you are choosing a combination of Year-Code that does not exist in your data.

We'll have to debug it. Please do the following test (I just did it and it worked).

Open a new excel. Paste the exact values in my example to columns A:P. You can paste them directly from the post.

Insert a module and paste the code.

Run the code

This should run with no problems.

Also in the test you did before, check if there were Year-Code combinations in the columns O:P that did not exist in the raw data table in the columns A:B.

Please let me know what happens.
PGC
Book1
ABCDEFGHIJKLMNOP
1YearCodeYX1X2X3X4X5X6X7X8X9X10NExplVarsYearCode
2199925121231112199925
319992516525123199923
4199925181029143200023
5199925241566177
6199925261669198
7199925281772224
819992314145111
919992316345134
1019992319648136
1119992324955178
12199923271256211
1320002313133104
1420002318535136
15200023241541167
Sheet1
 
Upvote 0
It is working now. You were right, there was a hickup in the unique combinations array. Your macro is a great help for me, I am very thankful.

In order to eliminate the constant in the regression, can I simply replace the first TRUE argument in the LINEST() in line 26 or is it more complicated?

Thanks again!
Michael
 
Upvote 0
Hi Michael

I'm glad it works!

You are right, to include or not the independent term in the ols estimation you just have to change the value of the third parameter of the LINEST.

However you can also specify it in the worksheet and use it in the macro.
With this modified macro you enter in N4 the boolean value TRUE or FALSE and this value will be used in LINEST.

Remark: if N4 = FALSE, the standard error for the independent term will be Not Available.

Hope this solves your problem
PGC

P. S. The results of my example will show in the next post.

Code:
Sub LinestYrCd()
Dim rY As Range, rCY As Range, rFilt As Range, vLin
Dim iEV As Integer, i As Integer, bConst As Boolean

Application.ScreenUpdating = False
ActiveSheet.AutoFilterMode = False

Columns("Q:AP").ClearContents
iEV = Range("N2").Value ' Number of explaining variables
bConst = Range("N4").Value ' OLS with constant term?
With Range("q1")        ' Write the headers
    For i = 1 To iEV
        .Offset(, i - 1) = "m" & i
        .Offset(, iEV + i) = "se" & i
    Next
    .Offset(, iEV) = "b"
    .Offset(, 2 * iEV + 1).Resize(1, 7) = Array("seb", "r2", "sey", "F", "df", "ssreg", "sresid")
End With

Set rY = Range("o2", Range("o" & Rows.Count).End(xlUp))

For Each rCY In rY
    Range("A:M").AutoFilter Field:=1, Criteria1:=rCY.Value
    Range("A:M").AutoFilter Field:=2, Criteria1:=rCY.Offset(, 1).Value
    Set rFilt = Range("C2:M" & Range("C" & Rows.Count).End(xlUp).Row).SpecialCells(xlCellTypeVisible)

    vLin = WorksheetFunction.LinEst(rFilt.Columns(1), rFilt.Columns(2).Resize(, iEV), bConst, True)
    ' Writes the coefficients and their standard errors
    For i = 1 To iEV + 1
        rCY.Offset(, 2 + iEV - i + IIf(i = iEV + 1, i, 0)) = vLin(1, i)
        rCY.Offset(, 3 + 2 * iEV - i + IIf(i = iEV + 1, i, 0)) = vLin(2, i)
    Next
    'Writes the other stats
    For i = 1 To 3
        rCY.Offset(, 2 + 2 * i + 2 * iEV) = vLin(2 + i, 1)
        rCY.Offset(, 3 + 2 * i + 2 * iEV) = vLin(2 + i, 2)
    Next
    ActiveSheet.AutoFilterMode = False
Next
Application.ScreenUpdating = True
End Sub
ols.xls
ABCDEFGHIJKLMN
1YearCodeYX1X2X3X4X5X6X7X8X9X10NExplVars
2199925121231112
319992516525123b
4199925181029143FALSE
5199925241566177
6199925261669198
7199925281772224
819992314145111
919992316345134
1019992319648136
1119992324955178
12199923271256211
1320002313133104
1420002318535136
15200023241541167
16
17
Sheet1
 
Upvote 0
And here the results.

Remark: As expected, with no independent term the 2 variables can no longer explain completely the 3 Y values in the case 2000-23
ols.xls
OPQRSTUVWXYZAAABAC
1YearCodem1m2bse1se2sebr2seyFdfssregsresid
21999250.1530380.36924300.8747220.205108#N/A0.5301454.8145112.2566324104.615392.71807
31999230.8798760.29208200.0289690.004263#N/A0.9987750.2194731223.3683117.85550.144505
42000230.5171720.40484800.1732540.043427#N/A0.9596971.56367111.90593158.22162.445067
5
6
Sheet1
 
Upvote 0

Forum statistics

Threads
1,214,833
Messages
6,121,868
Members
449,053
Latest member
Mesh

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top