SEMI-covariance matrix in VBA

Excel Facts

Excel Wisdom
Using a mouse in Excel is the work equivalent of wearing a lanyard when you first get to college
VBA Code:
Function SemiVar(r As Range) As Double
  ' shg 2019
  ' https://www.investopedia.com/terms/s/semivariance.asp
 
  Dim adRt          As Variant
  Dim i             As Long
  Dim j             As Long
  Dim dAvg          As Double
  Dim dSumSq        As Double

  If r.Cells.Count > 1 Then
    adRt = r.Value2

    For i = 1 To UBound(adRt, 1)
      For j = 1 To UBound(adRt, 2)
        dAvg = dAvg + adRt(i, j)
      Next j
    Next i

    dAvg = dAvg / ((i - 1) * (j - 1))

    For i = 1 To UBound(adRt, 1)
      For j = 1 To UBound(adRt, 2)
        If adRt(i, j) < dAvg Then dSumSq = dSumSq + (dAvg - adRt(i, j)) ^ 2
      Next j
    Next i
 
    SemiVar = dSumSq / ((i - 1) * (j - 1))
  End If
End Function
 
Upvote 0
@shg

When I am using it as a array formula I am not getting the output.

1.2400​
3.4000​
4.8600​
4.9200​
7.5600​
7.8900​
9.3800​
1.8700​
3.1400​
4.0400​
5.7000​
7.2800​
6.6900​
8.0600​
2.4000​
3.6600​
3.1800​
6.5300​
7.7800​
9.4100​
10.6500​
3.1300​
2.1300​
6.2000​
6.2900​
6.9400​
10.0800​
7.8900​
1.4800​
5.1600​
4.2700​
5.9300​
5.6600​
6.8800​
9.6500​
1.3500​
5.3600​
6.8300​
6.1300​
5.9400​
7.6900​
12.0400​
2.8400​
4.3600​
5.2400​
7.7400​
10.7400​
10.8000​
10.1000​
4.8400​
4.2100​
8.5000​
7.2000​
9.7400​
8.2900​
9.4700​
4.7700​
6.7300​
8.8800​
10.4100​
10.4400​
13.3900​
9.4700​
1.2600​
7.6700​
8.5300​
5.6500​
10.3700​
11.2700​
10.8600​

I am putting SemiVar(B2:H11) and I am getting 4.3444 across cells.
 
Upvote 0
My bad; misread the algorithm.

VBA Code:
Function SemiVar(r As Range) As Variant
  ' shg 2019
  ' https://www.investopedia.com/terms/s/semivariance.asp

  Dim n             As Long     ' num obervations, then num oservations < avg
  Dim adRt          As Variant  ' input data
  Dim i             As Long     ' row index
  Dim j             As Long     ' column index
  Dim dAvg          As Double
  Dim dSumSq        As Double

  n = r.Cells.Count
  If n = 1 Then
    SemiVar = CVErr(xlErrDiv0)
  
  Else
    adRt = r.Value2

    For i = 1 To UBound(adRt, 1)
      For j = 1 To UBound(adRt, 2)
        dAvg = dAvg + adRt(i, j)
      Next j
    Next i

    dAvg = dAvg / n
    n = 0

    For i = 1 To UBound(adRt, 1)
      For j = 1 To UBound(adRt, 2)
        If adRt(i, j) < dAvg Then
          n = n + 1
          dSumSq = dSumSq + (dAvg - adRt(i, j)) ^ 2
        End If
      Next j
    Next i

    If n = 0 Then SemiVar = CVErr(xlErrDiv0) Else SemiVar = dSumSq / n
  End If
End Function
 
Upvote 0
Assuming that's correct, the second formula gives the same result and is surely faster:

A​
B​
C​
D​
E​
F​
G​
H​
I​
J​
1​
1.24​
3.40​
4.86​
4.92​
7.56​
7.89​
9.38​
8.944390​
I1: =SemiVar(A1:G10)
2​
1.87​
3.14​
4.04​
5.70​
7.28​
6.69​
8.06​
3​
2.40​
3.66​
3.18​
6.53​
7.78​
9.41​
10.65​
8.944390​
I3: {=AVERAGE(IF(A1:G10 < AVERAGE(A1:G10), (AVERAGE(A1:G10) - A1:G10)^2))}
4​
3.13​
2.13​
6.20​
6.29​
6.94​
10.08​
7.89​
5​
1.48​
5.16​
4.27​
5.93​
5.66​
6.88​
9.65​
6​
1.35​
5.36​
6.83​
6.13​
5.94​
7.69​
12.04​
7​
2.84​
4.36​
5.24​
7.74​
10.74​
10.80​
10.10​
8​
4.84​
4.21​
8.50​
7.20​
9.74​
8.29​
9.47​
9​
4.77​
6.73​
8.88​
10.41​
10.44​
13.39​
9.47​
10​
1.26​
7.67​
8.53​
5.65​
10.37​
11.27​
10.86​
 
Upvote 0
@shg

Two leading points on the subject:
1. I found another post where TusharM cites a formula based method, but the results of the above UDF method do not match with formula method. The link is:
Formula or UDF to calculate Semi Variance

2. In the original post from 2010 there is also an output of covariance matrix for the above data; can that be incorporated via a UDF?
SEMI-covariance matrix in VBA - Help needed!

3. In the post of @tusharm; and @Bob Rooney; shares a long UDF but the results from his UDF also don't match.
 
Upvote 0
Maybe you could post that info here with a sample dataset and results known to be correct.
 
Upvote 0
Investopedia's formula (the link in my code), {=AVERAGE(IF(AVERAGE(r) > r, (AVERAGE(r) - r)^2))} calculates the variance of the values below the dataset mean about the dataset mean.

The formula {=VAR(IF(r < AVERAGE(r), r))} calculates the variance of the values below the dataset mean about their mean.

TusharM's formula {=VAR(IF(r < AVERAGE(r), r, 0))}, calculates the variance of something I can't describe.

I have no idea which, if any, is "correct", or if there is a single accepted definition.
 
Upvote 0
I also saw two other definition on the web, one of which calculated the standard deviation instead of the variance, and one of which divided by n-1 instead of n (sample vs population). Seems pretty sketchy.
 
Upvote 0

Forum statistics

Threads
1,214,520
Messages
6,120,008
Members
448,935
Latest member
ijat

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top