This is a discussion on Testing For Normality In Excel within the Excel Questions forums, part of the Question Forums category; I have a lot of data relating to meter readings. I believe that a number of the meters have failed ...
I have a lot of data relating to meter readings. I believe that a number of the meters have failed and the readings are invalid. I expect a population of working meters to have a normal distribution. What I want to do is test various subsets of my data for normality to identify which are probably failed.
Does anyone know of an Excel solution to this - I found one on the web but for this the critical value increases as the size of the sample rises which is obviously nonsense! I dont need anything particularly sophisticated but this is beyond my statistics knowledge to build from scratch.
There is a function for calculating normal distribution in Excel (can't remember what it is off the top of my head).
You could use that and test individual reading against it.
Flagellation, necrophilia and bestiality - Am I flogging a dead horse?
Office 2003/2007 (Work) and Office 2007 (Home)
Thanks - yes its called NORMINV or something and using chi squared to test if the sample matches a normal distribution with the same mean and standard deviation sort of works for me and is easy. I would still be interested in other approaches.
I can generate a normal probability plot and caculate the correlation of this which is a measure of "normality" but where I fall over is working out how to calculate the critical value of the correlation that woudl allow me to reject the null hypothesis (that the data isn't normal) or not.
You could create a normal probability plot of your data. If you have a lot of meter readings, this would be better than doing a statistical test (e.g. chi-square) which might detect any deviation from normality in large samples.
The subroutine below creates z-scores and normal scores from raw data. To use it, put your data in the first column of a worksheet called Data. Store the data in a range called dvec. Leave columns 2 and 3 empty and copy the code below to your workbook. When you run the code, it will put the z-scores in the second column and the normal scores in the third.
Create an x-y (scatter) chart with the z-scores on the x-axis and the normal scores on the y-axis. Insert a linear trend line in the chart. If the points are close to the trend line then your data are approximately normal.
Hope this helps. The code is shown below. If you or anyone need a hand, I can send you/him/her an Excel file where I used this sub.
Public Sub normal_scores()
'Returns the z and normal scores which can be used in a probaility plot.
'the data should be stored in a range called Dvec and in a worksheet called Data.
'in the active workbook.
'Sub adapted from Advanced Modelling in Finance Using Excel and VBA
'by Mary Jackson and Mike Stauton p. 61
'Published by Wiley.
Dim mean As Double, std As Double, ranks As Double, c_correction As Double
Dim j As Integer, n As Integer
Dim data_vec As Variant 'An object
'Read Data From Worksheet
'Set up the n x 2 matrix to store the z-scores and the normal scores.
Set data_vec = Sheets("Data").Range("Dvec")
n = Application.Count(data_vec)
'Calculate the z and normal scores for each data value.
'Write these to the 2nd and 3rd columns of the worksheet beginning in row 2.
mean = Application.Average(data_vec)
std = Application.StDev(data_vec)
For j = 1 To n
Cells(j + 1, 2).Value = (data_vec(j) - mean) / std 'z-score
ranks = Application.Rank(data_vec(j), data_vec, 1) 'Rank of jth element in data_vec
c_correction = (ranks - 0.375) / (n + 0.25) 'continuity correction
Cells(j + 1, 3).Value = Application.NormSInv(c_correction)