Best way to show "variance" in set of values

JenniferMurphy

Well-known Member
I am working on a worksheet to analyze blood sugar readings for diabetics. One of the columns contains the actual readings in mg/dL. As diabetic treatment progresses, the goal is to lower blood sugar levels, but also to get the readings to be more consistent (lower the variability). I would appreciate suggestions for the best measure of variability for the purposes of comparing readings for different time periods so as to be able to assess progress.

The table below has 4 sample readings (rows 5-8) over 4 different time periods (columns D-G) showing improvement in the readings. Rows 9-13 show the maximum, average, median, middle, and minimum of these values. Row 14 shows the range, which is one measure of variability. Rows 15 & 167 show the variance and std dev using built-in Excel functions, other measures of variability. I find the variance difficult to fathom. The std dev seems more comprehensible.

Rows 17-19 contain some measures I concocted. Each of them calculate the percentage that one half of one of the measures is of the average or middle values. For example, D17 shows that half of the range (183 / 2 = 91.5) is 27% of the average (338), plus or minus. D18 shows a similar result, but compared to the middle value. D19 does a similar calculation but using the std dev.

I would appreciate any comments or suggestions for better measures.

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.

6StringJazzer

Well-known Member
The variance is the square of the standard deviation. The standard deviation is probably the best measure to use for what you have described.

I am not sure of the logic behind your figures in H17:H19. I also can't sort out what those @ signs are all about. I would just use the standard deviation.

6StringJazzer

Well-known Member
My comment above is if you are just looking for overall consistency. For a more sophisticated analysis, this is a situation that might lend itself to statistical process control, which is a method to monitor consistency, but usually used in manufacturing. This is pretty similar. But you would need more than 4 data points.

AhoyNC

Well-known Member
You might also want to see if there is an actual statistical significance difference between the samples using a paired t-test or ANOVA.
A google search will give you info on both.

JenniferMurphy

Well-known Member

The variance is the square of the standard deviation. The standard deviation is probably the best measure to use for what you have described.
In general, I would agree, but here I am trying for a measure that will show how two or more sets of readings compare with each other with respect to the goal of lowering blood sugar readings into the target range. I am afraid that the standard deviation is a bit too technical and a bit too abstract for my audience.

I do have other measures (not included in the example) that show how far the readings are from the target. Those are great. I would like to add one or two measures that analyze the readings as sets.

I am not sure of the logic behind your figures in H17:H19.
Those are an attempt to come up with a measure that can be expressed in percentages. Since the absolute numbers are constantly changing, absolute differences can be misleading. The range, for example, for a set with max & min = 300 & 250 is the same as for a set with max & min of 125 & 75, but the numbers in the first are much closer together percentage-wise.

I also can't sort out what those @ signs are all about. I would just use the standard deviation.
That is something that was added when I moved the workbook from Excel 2007 to 2016. I got a #SPILL! error, which must be something new since 2007. It then added @ sings in front of each of my named ranges. I haven't had a chance to read up on what is going on.

JenniferMurphy

Well-known Member
My comment above is if you are just looking for overall consistency. For a more sophisticated analysis, this is a situation that might lend itself to statistical process control, which is a method to monitor consistency, but usually used in manufacturing. This is pretty similar. But you would need more than 4 data points.
I'll look into that, but I think I need something fairly simple and, more importantly, intuitive.

JenniferMurphy

Well-known Member

You might also want to see if there is an actual statistical significance difference between the samples using a paired t-test or ANOVA.
A google search will give you info on both.
Thanks, but I think that is too sophisticated and esoteric for my audience -- and maybe for me, too.

JenniferMurphy

Well-known Member
I've done a little more work and come up with what I think are better measures for my purposes. Here's the new table:

Column D contains sample readings that are dangerously high.
Column H contains readings that are ideal.
Columns E-G contain readings that gradually improve from those in D to those in H.

The second set of readings (rows 9-13) contain what I think are the most useful statistics.
The third set (rows 14-15) contain statistics that are also useful.
The last set (rows 16-19) contain statistics that are not that useful for what I need.

AhoyNC

Well-known Member
If you are just looking to present the data and things like standard deviation are too technical for your audience then I would agree that rows 9-13 (plus I would add row 16 the median) would be best suited for your audience. You may want to use a box & whisker graph (which I think are in Excel 2016) which will show how the samples are improving.

6StringJazzer

Well-known Member
I have no knowledge of this domain so for me, I am just looking at numbers. Given that, I have these comments:

The average deviation is probably very useful here, although I am not a statistician.

Column D contains sample readings that are dangerously high.
Column H contains readings that are ideal.
Columns E-G contain readings that gradually improve from those in D to those in H.
If that is the case then these columns should be labelled; at a minimum D and H should be labelled.

I also can't reconcile that description with the comments in rows 5-8.

You said that the goal is to get consistent readings, but the readings in column H, which you say are ideal, do not seem very consistent to me. They are more consistent than the other columns, but there is still a high variability in that the highest reading is more than 50% higher than the lowest reading.

Replies
2
Views
98
Replies
10
Views
985
Replies
0
Views
224
Replies
1
Views
354
Replies
1
Views
663

1,141,060
Messages
5,704,041
Members
421,323
Latest member
Exidous

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back