Need advice on DATA analysis or something similiar to that..

pedie

Well-known Member
Joined
Apr 28, 2010
Messages
3,875
I have a data for last 12 months for each business day [Col A = dates ] [Col B = Volume of Clients in bank] now i want to get the probalibility of how many customers might be present/volume of clients visiting us?

How is this study done in excel?
Please kindy help. Thanks for helping in advance.

:)
 

Some videos you may like

Excel Facts

Pivot Table Drill Down
Double-click any number in a pivot table to create a new report showing all detail rows that make up that number

energman58

Well-known Member
Joined
Oct 25, 2010
Messages
553
Two approaches you might try:

Use your data as is and run Rank and Percentile from the data analysis and then use the nth largest or smallest to describe your data - 95% of days we have more than x visits or whatever

or

Use Descriptive statistics from teh data analysis pack or the worksheet functions MEAN and STDEV to get the mean and standard deviations of your data and then assume your data is normally distributed (I think Descriptive Statistics gets you Kurtosos and Skewness measures so you can see how good/bad an assumption that is - Kurtosis is a measure of how "fat" or spiky a distribution is compared to the normal distribution while Skewness measures the degree of "meat" in each side of the distribution - positive skewness means the mean is larger than you would expect in a normal distribution).

You can then use teh worksheet function ZTEST to determine the probability of any sample (day) having more than X visits.

I think I would look at the days of the week too to see if they behave differently - you can test for that too using CHI-squared
 

pedie

Well-known Member
Joined
Apr 28, 2010
Messages
3,875
Energman58, thanks alot for advice. Very helping, i've heard of this term but never really used any of this things to come up with the answer...

Could you please show me small example?
By Percentile do you mean percentage similiar thing? See 'm not good and very new to this....

I also have heard bout ZTEST, Skewness, Kurtosos etc but only in terms and have never really used, so if you can post back with small exmaple to support both ways that'd be really great.


Thanks again.
Two approaches you might try:

Use your data as is and run Rank and Percentile from the data analysis and then use the nth largest or smallest to describe your data - 95% of days we have more than x visits or whatever

or

Use Descriptive statistics from teh data analysis pack or the worksheet functions MEAN and STDEV to get the mean and standard deviations of your data and then assume your data is normally distributed (I think Descriptive Statistics gets you Kurtosos and Skewness measures so you can see how good/bad an assumption that is - Kurtosis is a measure of how "fat" or spiky a distribution is compared to the normal distribution while Skewness measures the degree of "meat" in each side of the distribution - positive skewness means the mean is larger than you would expect in a normal distribution).

You can then use teh worksheet function ZTEST to determine the probability of any sample (day) having more than X visits.

I think I would look at the days of the week too to see if they behave differently - you can test for that too using CHI-squared

 

energman58

Well-known Member
Joined
Oct 25, 2010
Messages
553
This is the result of running Descriptive Statistics from some data I made up

<TABLE style="WIDTH: 116pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=154><COLGROUP><COL style="WIDTH: 68pt; mso-width-source: userset; mso-width-alt: 3291" width=90><COL style="WIDTH: 48pt" width=64><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 116pt; HEIGHT: 15pt; BORDER-TOP: windowtext 1pt solid; BORDER-RIGHT: #f0f0f0; mso-ignore: colspan" class=xl65 height=20 width=154 colSpan=2 align=middle>Column1</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20></TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0"></TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Mean</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>4.785714</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Standard Error</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>0.621754</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Median</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>4.5</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Mode</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>2</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Standard Deviation</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>2.326389</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Sample Variance</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>5.412088</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Kurtosis</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>-1.02413</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Skewness</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>0.38752</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Range</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>7</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Minimum</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>2</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Maximum</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>9</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Sum</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>67</TD></TR><TR style="HEIGHT: 15.75pt" height=21><TD style="BORDER-BOTTOM: windowtext 1pt solid; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15.75pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" class=xl64 height=21 align=left>Count</TD><TD style="BORDER-BOTTOM: windowtext 1pt solid; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" class=xl64 align=right>14</TD></TR></TBODY></TABLE>

Just click on data analysis tools and select descriptive statistics & select your data - you need to check the descriptive statistics box to get the above table

This describes your data

You can then use various statistical functions in Excel to determine things you might be interested in

Say your mean for visits in a day is 95 and teh Standard deviation is 6.2

If you wanted teh probability that you would get fewer than 100 visists in any day you would use

NORMDIST(100,95,6.2,TRUE) which gives you 0.79 - there is a 79% probability that you wil get fewer than 100 visits on any day or a 1-0.79 (21%) probaility of getting more than 100 and so on

You can also do a CHI Squared test to look at the hypothesis that all days are the same in this case you compute a CHI Squared statistic

you make a range which looks like this with the average visits for each day of the week compared with what you expect (in this case that 1/5 of visits happen on each day):

<TABLE style="WIDTH: 288pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=384><COLGROUP><COL style="WIDTH: 48pt" span=6 width=64><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 48pt; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 width=64></TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 48pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=64 align=left>mon</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 48pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=64 align=left>tues</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 48pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=64 align=left>wed</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 48pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=64 align=left>thur</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 48pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=64 align=left>fri</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Actual</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>95</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>99</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>100</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>95</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>95</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=20 align=left>Expected</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>96.8</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>96.8</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>96.8</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>96.8</TD><TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" align=right>96.8</TD></TR></TBODY></TABLE>

Use the function CHITEST for this - it is 0.99

This gives you a number which is teh probability that actual deviated from the expected by chance - in this case there is a high probability (99%) that the values deviate by chance

But if you change the actual value for Wednesdays in my example to say 200 the CHITEST gives you a very small number (2.96E-15) which says there is a very small probability that this is chance and not related to the day of the week

Depending on what values you get you might want to look at each day separately and look at the probabilities for each day - maybe you get more visits on a monday or something so you might want to say "30% of visits happen on Monday" or "on 95% of Mondays we get more than 130 visits" or something like that.

You could do similar analysis for different weeks or quarters - maybe the first week of the month has more activity or something. In each case you need to set up your data that gives you what you would expect if the visits were random so you could look at 1/12th in each month compared to the actuals and run your CHI SQUARED test


As a first step I would plot all your data out to see if there is a pattern to it - weekly or monthly or whatever before testing it
 

pedie

Well-known Member
Joined
Apr 28, 2010
Messages
3,875
Thank you soo much for detailed explaination and helping out...
'll go through today and try to get as much as possible and impress my bose:biggrin:

Thank you soo much, i really appriciate your help.

If someone can also add something to energyman then pls do.

Thanks again, i love this forum:)
 

Watch MrExcel Video

Forum statistics

Threads
1,123,158
Messages
5,600,054
Members
414,357
Latest member
Gemma_R

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top