# Need advice on data analysis or something similar to that

#### pedie

##### Well-known Member
I have data for the last 12 months for each business day [Col A = dates] [Col B = volume of clients in the bank]. Now I want to get the probability of how many customers might visit us on a given day.

How is this kind of study done in Excel?

#### energman58

##### Well-known Member
Two approaches you might try:

Use your data as is and run Rank and Percentile from the Data Analysis tools, then use the nth largest or smallest value to describe your data, e.g. "on 95% of days we have more than x visits" or whatever.

or

Use Descriptive Statistics from the Data Analysis pack, or the worksheet functions AVERAGE and STDEV, to get the mean and standard deviation of your data, and then assume your data is normally distributed. (I think Descriptive Statistics also gives you kurtosis and skewness measures, so you can see how good or bad an assumption that is. Kurtosis is a measure of how "fat" or spiky a distribution is compared to the normal distribution, while skewness measures the degree of "meat" in each side of the distribution: positive skewness means a longer tail on the high side, so the mean is pulled above the median.)

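You can then use the worksheet function NORMDIST to determine the probability of any single day having more than X visits. (ZTEST, by contrast, tests whether the mean of a whole sample differs from a hypothesized mean, so it is not the right tool for a single day.)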
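To make the skewness and kurtosis check above a bit more concrete, here is a minimal Python sketch. It assumes the same sample formulas that Excel's SKEW and KURT worksheet functions use; the function names `excel_skew` and `excel_kurt` and the visit counts are made up for illustration.

```python
import statistics

def excel_skew(values):
    """Sample skewness, using the formula behind Excel's SKEW."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation, like STDEV
    total = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * total

def excel_kurt(values):
    """Sample excess kurtosis, using the formula behind Excel's KURT."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    total = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * total - correction

# Hypothetical daily visit counts with one unusually busy day
visits = [88, 92, 95, 97, 99, 101, 104, 93, 96, 120]
print("mean:", statistics.mean(visits))
print("stdev:", statistics.stdev(visits))
print("skewness:", excel_skew(visits))
print("kurtosis:", excel_kurt(visits))
```

Values near zero for both measures suggest the normal assumption is reasonable; the busy day in this made-up sample pushes the skewness above zero.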

I think I would look at the days of the week too, to see if they behave differently. You can test for that as well, using a chi-squared test.
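For the first approach above (rank and percentile), a percentile is simply the value below which a given percentage of your days fall. A rough Python sketch, assuming linear interpolation between ranked values (the convention Excel's PERCENTILE function uses); the helper name `percentile_inc` and the visit counts are hypothetical:

```python
def percentile_inc(values, p):
    """Return the p-th percentile (p between 0 and 1) by linear interpolation."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1)
    lower = int(position)          # index of the ranked value just below
    fraction = position - lower    # how far to move towards the next value
    if lower + 1 >= len(ordered):
        return ordered[-1]
    return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])

# Hypothetical daily visit counts
visits = [88, 92, 95, 97, 99, 101, 104, 93, 96, 120]

# On roughly 95% of days we expect more visits than the 5th percentile
threshold = percentile_inc(visits, 0.05)
print("On about 95% of days we have more than", threshold, "visits")
```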

#### pedie

##### Well-known Member
Energman58, thanks a lot for the advice. Very helpful. I've heard of these terms but never really used any of these things to come up with an answer...

Could you please show me a small example?
By percentile, do you mean something similar to a percentage? See, I'm not good at this and very new to it...

I have also heard about ZTEST, skewness, kurtosis etc., but only as terms, and have never really used them, so if you can post back with a small example to support both ways that'd be really great.

Thanks again.

#### energman58

##### Well-known Member
This is the result of running Descriptive Statistics on some data I made up:

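| Statistic | Value |
|---|---|
| Mean | 4.785714 |
| Standard Error | 0.621754 |
| Median | 4.5 |
| Mode | 2 |
| Standard Deviation | 2.326389 |
| Sample Variance | |
| Kurtosis | -1.02413 |
| Skewness | 0.38752 |
| Range | |
| Minimum | 2 |
| Maximum | 9 |
| Sum | |
| Count | 14 |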

Just click on the Data Analysis tools, select Descriptive Statistics, and select your data. You need to check the "Summary statistics" box to get the above table.

You can then use various statistical functions in Excel to determine things you might be interested in.

Say your mean for visits in a day is 95 and the standard deviation is 6.2.

If you wanted the probability that you would get fewer than 100 visits on any day, you would use

NORMDIST(100,95,6.2,TRUE), which gives you 0.79. That means there is a 79% probability that you will get fewer than 100 visits on any day, or a 1-0.79 (21%) probability of getting more than 100, and so on.
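If you want to check that number outside Excel, the same cumulative normal calculation can be sketched in Python with the standard library's `statistics.NormalDist`. The mean of 95 and standard deviation of 6.2 are the example figures above; the variable names are just for illustration.

```python
from statistics import NormalDist

# Normal distribution with the example mean and standard deviation
daily_visits = NormalDist(mu=95, sigma=6.2)

# Equivalent of NORMDIST(100, 95, 6.2, TRUE): probability of fewer than 100 visits
p_fewer = daily_visits.cdf(100)

# Probability of more than 100 visits is the complement
p_more = 1 - p_fewer

print(round(p_fewer, 2), round(p_more, 2))  # 0.79 0.21
```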

You can also do a chi-squared test of the hypothesis that all days are the same. In this case you compute a chi-squared statistic.

Make a range which looks like this, with the average visits for each day of the week compared with what you would expect (in this case, that 1/5 of visits happen on each day):

| | Mon | Tue | Wed | Thu | Fri |
|---|---|---|---|---|---|
| Actual | 95 | 99 | 100 | 95 | 95 |
| Expected | 96.8 | 96.8 | 96.8 | 96.8 | 96.8 |

Use the function CHITEST for this. Here it returns 0.99.

This number is the probability of seeing deviations from the expected values at least this large purely by chance. In this case the probability is high (99%), so the differences between days are consistent with chance.

But if you change the actual value for Wednesday in my example to, say, 200 (so that the expected value for each day becomes 116.8), CHITEST gives you a very small number (2.96E-15). That says there is a very small probability that the deviation is just chance, so it is likely related to the day of the week.
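The CHITEST figures above can be reproduced by hand. This is only a sketch: it assumes a single row of actual values, so the degrees of freedom are the number of days minus one, and it uses a closed-form tail formula that is valid only when that number is even (it is 4 here). The function name `chi_squared_p_value` is made up.

```python
import math

def chi_squared_p_value(actual, expected):
    """P-value of a chi-squared goodness-of-fit test for one row of counts."""
    df = len(actual) - 1
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles an even number of degrees of freedom")
    # Chi-squared statistic: sum of (actual - expected)^2 / expected
    stat = sum((a - e) ** 2 / e for a, e in zip(actual, expected))
    half = stat / 2
    # For even df: P(X > stat) = exp(-stat/2) * sum over k < df/2 of (stat/2)^k / k!
    tail = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * tail

# The table from the post: expected is 1/5 of the total for each day
actual = [95, 99, 100, 95, 95]
expected = [sum(actual) / len(actual)] * len(actual)
p_even = chi_squared_p_value(actual, expected)
print(p_even)   # about 0.99

# Wednesday changed to 200; expected values are recalculated from the new total
spike = [95, 99, 200, 95, 95]
spike_expected = [sum(spike) / len(spike)] * len(spike)
p_spike = chi_squared_p_value(spike, spike_expected)
print(p_spike)  # about 2.96e-15
```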

Depending on what values you get, you might want to look at each day separately and work out the probabilities for each day. Maybe you get more visits on a Monday, so you might want to say "30% of visits happen on Monday" or "on 95% of Mondays we get more than 130 visits" or something like that.

You could do a similar analysis for different weeks or quarters; maybe the first week of the month has more activity. In each case you need to set up the expected values, i.e. what you would see if visits were spread evenly, so you could compare 1/12th in each month with the actuals and run your chi-squared test.

As a first step, I would plot all your data to see if there is a pattern to it (weekly, monthly or whatever) before testing it.
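A statement like "30% of visits happen on Monday" comes from grouping the daily figures by weekday. A small Python sketch of that grouping, assuming the data is a list of (date, visits) pairs like columns A and B; the dates, counts and the name `share_by_weekday` are invented for illustration:

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def share_by_weekday(records):
    """Return each weekday's share of total visits from (date, visits) pairs."""
    totals = defaultdict(int)
    for day, visits in records:
        totals[WEEKDAYS[day.weekday()]] += visits
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}

# Hypothetical week of data (1 January 2024 was a Monday)
records = [
    (date(2024, 1, 1), 130),
    (date(2024, 1, 2), 95),
    (date(2024, 1, 3), 90),
    (date(2024, 1, 4), 92),
    (date(2024, 1, 5), 93),
]

shares = share_by_weekday(records)
for name, share in shares.items():
    print(f"{name}: {share:.0%} of visits")
```

The same grouping could collect the individual daily values per weekday instead of the totals, which is what you would need for a statement about "95% of Mondays".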

#### pedie

##### Well-known Member
Thank you so much for the detailed explanation and for helping out...
I'll go through it today and try to get as much as possible and impress my boss.

Thank you so much, I really appreciate your help.

If someone can add something to energman58's answer, then please do.

Thanks again, I love this forum.

