Statistics: Understanding and Exploring Assumptions

It is vital in statistics to ensure that the assumptions of a statistical test are met. Different tests carry different assumptions, which depend on the test being carried out. The assumptions help ensure that the result obtained is not biased, so that the tests are reliable, objective, and accurate. They also allow tests to be carried out under controlled conditions and make comparison easier, since it is impossible to consider every variable involved in a test objectively (Weinberg, 2008, p. 690). Failure to adhere to the assumptions renders the test biased, inaccurate, and not objective with respect to the experiment being carried out.

Day one hygiene scores appear normally distributed, judging by the bell shape of the normal curve plotted over the histogram. In the P-P plot, the expected cumulative probability is almost equal to the observed cumulative probability, which supports the conclusion that the data set is normally distributed.

Day two hygiene scores do not appear normally distributed, as the normal curve plotted over the histogram is skewed to the right. In the P-P plot, the expected cumulative probability also departs considerably from the observed cumulative probability.

Day three hygiene scores do not appear normally distributed, as the normal curve plotted over the histogram is skewed to the right. In the P-P plot, the expected cumulative probability also departs considerably from the observed cumulative probability.

Standard Descriptive Statistics for Hygiene Data Set.

Hygiene (Download Festival)

Statistic                 Day 1      Day 2      Day 3
N (valid)                 810        264        123
N (missing)               0          546        687
Mean                      1.7934     .9609      .9765
Std. Error of Mean        .03319     .04436     .06404
Median                    1.7900     .7900      .7600
Mode                      2.00       .23        .44(a)
Std. Deviation            .94449     .72078     .71028
Variance                  .892       .520       .504
Skewness                  8.865      1.095      1.033
Std. Error of Skewness    .086       .150       .218
Kurtosis                  170.450    .822       .732
Std. Error of Kurtosis    .172       .299       .433
Range                     20.00      3.44       3.39
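
The standard errors of skewness and kurtosis in the table depend only on the number of valid cases. The following is a minimal sketch, assuming the usual large-sample formulas for these standard errors (the function names are illustrative and are not part of SPSS). Under that assumption it should give values close to the ones reported above for N = 810, 264, and 123.

```python
import math


def se_skewness(n: int) -> float:
    """Standard error of sample skewness for n valid cases (assumed formula)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n: int) -> float:
    """Standard error of sample excess kurtosis for n valid cases (assumed formula)."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


# Valid N for each festival day, taken from the hygiene table.
valid_n = {"Day 1": 810, "Day 2": 264, "Day 3": 123}

for day, n in valid_n.items():
    print(f"{day}: N = {n}, "
          f"SE skewness = {se_skewness(n):.3f}, "
          f"SE kurtosis = {se_kurtosis(n):.3f}")
```

Because both quantities shrink as N grows, the same skewness value is estimated more precisely on day one than on day three.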

For day one hygiene, the mean (1.7934), median (1.7900), and mode (2.00) are approximately equal. Skewness is 8.865, implying that the data set is asymmetric, and kurtosis is 170.450, implying a leptokurtic distribution. There is large variation in the data set (range = 20.00, variance = 0.892). On these computed results the normal distribution assumptions are not met, hence the data are not normally distributed.

Day two hygiene data show a large variation among the mode, median, and mean (mean = 0.9609, median = 0.7900, mode = 0.23). Skewness is 1.095, interpreted here as approximately symmetric, and kurtosis is 0.822, suggesting a mesokurtic distribution. There is small variation in the data set (range = 3.44, variance = 0.520). On these computed results the normal distribution assumptions are met, hence the data are normally distributed.

Day three hygiene data show a large variation among the mode, median, and mean (mean = 0.9765, median = 0.7600, mode = 0.44). Skewness is 1.033, interpreted here as approximately symmetric, and kurtosis is 0.732, suggesting a mesokurtic distribution. There is small variation in the data set (range = 3.39, variance = 0.504). On these computed results the normal distribution assumptions are met, hence the data are normally distributed.

The computed results are the opposite of the results observed from the plots, which shows the subjective nature of conclusions drawn from visual inspection.

Standard Descriptive Statistics for Exam Data.

Statistic                 Percentage on SPSS exam   Computer literacy   Percentage of lectures attended   Numeracy
N (valid)                 100                       100                 100                               100
N (missing)               0                         0                   0                                 0
Mean                      58.10                     50.71               59.765                            4.85
Median                    60.50(a)                  51.43(a)            62.000(a)                         4.37(a)
Mode                      72(b)                     54                  48.5(b)                           4
Std. Deviation            21.316                    8.260               21.6848                           2.706
Variance                  454.354                   68.228              470.230                           7.321
Skewness                  -.107                     -.174               -.422                             .961
Std. Error of Skewness    .241                      .241                .241                              .241
Kurtosis                  -1.105                    .364                -.179                             .946
Std. Error of Kurtosis    .478                      .478                .478                              .478
Range                     84                        46                  92.0                              13

Descriptive statistics by university (standard errors in parentheses).

Numeracy
Statistic                       Duncetown University   Sussex University
Mean (Std. Error)               4.12 (.292)            5.58 (.434)
95% CI for Mean, Lower Bound    3.53                   4.71
95% CI for Mean, Upper Bound    4.71                   6.45
5% Trimmed Mean                 4.06                   5.38
Median                          4.00                   5.00
Variance                        4.271                  9.432
Std. Deviation                  2.067                  3.071
Minimum                         1                      1
Maximum                         9                      14
Range                           8                      13
Inter-quartile Range            3                      5
Skewness (Std. Error)           .512 (.337)            .793 (.337)
Kurtosis (Std. Error)           -.484 (.662)           .260 (.662)

Percentage of lectures attended
Statistic                       Duncetown University   Sussex University
Mean (Std. Error)               56.260 (3.3619)        63.270 (2.6827)
95% CI for Mean, Lower Bound    49.504                 57.879
95% CI for Mean, Upper Bound    63.016                 68.661
5% Trimmed Mean                 56.544                 63.672
Median                          60.500                 65.750
Variance                        565.135                359.849
Std. Deviation                  23.7726                18.9697
Minimum                         8.0                    12.5
Maximum                         100.0                  100.0
Range                           92.0                   87.5
Inter-quartile Range            28.8                   30.0
Skewness (Std. Error)           -.309 (.337)           -.365 (.337)
Kurtosis (Std. Error)           -.383 (.662)           -.221 (.662)

Computer literacy
Statistic                       Duncetown University   Sussex University
Mean (Std. Error)               50.26 (1.141)          51.16 (1.203)
95% CI for Mean, Lower Bound    47.97                  48.74
95% CI for Mean, Upper Bound    52.55                  53.58
5% Trimmed Mean                 50.12                  51.36
Median                          49.00                  54.00
Variance                        65.094                 72.341
Std. Deviation                  8.068                  8.505
Minimum                         35                     27
Maximum                         67                     73
Range                           32                     46
Inter-quartile Range            12                     9
Skewness (Std. Error)           .225 (.337)            -.538 (.337)
Kurtosis (Std. Error)           -.515 (.662)           1.379 (.662)

Percentage on SPSS exam
Statistic                       Duncetown University   Sussex University
Mean (Std. Error)               40.18 (1.780)          76.02 (1.443)
95% CI for Mean, Lower Bound    36.60                  73.12
95% CI for Mean, Upper Bound    43.76                  78.92
5% Trimmed Mean                 40.06                  75.86
Median                          38.00                  75.00
Variance                        158.477                104.142
Std. Deviation                  12.589                 10.205
Minimum                         15                     56
Maximum                         66                     99
Range                           51                     43
Inter-quartile Range            18                     12
Skewness (Std. Error)           .309 (.337)            .272 (.337)
Kurtosis (Std. Error)           -.567 (.662)           -.264 (.662)

The descriptive statistics computed for computer literacy, percentage of lectures attended, and percentage on the SPSS exam meet the normal distribution assumptions: skewness is close to zero, kurtosis lies in the range -3 < k < 3, and the mean, mode, and median are approximately equal. The numeracy data set does not meet the normal distribution assumptions and is therefore not normally distributed, based on the computed skewness, kurtosis, and measures of central tendency (Landau, 2004, p. 272).

Levene’s Test for Equality of Variances (F, Sig.) and t-test for Equality of Means (remaining columns). The last two columns give the 95% Confidence Interval of the Difference.

Computer literacy
                               F       Sig.    t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower      Upper
Equal variances assumed        .064    .801    -.543    98       .588              -.900             1.658                   -4.190     2.390
Equal variances not assumed                    -.543    97.728   .588              -.900             1.658                   -4.190     2.390

Percentage of lectures attended
                               F       Sig.    t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower      Upper
Equal variances assumed        1.731   .191    -1.630   98       .106              -7.0100           4.3011                  -15.5454   1.5254
Equal variances not assumed                    -1.630   93.400   .107              -7.0100           4.3011                  -15.5507   1.5307

Levene’s test is not significant for computer literacy (F = .064, Sig. = .801), so the row for equal variances assumed is used to analyze the results. The t statistic is -0.543 with 98 degrees of freedom and a two-tailed significance of .588, and the 95% confidence interval of the mean difference (-4.190 to 2.390) contains zero. This indicates no significant difference between the two universities, which leads to the conclusion that computer literacy at Duncetown University and Sussex University does not differ significantly.

Levene’s test is also not significant for the percentage of lectures attended (F = 1.731, Sig. = .191), so the row for equal variances assumed is again used. The t statistic is -1.630 with 98 degrees of freedom and a two-tailed significance of .106, and the 95% confidence interval of the mean difference (-15.5454 to 1.5254) contains zero. This leads to the conclusion that the percentage of lectures attended at Duncetown University and Sussex University does not differ significantly.
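
The equal-variances rows of the t-test table can be reproduced by hand from the group means and variances reported in the descriptive statistics by university. The sketch below is illustrative only: it assumes 50 students per university (consistent with N = 100 and df = 98), and it hard-codes 1.9845 as the two-tailed 5% critical value of t with 98 degrees of freedom, taken from a t table rather than computed. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Pooled-variance (equal variances assumed) two-sample t statistic."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    diff = mean1 - mean2
    return diff, se_diff, diff / se_diff, df


N_PER_GROUP = 50     # assumption: 50 students at each university
T_CRIT_98 = 1.9845   # assumption: t table value for 98 df, two-tailed 5%

# (Duncetown mean, Duncetown variance, Sussex mean, Sussex variance)
summary = {
    "Computer literacy": (50.26, 65.094, 51.16, 72.341),
    "Percentage of lectures attended": (56.260, 565.135, 63.270, 359.849),
}

results = {}
for name, (m1, v1, m2, v2) in summary.items():
    diff, se, t, df = pooled_t(m1, v1, N_PER_GROUP, m2, v2, N_PER_GROUP)
    ci = (diff - T_CRIT_98 * se, diff + T_CRIT_98 * se)
    results[name] = {"diff": diff, "se": se, "t": t, "df": df, "ci": ci}
    print(f"{name}: diff = {diff:.3f}, SE = {se:.4f}, t = {t:.3f}, "
          f"df = {df}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Comparing the printed interval with zero gives the same reading as the Lower and Upper columns of the table.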

The assumption of normality is that the data in each group, and hence the sampling distribution of the mean being tested, follow a normal distribution. Homogeneity of variance is the assumption that, when the means of two populations are compared, the variances of the two populations are equal, that is, the variability of the variable is the same in both populations (Weinberg, 2008, p. 553). If an assumption is violated, a statistician may turn to other techniques for drawing inferences about the population means without compromising the intended result. Techniques based on ranked data may be used when the normality assumption does not hold (Landau, 2004, p. 301). Degrees of freedom correction methods, such as the one behind the "Equal variances not assumed" row of the t-test output, may be used when homogeneity of variance is violated. Alternatively, the course of action is simply to "proceed with caution": the statistician carries out the t-test but reports that the assumption was violated and interprets the result carefully.
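
The degrees of freedom correction mentioned above can be illustrated with the Welch-Satterthwaite approximation, which is the standard correction for unequal variances; that SPSS uses exactly this formula is an assumption here, although the values it gives agree closely with the "Equal variances not assumed" rows. This is a sketch under the further assumption of 50 students per university, and the function name `welch_df` is hypothetical.

```python
def welch_df(var1: float, n1: int, var2: float, n2: int) -> float:
    """Welch-Satterthwaite approximate degrees of freedom for two samples."""
    a = var1 / n1  # squared standard error of the first group mean
    b = var2 / n2  # squared standard error of the second group mean
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


N_PER_GROUP = 50  # assumption: 50 students at each university

# Group variances taken from the descriptive statistics by university.
df_literacy = welch_df(65.094, N_PER_GROUP, 72.341, N_PER_GROUP)
df_lectures = welch_df(565.135, N_PER_GROUP, 359.849, N_PER_GROUP)

print(f"Computer literacy: corrected df = {df_literacy:.3f}")
print(f"Percentage of lectures attended: corrected df = {df_lectures:.3f}")
```

The corrected degrees of freedom are never larger than n1 + n2 - 2, and they fall further below it as the two group variances become more unequal, which makes the test slightly more conservative.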

Reference List

Landau, S. (2004). A Handbook of Statistical Analyses Using SPSS. New York. Barnes & Noble.

Weinberg, S. (2008). Statistics using SPSS: An Integrative Approach. California. Wadsworth Publishing.