Hypothesis Testing and Inference
Hypothesis testing is a statistical procedure for deciding whether a sample provides statistically significant evidence against a given hypothesis. The Statistics package provides 11 commonly used statistical tests: 7 standard parametric tests and 4 non-parametric tests.
All tests write a report of all major calculations to userinfo at level 1, so the report is still generated when the output of the command itself is suppressed. To display the reports, set the Statistics information level to 1 using the following command.
infolevel[Statistics] := 1;
1 Tests for Population Mean
Two standard parametric tests are available to test for a population mean given a sample from that population. The OneSampleZTest should be used whenever the standard deviation of the population is known. If the standard deviation is unknown, the OneSampleTTest should be applied instead.
restart: with(Statistics): infolevel[Statistics] := 1:
Generate a sample from a random variable that is the sum of two Rayleigh random variables.
R := RandomVariable(Rayleigh(7)) + RandomVariable(Rayleigh(4)): S := Sample(R, 100):
The known mean and standard deviation of the population are then as follows.
μ := evalf(Mean(R));
μ := 13.78645551
σ := evalf(StandardDeviation(R));
σ := 5.281878335
Assuming that we do not know the population mean but we know the standard deviation of the population, test the hypothesis that this sample was drawn from a distribution with mean equal to 12.
OneSampleZTest(S, 12, σ):
Standard Z-Test on One Sample
-----------------------------
Null Hypothesis:
Sample drawn from population with mean 12 and known standard deviation 5.28188
Alt. Hypothesis:
Sample drawn from population with mean not equal to 12 and known standard deviation 5.28188
Sample size:             100
Sample mean:             13.7517
Distribution:            Normal(0,1)
Computed statistic:      3.31636
Computed pvalue:         0.000911977
Confidence interval:     12.71643272 .. 14.78689098
                         (population mean)
Result: [Rejected]
There exists statistical evidence against the null hypothesis
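As a cross-check outside Maple, the following minimal Python sketch recomputes the statistic, p-value, and confidence interval from the rounded values printed in the report above. The variable names are illustrative, and the 95% confidence level is an assumption inferred from the reported interval rather than something the report states.

```python
from math import sqrt
from statistics import NormalDist

# Values copied from the report above (rounded as printed there).
n = 100
sample_mean = 13.7517
sigma = 5.28188   # known population standard deviation
mu0 = 12          # mean under the null hypothesis

std_error = sigma / sqrt(n)
z = (sample_mean - mu0) / std_error

std_normal = NormalDist()                      # Normal(0, 1)
p_value = 2 * (1 - std_normal.cdf(abs(z)))     # two-sided p-value

# Assumed 95% confidence interval for the population mean.
half_width = std_normal.inv_cdf(0.975) * std_error
ci = (sample_mean - half_width, sample_mean + half_width)

print(f"z = {z:.4f}, p = {p_value:.6f}, CI = {ci[0]:.4f} .. {ci[1]:.4f}")
```

Small differences in the last digits relative to the report are expected, because the inputs here are the rounded printed values rather than the full-precision sample.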
Similarly, if we assume that the standard deviation is unknown, we can apply the one sample t-test to the same hypothesis, this time with a 90% confidence interval.
OneSampleTTest(S, 12, confidence = 0.9):
Standard T-Test on One Sample
-----------------------------
Null Hypothesis:
Sample drawn from population with mean 12
Alt. Hypothesis:
Sample drawn from population with mean not equal to 12
Sample size:             100
Sample mean:             13.7517
Sample standard dev.:    5.14945
Distribution:            StudentT(99)
Computed statistic:      3.40165
Computed pvalue:         0.000967459
Confidence interval:     12.89665167 .. 14.60667203
                         (population mean)
Result: [Rejected]
There exists statistical evidence against the null hypothesis
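For reference, the statistic in this report follows from the usual one sample t formula, evaluated here with the rounded values printed above. The symbols (sample mean, hypothesized mean, sample standard deviation, sample size) are standard notation rather than names used by the package.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{13.7517 - 12}{5.14945 / \sqrt{100}}
  \approx 3.4017
```

This agrees with the reported value 3.40165 up to the rounding of the printed inputs; the statistic is compared against a Student t distribution with n - 1 = 99 degrees of freedom.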
2 Tests for the Difference of Two Population Means
Three standard parametric tests are available for testing the difference between two population means when examining two samples. The TwoSampleZTest should be applied when the standard deviation of both populations is known. If the standard deviations are unknown then the TwoSampleTTest is available for unrelated data and the TwoSamplePairedTTest is available for paired data.
restart: with(Statistics): infolevel[Statistics] := 1:
Consider three data sets.
X := Array([9, 10, 8, 4, 8, 3, 0, 10, 15, 9]):
Y := Array([6, 3, 10, 11, 9, 8, 13, 4, 4, 4]):
Z := Array([10, 11, 7, 3, 10, 5, 2, 12, 14, 10]):
Calculate the sample mean and sample standard deviation of each data set.
XProp := table(['μ' = Mean(X), 'σ' = StandardDeviation(X)]);
XProp := table([μ = 7.600000000, σ = 4.247875286])
YProp := table(['μ' = Mean(Y), 'σ' = StandardDeviation(Y)]);
YProp := table([μ = 7.200000000, σ = 3.489667288])
ZProp := table(['μ' = Mean(Z), 'σ' = StandardDeviation(Z)]);
ZProp := table([μ = 8.400000000, σ = 3.977715704])
Assuming that we do not know the means of the populations from which X and Y were drawn, but we know the standard deviation of each to be 4 and 3 respectively, test the hypothesis that the difference between the means is 3.
TwoSampleZTest(X, Y, 3, 4, 3):
Standard Z-Test on Two Samples
------------------------------
Null Hypothesis:
Sample drawn from populations with difference of means equal to 3
Alt. Hypothesis:
Sample drawn from population with difference of means not equal to 3
Sample sizes:            10, 10
Sample means:            7.6, 7.2
Difference in means:     0.4
Distribution:            Normal(0,1)
Computed statistic:      -1.64438
Computed pvalue:         0.100097
Confidence interval:     -2.698975162 .. 3.498975162
                         (difference of population means)
Result: [Accepted]
There is no statistical evidence against the null hypothesis
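The two sample z statistic can be reproduced directly from the data. The following is a minimal Python sketch, not the package's implementation; the variable names are illustrative and the 95% confidence level is an assumption inferred from the reported interval.

```python
from math import sqrt
from statistics import NormalDist, mean

X = [9, 10, 8, 4, 8, 3, 0, 10, 15, 9]
Y = [6, 3, 10, 11, 9, 8, 13, 4, 4, 4]
sigma_x, sigma_y = 4, 3   # known population standard deviations
delta0 = 3                # difference of means under the null hypothesis

diff = mean(X) - mean(Y)
std_error = sqrt(sigma_x**2 / len(X) + sigma_y**2 / len(Y))
z = (diff - delta0) / std_error

std_normal = NormalDist()
p_value = 2 * std_normal.cdf(-abs(z))          # two-sided p-value

# Assumed 95% confidence interval for the difference of population means.
half_width = std_normal.inv_cdf(0.975) * std_error
ci = (diff - half_width, diff + half_width)

print(f"z = {z:.5f}, p = {p_value:.6f}, CI = {ci[0]:.6f} .. {ci[1]:.6f}")
```

Because the p-value (about 0.10) exceeds 0.05 and the interval contains 3, the sketch reaches the same conclusion as the report: the null hypothesis is not rejected.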
If we now compare samples X and Z under the hypothesis that the difference in means (Mean(X)-Mean(Z)) is 1, and assume we do not know the standard deviation of either population, we can apply the two sample t-test.
TwoSampleTTest(X, Z, 1):
Standard T-Test on Two Samples (Unequal Variances)
--------------------------------------------------
Null Hypothesis:
Sample drawn from populations with difference of means equal to 1
Alt. Hypothesis:
Sample drawn from population with difference of means not equal to 1
Sample sizes:            10, 10
Sample means:            7.6, 8.4
Sample standard devs.:   4.24788, 3.97772
Difference in means:     -0.8
Distribution:            StudentT(17.92283210)
Computed statistic:      -0.978107
Computed pvalue:         0.34104
Confidence interval:     -4.667499017 .. 3.067499017
                         (difference of population means)
Result: [Accepted]
There is no statistical evidence against the null hypothesis
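The non-integer degrees of freedom in this report are consistent with the Welch-Satterthwaite approximation for unequal variances. The following minimal Python sketch, with illustrative variable names, recomputes the statistic and the degrees of freedom under that assumption; it does not compute the p-value, which would need a Student t distribution function that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

X = [9, 10, 8, 4, 8, 3, 0, 10, 15, 9]
Z = [10, 11, 7, 3, 10, 5, 2, 12, 14, 10]
delta0 = 1   # difference of means under the null hypothesis

# Sample variance of each mean (s^2 / n).
vx_n = variance(X) / len(X)
vz_n = variance(Z) / len(Z)

# Welch t statistic for unequal variances.
t_stat = (mean(X) - mean(Z) - delta0) / sqrt(vx_n + vz_n)

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (vx_n + vz_n) ** 2 / (
    vx_n**2 / (len(X) - 1) + vz_n**2 / (len(Z) - 1)
)

print(f"t = {t_stat:.6f}, df = {df:.5f}")
```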
If instead the data for X and Z came from paired sampling, we can apply the two sample t-test for paired data.
TwoSamplePairedTTest(X, Z, 1):
Standard T-Test with Paired Samples
-----------------------------------
Null Hypothesis:
Sample drawn from populations with difference of means equal to 1
Alt. Hypothesis:
Sample drawn from population with difference of means not equal to 1
Sample size:             10
Difference in means:     -0.8
Difference std. dev.:    1.31656
Distribution:            StudentT(9)
Computed statistic:      -4.32346
Computed pvalue:         0.00192341
Confidence interval:     -1.741810891 .. .1418108907
                         (difference of population means)
Result: [Rejected]
There exists statistical evidence against the null hypothesis
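The paired test is a one sample t-test applied to the element-wise differences, which explains why the same data lead to a different conclusion than the unpaired test. A minimal Python sketch with illustrative variable names:

```python
from math import sqrt
from statistics import mean, stdev

X = [9, 10, 8, 4, 8, 3, 0, 10, 15, 9]
Z = [10, 11, 7, 3, 10, 5, 2, 12, 14, 10]
delta0 = 1   # difference of means under the null hypothesis

# Pair the observations and work with the differences only.
d = [x - z for x, z in zip(X, Z)]

mean_diff = mean(d)
sd_diff = stdev(d)                       # sample standard deviation
t_stat = (mean_diff - delta0) / (sd_diff / sqrt(len(d)))
df = len(d) - 1

print(f"mean diff = {mean_diff}, sd = {sd_diff:.5f}, t = {t_stat:.5f}, df = {df}")
```

The differences vary much less (standard deviation about 1.32) than either sample on its own, so the same observed difference of -0.8 yields a far larger statistic in absolute value.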
3 Tests for Population Variance / Standard Deviation
Two standard parametric tests are available for examining hypotheses regarding the population variance and standard deviation using the variance ratio. The OneSampleChiSquareTest function should be applied when comparing a sample standard deviation against an assumed population standard deviation. When comparing the variances of two independent samples for a specific ratio, the TwoSampleFTest function should be used instead.
Generate a sample from a Maxwell distribution and an Exponential distribution.
S := Sample(Maxwell(3), 100): T := Sample(Exponential(2), 100):
The known standard deviations of the two populations are then as follows.
S_sigma := evalf(StandardDeviation(Maxwell(3)));
S_sigma := 2.020318836
T_sigma := evalf(StandardDeviation(Exponential(2)));
T_sigma := 2.
Consider the hypothesis that S is drawn from a population with a standard deviation of 2 and apply the OneSampleChiSquareTest.
OneSampleChiSquareTest(S, 2):
Chi-Square Test on One Sample
-----------------------------
Null Hypothesis:
Sample drawn from population with standard deviation equal to 2
Alt. Hypothesis:
Sample drawn from population with standard deviation not equal to 2
Sample size:             100
Sample standard dev.:    1.83342
Distribution:            ChiSquare(99)
Computed statistic:      83.1952
Computed pvalue:         0.253798
Confidence interval:     1.609754032 .. 2.129836954
                         (population standard deviation)
Result: [Accepted]
There is no statistical evidence against the null hypothesis
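The reported statistic is consistent with the standard chi-square statistic for a single variance, evaluated here with the rounded values from the report. The notation (sample size n, sample standard deviation s, hypothesized standard deviation sigma_0) is standard rather than taken from the package.

```latex
\chi^2 = \frac{(n - 1)\, s^2}{\sigma_0^2}
       = \frac{99 \times 1.83342^2}{2^2}
       \approx 83.195
```

Under the null hypothesis this quantity follows a chi-square distribution with n - 1 = 99 degrees of freedom, which is the distribution named in the report.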
Now consider the hypothesis that samples S and T were drawn from populations whose variances have a ratio of 2. The TwoSampleFTest compares the ratio of the sample variances of S and T against an assumed ratio of the population variances. Thus, to test instead that the two populations have the same variance, we would use an assumed ratio of 1.
TwoSampleFTest(S, T, 2):
F-Ratio Test on Two Samples
---------------------------
Null Hypothesis:
Sample drawn from populations with ratio of variances equal to 2
Alt. Hypothesis:
Sample drawn from population with ratio of variances not equal to 2
Sample sizes:            100, 100
Sample variances:        3.36142, 4.08274
Ratio of variances:      0.823326
Distribution:            FRatio(99,99)
Computed statistic:      0.411663
Computed pvalue:         1.45561e-05
Confidence interval:     .5539687377 .. 1.223654982
                         (ratio of population variances)
Result: [Rejected]
There exists statistical evidence against the null hypothesis
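The relation between the two ratios in this report can be written out explicitly. Judging from the printed numbers, the computed statistic is the ratio of sample variances divided by the assumed ratio of population variances, denoted r_0 here for illustration:

```latex
F = \frac{s_S^2 / s_T^2}{r_0}
  = \frac{3.36142 / 4.08274}{2}
  = \frac{0.823326}{2}
  \approx 0.411663
```

With an assumed ratio of 1, the statistic would reduce to the plain ratio of sample variances, 0.823326.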
4 Tests for Normality
The Statistics package provides an implementation of Shapiro and Wilk's W-test for normality. This test is used to determine if a provided sample could be considered to be drawn from a normal distribution.
Generate a sample of twenty points from a normal distribution and another from a uniform distribution.
S := Sample(Normal(5, 2), 20): T := Sample(Uniform(3, 7), 20):
Consider the hypothesis that S is drawn from a normal distribution and apply Shapiro and Wilk's W-test.
ShapiroWilkWTest(S):
Shapiro and Wilk's W-Test for Normality
---------------------------------------
Null Hypothesis:
Sample drawn from a population that follows a normal distribution
Alt. Hypothesis:
Sample drawn from population that does not follow a normal distribution
Sample size:             20
Computed statistic:      0.972002
Computed pvalue:         0.784909
Result: [Accepted]
There is no statistical evidence against the null hypothesis
Test the same hypothesis for the data drawn from the uniform distribution.
ShapiroWilkWTest(T):
Shapiro and Wilk's W-Test for Normality
---------------------------------------
Null Hypothesis:
Sample drawn from a population that follows a normal distribution
Alt. Hypothesis:
Sample drawn from population that does not follow a normal distribution
Sample size:             20
Computed statistic:      0.889151
Computed pvalue:         0.0259262
Result: [Rejected]
There exists statistical evidence against the null hypothesis
5 Tests for Goodness-of-Fit
The Statistics package provides two methods of testing goodness-of-fit. The ChiSquareGoodnessOfFitTest function should be used to determine if an observed or empirical data set fits expected values for that data set. Similarly, the ChiSquareSuitableModelTest is available for testing how well a given probability distribution approximates a data sample.
Consider the following number of sales made on each day of the week at a jewelry store, tallied over one sales week (Monday to Saturday).
Ob := Array([25, 17, 15, 23, 24, 16]);
We wish to test the hypothesis that sales are uniformly distributed throughout the week. The expected number of sales per day is then given by the number of sales averaged over the week.
SalesPerDay := sum('Ob'[i], i = 1 .. 6)/6;
SalesPerDay := 20
Ex := Array([SalesPerDay $ 6]);
We now test the hypothesis (using ChiSquareGoodnessOfFitTest) that the observed number of sales per day is consistent with a uniformly distributed number of sales each day.
ChiSquareGoodnessOfFitTest(Ob, Ex, level = 0.05):
Chi-Square Test for Goodness-of-Fit
-----------------------------------
Null Hypothesis:
Observed sample does not differ from expected sample
Alt. Hypothesis:
Observed sample differs from expected sample
Categories:              6
Distribution:            ChiSquare(5)
Computed statistic:      5
Computed pvalue:         0.41588
Critical value:          11.07049741
Result: [Accepted]
There is no statistical evidence against the null hypothesis
Hence we conclude that a uniformly distributed number of sales is a reasonable claim.
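The statistic in this report is the usual Pearson sum of squared deviations scaled by the expected counts. A minimal Python sketch, with illustrative variable names and the critical value copied from the report rather than computed:

```python
observed = [25, 17, 15, 23, 24, 16]   # sales per day, Monday to Saturday

# Under the null hypothesis every day has the same expected count.
expected_per_day = sum(observed) / len(observed)

# Pearson chi-square statistic: sum of (O - E)^2 / E over the categories.
chi2 = sum((o - expected_per_day) ** 2 / expected_per_day for o in observed)

critical_value = 11.07049741          # copied from the report (level 0.05, 5 df)
reject = chi2 > critical_value

print(f"expected per day = {expected_per_day}, chi2 = {chi2:.4f}, reject = {reject}")
```

The statistic is well below the critical value, so the null hypothesis of uniformly distributed sales is not rejected.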
Consider a dataset of times during a day when sales are made. Determine if sales are uniformly distributed during the day (consider an 8 hour working day where sales are measured between 0.0 and 8.0, the number of hours into the day). The data in this case is continuous and we are testing against a uniform probability distribution.
SaleTimes := [1.4, 1.8, 2.2, 2.9, 3.0, 3.4, 3.4, 3.5, 3.6, 3.7, 3.8, 4.0, 4.4, 4.6, 5.3, 7.5]:
Apply the chi square suitable model test to determine if a uniform distribution closely matches the provided data.
ChiSquareSuitableModelTest(SaleTimes, Uniform(0, 8)):
Chi-Square Test for Suitable Probability Model
----------------------------------------------
Null Hypothesis:
Sample was drawn from specified probability distribution
Alt. Hypothesis:
Sample was not drawn from specified probability distribution
Bins:                    4
Distribution:            ChiSquare(3)
Computed statistic:      9.5191
Computed pvalue:         0.023129
Critical value:          7.814728288
Result: [Rejected]
There exists statistical evidence against the null hypothesis
Hence we conclude that the sale times are not uniformly distributed throughout the day. Closer examination of the data reveals that most of the sales were made roughly halfway through the day: 13 of the 16 sale times fall between hours 2 and 6.
6 Tests for Independence in a Two-Way Table
The Statistics package contains the ChiSquareIndependenceTest function, which is used to determine if two attributes are independent of one another.
Consider a sample of 476 patients that are part of a survey to determine if a new drug is effective at fighting a new disease. Patients are randomly given either the new drug or a placebo, and their recovery rate is tabulated as follows:
DrugGroup := Vector[column]([64, 176]):      # Recovered, Not Recovered
PlaceboGroup := Vector[column]([86, 150]):   # Recovered, Not Recovered
Construct the two-way table for this result.
Output := Matrix([DrugGroup, PlaceboGroup]):
Finally, apply the chi square test for independence to test the hypothesis that the results are independent. That is, the drug has no effect on the recovery rate from the disease.
ChiSquareIndependenceTest(Output):
Chi-Square Test for Independence
--------------------------------
Null Hypothesis:
Two attributes within a population are independent of one another
Alt. Hypothesis:
Two attributes within a population are not independent of one another
Dimensions:              2
Total Elements:          476
Distribution:            ChiSquare(1)
Computed statistic:      5.26704
Computed pvalue:         0.0217328
Critical value:          3.84145606580278
Result: [Rejected]
There exists statistical evidence against the null hypothesis
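Thus we conclude that there exists statistical evidence that recovery is not independent of the treatment received. Note, however, that with the counts as labeled above, the recovery rate is lower in the drug group (64 of 240, about 27%) than in the placebo group (86 of 236, about 36%), so these data do not show the drug improving a patient's chance of recovery.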
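The statistic and p-value for this two-way table can be reproduced from the counts alone. The following minimal Python sketch uses the Pearson chi-square statistic without a continuity correction, an assumption that matches the printed value, and also computes the per-group recovery proportions from the counts as labeled. Variable names are illustrative.

```python
from math import erfc, sqrt

# Rows: recovered, not recovered. Columns: drug group, placebo group.
table = [[64, 86],
         [176, 150]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)

# Pearson chi-square statistic against the counts expected under independence.
chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (observed - expected) ** 2 / expected

# For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi2 / 2))

drug_rate = table[0][0] / col_totals[0]      # recovered / drug group size
placebo_rate = table[0][1] / col_totals[1]   # recovered / placebo group size

print(f"chi2 = {chi2:.5f}, p = {p_value:.7f}")
print(f"recovery rate: drug = {drug_rate:.3f}, placebo = {placebo_rate:.3f}")
```

The test only establishes that the two attributes are associated; the direction of the association has to be read from the proportions.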
7 Output Options
The default output from each test is a report containing expressions of the form name = value for key output from the test. Using the output option, specific values can be returned instead.
restart: with(Statistics):
Consider the following data set.
X := Array([9, 10, 8, 4, 8, 3, 0, 10, 15, 9]):
Apply the one sample t-test on this data to test for a population mean of 5:
OneSampleTTest(X, 5);
hypothesis = true, confidenceinterval = 4.561253851 .. 10.63874615, distribution = StudentT(9), pvalue = 0.08491508130, statistic = 1.935537501
A true value for hypothesis indicates that there is no statistical evidence against the null hypothesis; a false value indicates that such evidence exists. If only the confidence interval from this calculation is of interest, use the option output=confidenceinterval.
OneSampleTTest(X, 5, output = confidenceinterval);
4.561253851 .. 10.63874615
A list of valid output options is available on the help page for each test.