Before we apply hypothesis testing, we need to decide how strong the evidence against the null hypothesis must be if we are to reject it. A common rule is that when the probability is less than 5% (i.e., p < .05) that a sample mean drawn from the null population would be as large as or larger than the obtained sample mean, we conclude that the sample did not come from the null population. This probability criterion (e.g., .05) is called alpha (α). If the probability (i.e., the p-value) of obtaining a sample mean this large or larger from the null population is less than alpha, we reject the null hypothesis and conclude that our sample was drawn from a different population with a population mean larger than the null mean.
If the p-value is greater than alpha (e.g., p = .083 with α = .05), we conclude that we do not have sufficient evidence to reject the null hypothesis. In this case we “fail to reject the null hypothesis” because we expect to observe a mean this large or larger more than 5% of the time when we sample from the null distribution. Failure to reject the null hypothesis does not mean that we necessarily have evidence that the null hypothesis is true; rather, our findings are ambiguous.
The alpha criterion can be chosen by the researcher to be greater or less than 5% depending upon how costly it would be to reject the null hypothesis if it really is true. If this error is costly, we may set alpha to a smaller value such as 1% (α = .01) instead of 5%. If this error is less costly we may use a larger value of alpha such as 10% (α = .10).
The alpha level for the decision criterion must be set before we look at the data. After collecting the data, we compare the obtained p-value with alpha. To reject the null hypothesis, the p-value must be less than alpha.
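As a minimal sketch of this decision rule, the Python snippet below compares one p-value with several alpha levels. The function name `decide` is hypothetical, and the p-value of .083 and the alpha levels of .01, .05, and .10 are simply the illustrative values mentioned earlier.

```python
def decide(p_value, alpha):
    """Return the test decision for a p-value and an alpha chosen in advance."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Illustrative p-value from the text; alpha is fixed before looking at the data.
p_value = 0.083
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}: {decide(p_value, alpha)}")
```

With these values, the null hypothesis is rejected only at the most lenient criterion (α = .10), which shows how the same data can lead to different decisions depending on the alpha chosen in advance.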
In our example, if we obtain a sample mean of 550, the p-value is the probability of observing a mean as large as or larger than 550 if the population mean really is only 500. The p-value is not the probability that the null hypothesis is true. In fact, the probability that the population mean for the ACE graduates is EXACTLY equal to the mean for the non-graduates is essentially zero.
Question G: Making a Conclusion Based on Alpha
Suppose a random sample of 10 students who completed the ACE training program scored an average of 560 on the VAST test. The sample mean of 560 corresponds to a z-score of 1.9. Given the null hypothesis, H0: μ ≤ 500, and an alpha level of α = .05, what do you conclude? (Hint: You can use a z-table or the p–z converter for this problem.)
The population mean for the program may be 500 or less (fail to reject the null hypothesis).
This answer is incorrect! You are claiming that we can’t reject the assumption that the population mean score for individuals from the training program is 500 or less. Our z-score of 1.9 (right-tail p = .029) indicates that the sample mean of 560 would be surprising if the population mean for the training program were truly 500. The one-tailed p-value of .029 is less than our alpha level of .05; thus we have met our criterion for rejecting the null hypothesis.
The population mean for the program is not likely to be 500 or less (reject the null hypothesis).
This answer is correct! You are claiming that it is unlikely that the mean score for individuals from the training program is 500 or less. The z-score of 1.9 (right-tail p = .029) indicates that the sample mean of 560 would be surprising if the population mean for the training program were truly 500. Thus, we reject our null hypothesis, H0: μ ≤ 500, meaning that we reject the assumption that the population of training program graduates has a mean value of 500 or less on the VAST. Our sample result is unlikely to come from a population with a mean of 500, and it would be even less likely to come from a population with a mean below 500.
When we reject the null hypothesis, we reject it in favor of another explanation. This explanation is termed the alternative hypothesis. The alternative hypothesis generally states the opposite of the null hypothesis (i.e., that the null is not true). In this example, H1: μ > 500, and we conclude that the population mean for ACE graduates is greater than 500.
Notice that if we had used a stricter alpha of .01 instead of .05, we would fail to reject the null hypothesis because our obtained p-value of .029 would then be greater than alpha. With this more conservative criterion we would have to conclude that we do not have sufficient evidence to reject the hypothesis that the mean VAST score may be 500 or less for the population of graduates from the ACE training program.
Sample mean is 500.
This answer is incorrect for several reasons. First, hypotheses always refer to population values. We use samples to draw inferences about populations. Second, we know this statement is false because our sample mean is 560, not 500 – we don’t need any fancy statistical test to draw that conclusion.
Population mean is 560.
This answer is incorrect. Although the sample mean is 560, we cannot be confident that the actual population mean is 560.
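The conclusion for Question G can be checked numerically. The sketch below, which assumes the sampling distribution of the mean is normal, converts the z-score of 1.9 into a right-tail p-value using only Python's standard library and then compares it with alpha levels of .05 and .01. The function name `right_tail_p` is hypothetical.

```python
import math

def right_tail_p(z):
    """Probability that a standard normal variable is at least z."""
    # erfc is the complementary error function; 0.5 * erfc(z / sqrt(2)) = 1 - Phi(z)
    return 0.5 * math.erfc(z / math.sqrt(2))

z = 1.9  # z-score given in Question G
p = right_tail_p(z)
print(f"z = {z}, right-tail p = {p:.3f}")

# Compare the same p-value with the two alpha levels discussed in the feedback.
for alpha in (0.05, 0.01):
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

The computed p-value rounds to .029, so the null hypothesis is rejected at α = .05 but not at the stricter α = .01, matching the feedback for the correct answer.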