The null hypothesis states that graduates of ACE training do not have larger average test scores than test takers without ACE training. Now suppose that there is a treatment effect such that training does actually improve scores by 50 points on average.
Question: In the matrix below, which decision regarding the null hypothesis is correct and which is an error?
| Decision | Actual Situation: There is an effect |
|---|---|
| Reject the null | ? |
| Fail to reject the null | ? |
When there is an actual treatment effect and we reject the null hypothesis, we have made a correct decision (top purple cell). On the other hand, when there is actually a treatment effect but we fail to reject the null hypothesis, we have made an incorrect decision (bottom purple cell). Failing to reject the null hypothesis when it is false is called a Type 2 error. The probability of making a Type 2 error when the null is false is called beta, β. Thus, the probability of rejecting the null and making the correct decision when there is an effect is 1 – β, called the power of the test.
Null and Alternative Distributions
Recall that the critical value for rejecting the null hypothesis is based upon the null population distribution and alpha, and that this value is not influenced by the alternative distribution. For instance, using α = .05 and drawing a sample of size N = 5 from VAST test scores (population mean of 500 and population standard deviation of 100), we calculated the critical value to be 573.56 (marked by the dotted red line in the figure below). If our sample mean based on N = 5 exceeds this value, we reject the null hypothesis. Otherwise, we fail to reject the null.
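The critical value quoted above can be reproduced with a short calculation. The sketch below assumes a one-tailed (upper) z-test in which the sampling distribution of the mean is normal with standard error σ/√N. The variable names are illustrative and not part of the original material.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the text; the one-tailed z-test setup is an assumption.
pop_mean = 500.0   # null population mean
pop_sd = 100.0     # population standard deviation
n = 5              # sample size
alpha = 0.05       # Type 1 error rate

# Standard error of the sample mean under the null hypothesis.
standard_error = pop_sd / sqrt(n)

# Sampling distribution of the mean if the null is true.
null_dist = NormalDist(mu=pop_mean, sigma=standard_error)

# Upper-tail cutoff: the sample mean exceeded with probability alpha under the null.
critical_value = null_dist.inv_cdf(1 - alpha)

print(f"critical value = {critical_value:.2f}")  # prints: critical value = 573.56
```

Note that the alternative mean does not appear anywhere in this calculation, which is why the critical value is unaffected by the alternative distribution.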
- Below is the null distribution in blue, with the critical value marked by the dotted red line.
- Notice the alternative distribution with a mean of 550, and corresponding areas of Type 2 error (in red) and power (in pink).
Typically we do not know what the mean for the alternative distribution is, but for illustrative purposes, we assume that the graduates of the training program average 50 points greater than others, and thus the mean for the alternative distribution is 550.
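Under the illustrative assumption that the alternative mean is 550, the Type 2 error rate and the power can be computed directly. This is a minimal sketch that assumes the alternative sampling distribution is normal with the same standard error as the null; the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Values from the text; normal sampling distributions are an assumption.
n = 5
standard_error = 100 / sqrt(n)

# Critical value from the null distribution (mean 500, one-tailed alpha = .05).
critical_value = NormalDist(500, standard_error).inv_cdf(0.95)

# Sampling distribution of the mean if the training effect is +50 points.
alt_dist = NormalDist(550, standard_error)

# Beta: probability the sample mean falls at or below the critical value
# even though the alternative is true (we fail to reject a false null).
beta = alt_dist.cdf(critical_value)

# Power: probability the sample mean exceeds the critical value.
power = 1 - beta

print(f"beta = {beta:.2f}, power = {power:.2f}")  # prints: beta = 0.70, power = 0.30
```

With only N = 5, the test would detect a true 50-point effect in roughly 30% of samples under these assumptions.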
We can now update the decision matrix by labeling the cells and inserting their probabilities for the situation where the null hypothesis is actually false and the alternative hypothesis is true:
| Decision | Actual Situation: Null is false (effect exists) |
|---|---|
| Reject the null (effect supported) | Correct Decision (probability = 1 – β) |
| Fail to reject the null (effect not supported) | Type 2 Error (probability = β) |
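The two probabilities in the matrix can also be checked by simulation. The sketch below repeatedly draws samples of N = 5 from an alternative population and counts how often the null is rejected. It assumes individual scores are normally distributed with mean 550 and standard deviation 100, which the text does not state explicitly; the seed and the number of trials are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(0)  # arbitrary seed so the run is repeatable

n = 5
trials = 20_000  # arbitrary number of simulated studies

# Decision rule from the null distribution (mean 500, SD 100, one-tailed alpha = .05).
critical_value = NormalDist(500, 100 / sqrt(n)).inv_cdf(0.95)

rejections = 0
for _ in range(trials):
    # Assumption: scores of trained test takers are normal with mean 550, SD 100.
    sample = [random.gauss(550, 100) for _ in range(n)]
    if fmean(sample) > critical_value:
        rejections += 1  # correct decision: reject a false null

estimated_power = rejections / trials   # estimates 1 - beta
estimated_beta = 1 - estimated_power    # estimates the Type 2 error rate

print(f"estimated power = {estimated_power:.3f}")
print(f"estimated beta  = {estimated_beta:.3f}")
```

The estimates should land close to the analytic values implied by the figure, with small differences due to sampling variability.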