Here are some other things to consider when conducting hypothesis testing:

#### 1. Design Issues

Correct interpretation of statistical findings depends greatly on issues of design and measurement. For example, if we compare test performance for two groups of students where one group elected to take a training course while the other group decided not to take the course, we cannot be sure that any observed difference between the two groups is due to training. People who choose to take a training course may differ from those who do not make that choice. Bad design may make a research study worthless.
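The self-selection problem can be illustrated with a small simulation. This is only a sketch under invented assumptions: a hypothetical "motivation" trait drives both enrollment and test scores, and the training itself is given an effect of exactly zero. The function name, the score scale, and every numeric value are illustrative and do not come from any real study.

```python
import random
from statistics import fmean

def self_selection_gap(n_students=10_000, training_effect=0.0, seed=42):
    """Mean score difference (course takers minus non-takers).

    Assumption: an unobserved 'motivation' trait raises both the chance
    of enrolling and the test score. All numbers here are hypothetical.
    """
    rng = random.Random(seed)
    took, skipped = [], []
    for _ in range(n_students):
        motivation = rng.gauss(0, 1)
        # More motivated students are more likely to choose the course.
        enrolls = motivation + rng.gauss(0, 1) > 0
        # The score depends on motivation, not on the course itself.
        score = 70 + 5 * motivation + rng.gauss(0, 5)
        if enrolls:
            took.append(score + training_effect)
        else:
            skipped.append(score)
    return fmean(took) - fmean(skipped)

gap = self_selection_gap()
print(f"Observed gap with zero true training effect: {gap:.2f} points")
```

Under these assumptions the two groups differ even though training does nothing, so the observed gap reflects who chose the course rather than what the course did. Random assignment to groups would remove this confound.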

#### 2. Specificity and Sensitivity (i.e., Power)

**Specificity** is the probability that we fail to reject the null hypothesis, *H*_{0}, when it is true, whereas **sensitivity** is the probability that we do reject *H*_{0} when it is false. Sensitivity is also known as statistical power. Because beta error, *β*, is the probability of failing to reject *H*_{0} when it is false, power is one minus beta (1 – *β*). One way to increase power in a study is to increase the sample size. To explore power, you may want to complete the WISE Power Tutorial.
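The link between sample size and power can be sketched numerically. The example below assumes a two-tailed one-sample z-test with a known population standard deviation, which is a simplification chosen for illustration. The function name `z_test_power` and the effect size of 0.5 standard deviations are hypothetical choices, not values from this tutorial.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test (sigma assumed known).

    effect_size is the true difference between the population mean and
    the null mean, expressed in standard deviation units.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is centered on this shift.
    shift = effect_size * sqrt(n)
    upper = 1 - std_normal.cdf(z_crit - shift)  # reject in the upper tail
    lower = std_normal.cdf(-z_crit - shift)     # reject in the lower tail
    return upper + lower

for n in (10, 25, 50, 100):
    power = z_test_power(effect_size=0.5, n=n)
    print(f"n = {n:3d}  power = {power:.3f}  beta = {1 - power:.3f}")
```

Each row reports power and *β*, which always sum to one. When the effect size is set to zero, the function returns alpha, because rejecting a true *H*_{0} is then the only way to reject.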

#### 3. Statistical Significance vs. Practical Significance

Related to design issues and sensitivity is the impact of sample size on the interpretation of null hypothesis testing results. Often we aim to sample a large number of cases in order to increase the power of our study (i.e., to enhance our ability to detect an existing effect). However, with bigger sample sizes, we are more likely to declare trivial differences in sample means “statistically significant” even though the actual effect, practically speaking, may be small or unimportant. Statistically significant findings are not necessarily of practical significance.
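A short calculation shows how the same trivial difference moves from non-significant to significant as the sample grows. The sketch assumes a two-tailed one-sample z-test with known standard deviation. The 0.3-point difference on a scale with a standard deviation of 15 is a made-up example, and `two_tailed_p` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p(mean_diff, sigma, n):
    """Two-tailed p-value for a one-sample z-test with known sigma."""
    z = mean_diff / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical effect: 0.3 points on a scale with sigma = 15.
# The standardized effect stays fixed at 0.3 / 15 = 0.02 throughout.
for n in (100, 1_000, 10_000, 100_000):
    p = two_tailed_p(mean_diff=0.3, sigma=15, n=n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:7,d}  p = {p:.4g}  ({verdict} at alpha = .05)")
```

The effect size never changes across rows; only the sample size does. This is one reason to report an effect size alongside the *p*-value.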

#### 4. Criticisms of Null Hypothesis Testing

Contrary to a common misconception, the *p*-value that we obtain when conducting a test of a null hypothesis is not the probability that the null hypothesis is true or false. Instead, it is the probability that we would obtain a sample mean as far or farther from the null hypothesized mean if the null hypothesis were true.
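This definition can be checked by simulation: draw many samples from a population in which the null hypothesis holds, and count how often the sample mean lands at least as far from the null mean as the observed one. The sketch below assumes a normal population with known standard deviation and a two-tailed test. The numbers (null mean 100, standard deviation 15, *n* = 25, observed mean 106) and both function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def analytic_p_value(mu0, sigma, n, observed_mean):
    """Two-tailed p-value for a one-sample z-test with known sigma."""
    z = (observed_mean - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulated_p_value(mu0, sigma, n, observed_mean, n_sims=20_000, seed=0):
    """Share of samples drawn under H0 whose mean is at least as far
    from mu0 as the observed mean."""
    rng = random.Random(seed)
    observed_distance = abs(observed_mean - mu0)
    extreme = 0
    for _ in range(n_sims):
        sample_mean = fmean(rng.gauss(mu0, sigma) for _ in range(n))
        if abs(sample_mean - mu0) >= observed_distance:
            extreme += 1
    return extreme / n_sims

analytic = analytic_p_value(mu0=100, sigma=15, n=25, observed_mean=106)
simulated = simulated_p_value(mu0=100, sigma=15, n=25, observed_mean=106)
print(f"analytic p = {analytic:.4f}, simulated p = {simulated:.4f}")
```

Every simulated sample comes from a world where the null hypothesis is true, so the resulting proportion describes the data given *H*_{0}. It says nothing directly about the probability of *H*_{0} given the data.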

Because even experts misinterpret the results of null hypothesis testing, and because null hypothesis testing may not give us the information we actually want, some have called for the end of null hypothesis testing. Suggested alternative approaches to analyzing data include reporting effect sizes, constructing confidence intervals, using Bayesian probabilities, conducting equivalence testing, or relying on *p*_{rep} (the probability of replication; see Killeen, 2005).