1.2. Confidence level, Type I and Type II errors, and Power

For experiments, once we know what kind of data we have, we should consider the desired confidence level of the statistical test. This confidence is expressed via α, the probability of making a Type I error (Table 1), which occurs when one rejects a true null hypothesis. Typically α is set at 0.05, meaning that we are 95% confident (1 – α = 0.95) that we will not make a Type I error, i.e. 95% confident that we will not reject a true null hypothesis. For many commonly used statistical tests, the p-value is the probability of obtaining a test statistic at least as extreme as the one calculated from the observed data, given that the null hypothesis is true. If p < α, we reject the null hypothesis; if p ≥ α, we do not reject it.
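The decision rule above can be sketched in a few lines of code. This is a minimal illustration, assuming a two-sided z-test with known standard deviation; the function name and the simulated data are ours, not from the text:

```python
import math
import random

random.seed(1)

ALPHA = 0.05  # chosen significance level (the conventional 0.05 from the text)

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: population mean == mu0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # two-sided tail probability from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# data generated under a true null hypothesis (mean really is 0)
sample = [random.gauss(0.0, 1.0) for _ in range(30)]
p = z_test_p_value(sample, mu0=0.0, sigma=1.0)
decision = "reject H0" if p < ALPHA else "do not reject H0"
print(p, decision)
```

Because the sample here is drawn under a true null hypothesis, rejecting it would be exactly the Type I error described above, and over many repetitions that happens with probability α.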

A Type II error, expressed as the probability β, occurs when one fails to reject a false null hypothesis. Unlike α, the value of β is determined by properties of the experimental design and data, as well as by how different the results need to be from those stipulated under the null hypothesis to make one believe the alternative hypothesis is true. Note that the null hypothesis is, for all intents and purposes, rarely true: even if a treatment has very little effect, it has some small effect, and given a sufficient sample size, that effect could be detected. However, our interest is more often in biologically important effects and those with practical importance. For example, a treatment for parasites that is hardly better than no treatment, even if it could be shown to be statistically significant with a sufficiently large sample size, may be of no practical importance to a beekeeper. This should be kept in mind in the subsequent discussions of sample size and effect size.
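The point that even a tiny effect becomes statistically detectable with a large enough sample can be demonstrated by simulation. The sketch below (our own illustration, assuming a two-sided z-test at α = 0.05) estimates how often the null hypothesis is rejected when the true effect is only 0.1 standard deviations:

```python
import math
import random

random.seed(42)

def reject(sample, mu0, sigma, alpha_z=1.96):
    """True if a two-sided z-test at alpha = 0.05 rejects H0: mean == mu0."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return abs(z) > alpha_z

def rejection_rate(n, true_mean, trials=500):
    """Monte Carlo estimate of the probability of rejecting H0: mean == 0."""
    hits = sum(
        reject([random.gauss(true_mean, 1.0) for _ in range(n)], 0.0, 1.0)
        for _ in range(trials)
    )
    return hits / trials

# the same tiny true effect (0.1 sd), at a small and a very large sample size
r_small = rejection_rate(n=20, true_mean=0.1)
r_large = rejection_rate(n=2000, true_mean=0.1)
print(r_small, r_large)
```

At n = 20 the tiny effect is rarely detected (β is large), while at n = 2000 it is detected almost every time; statistical significance alone says nothing about whether an effect of 0.1 standard deviations matters to a beekeeper.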

The power, or sensitivity, of a test can be used to determine sample size (see section 3.2.) or minimum effect size (see section 3.1.3.). Power is the probability of correctly rejecting the null hypothesis when it is false (power = 1 – β), i.e. power is the probability of not committing a Type II error (when the null hypothesis is false), and hence the probability that one will identify a significant effect when such an effect exists. As power increases, the chance of a Type II error decreases. A power of 80% (90% in some fields) or higher is generally considered acceptable. As a general comment, the words "power", "sensitivity", "precision", and "probability of detection" are, or can be, used synonymously.
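The relationship power = 1 – β, and its dependence on sample size, can be shown with a simple approximate formula. This sketch assumes a two-sided z-test with known standard deviation and hard-codes 1.96 as the critical value for α = 0.05; it ignores the negligible far tail, so it is an approximation, not a general power routine:

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def ztest_power(effect, sigma, n, z_crit=1.96):
    """Approximate power of a two-sided z-test at alpha = 0.05.

    effect: true difference from the null value; sigma: known sd; n: sample size.
    """
    return phi(effect / (sigma / math.sqrt(n)) - z_crit)

# power grows with sample size for a fixed effect of 0.5 sd
for n in (10, 50, 200):
    print(n, round(ztest_power(effect=0.5, sigma=1.0, n=n), 3))
```

Running the loop shows power climbing from well below the 80% convention at n = 10 to essentially 1 at n = 200; β is simply one minus each of these values.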

Table 1. The different types of errors in hypothesis-based statistics.

| Statistical result      | The null hypothesis (H0) is true                      | The null hypothesis (H0) is false                      |
|-------------------------|-------------------------------------------------------|--------------------------------------------------------|
| Reject null hypothesis  | Type I error: α = probability of falsely rejecting H0 | Probability of correctly rejecting H0: (1 – β) = power |
| Accept null hypothesis  | Probability of correctly accepting H0: (1 – α)        | Type II error: β = probability of falsely accepting H0 |