# 3.2.3. Sample size and individual infection rates

Pathogens are a common topic in honey bee research.
The prevalence of a pathogen can be determined at the colony level or at the
population level (see section 2.2.). In most cases, the data will record whether
the pathogen is present or absent in the smallest tested unit, so the number of
positive units follows a binomial distribution. Hence, the required sample size
depends largely on the __detection
probability__ of the pathogen. However, with viruses (and possibly other
pathogens), concentration of virus particles is measured on a logarithmic scale
(Gauthier *et al.*, 2007;
Brunetto *et al.*, 2009). This means, for example, that the virus titre of a
pooled sample is disproportionately determined by the one bee with the highest
individual titre. To meet the assumption of normality made by many parametric
analyses, we suggest a power transformation of these data (Box and Cox, 1964;
Bickel and Doksum, 1981). For further reading on sample size determination for
log-normally distributed variables, see Wolfe and Carlin (1999).
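
The link between detection probability and sample size, and the dominance of the highest individual titre in a pooled sample, can be illustrated with a short Python sketch. It assumes that units are sampled independently, that the test detects every infected unit, and that a pool combines equal amounts from each bee, so that its titre is the arithmetic mean of the individual titres. The function names and example values are hypothetical and are not taken from the cited studies.

```python
import math


def sample_size_for_detection(prevalence, confidence=0.95):
    """Smallest n for which P(at least one positive among n units) >= confidence.

    Assumes independent units and a test that never misses an infected unit,
    so P(no positives) = (1 - prevalence) ** n.
    """
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - prevalence))


def pooled_log10_titre(log10_titres):
    """log10 titre of a pool made from equal amounts of each bee.

    Assumption: the pooled titre is the arithmetic mean of the raw
    (back-transformed) individual titres.
    """
    raw_titres = [10 ** t for t in log10_titres]
    return math.log10(sum(raw_titres) / len(raw_titres))


if __name__ == "__main__":
    # Hypothetical prevalences; lower prevalence requires more units.
    for prevalence in (0.05, 0.10, 0.25):
        print(prevalence, sample_size_for_detection(prevalence))

    # Hypothetical pool: nine bees at 10^3 particles and one bee at 10^9.
    titres = [3.0] * 9 + [9.0]
    print("mean of log10 titres:", sum(titres) / len(titres))
    print("log10 of pooled titre:", pooled_log10_titre(titres))
```

In the hypothetical pool, the mean of the individual log10 titres is 3.6, whereas the pooled sample reads close to 8 on the log10 scale, because the single highly infected bee contributes almost all of the virus particles.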

In summary, a minimum of 30 independent observations per treatment may be desirable (the lowest level of independence will almost always be the cage), but practical constraints and large effect sizes will lower this number, especially for experiments using groups of caged honey bees. Because of this, methods for maintaining workers individually in cages for a number of weeks should be developed. This would be an advantage because, depending on the experimental question, each honey bee could then be considered an independent experimental unit.

The same principles of experimental design that apply to the recommended number of cages also apply to other levels of the design, such as the number of honey bees per cage. With smaller effect sizes and more complex questions, recommended sample sizes necessarily increase (in other words, the more variables or factors included, the greater the sample size has to be). Researchers must think about, and be able to justify, how many of their replicates are truly independent; 30 replicates is a reasonable starting point when effect sizes are unknown but, again, this may not be realistic.

In the context of wax production and comb building, colony size and queen status play a role. For example, comb construction only takes place in the presence of a queen and at least 51 workers, and egg-laying occurs only if a mated queen is surrounded by at least 800 workers (reviewed in Hepburn, 1986; page 156).

Additionally, novel experiments on new sets of variables mean uncertainty in outcomes and, more importantly, experimental designs that are uninformed and may be less than optimal. Designs should always be scrutinised and improved by including preliminary trials, which could, for example, provide a better idea of prevalence and hence a better estimate of the required sample size.
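
As a rough illustration of how the required number of independent replicates grows as the expected effect shrinks, the following Python sketch uses the standard normal approximation for comparing the means of two treatments. The standardised effect size (difference in means divided by the common standard deviation), the significance level, and the power are assumptions to be supplied by the researcher, ideally from preliminary trials. The function name is hypothetical, and the approximation slightly underestimates the number needed for a *t*-test with small samples.

```python
import math
from statistics import NormalDist


def replicates_per_treatment(effect_size, alpha=0.05, power=0.80):
    """Approximate number of independent replicates per treatment.

    Normal approximation for a two-sided comparison of two means:
    n = 2 * ((z_(1 - alpha/2) + z_power) / effect_size) ** 2
    where effect_size is the standardised difference between treatments.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical effect sizes; smaller effects need more replicates.
    for d in (0.3, 0.5, 0.8, 1.0):
        print(d, replicates_per_treatment(d))
```

With these defaults, a standardised effect of 0.8 requires roughly 25 independent replicates per treatment and an effect of 0.5 roughly 63, which is consistent with 30 replicates being a starting point rather than a guarantee of adequate power.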