# 2.2.4. Alpha diversity

Estimators of within-community (alpha) diversity have been proposed and refined for decades (Whittaker, 1972; Magurran, 2004). For NGS surveys of bacterial symbionts, three measurements of alpha diversity are commonly used: rarefaction curves, species richness estimators (often in conjunction with rarefaction curves), and community diversity indices. Bee-associated bacterial surveys commonly report all three measures, but the abundance of 16S amplicon sequences can be a poor predictor of relative bacterial abundances (Amend *et al.*, 2010). Estimates of within- and between-community diversity that rely on 16S amplicon sequence abundance should therefore be interpreted with caution. Recently, a method to account for 16S gene copy number in estimating bacterial abundance was developed (Kembel *et al.*, 2012), which may help improve the accuracy of bacterial diversity measurements based on 16S amplicons.
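
The basic idea behind copy-number correction can be sketched as follows. This is a minimal illustration only, not the implementation of Kembel *et al.* (2012): it assumes that a 16S copy number is already known for every taxon, and the function name, taxon labels, and copy numbers are hypothetical. Each read count is divided by the copy number of its taxon, and the results are rescaled to relative abundances.

```python
def copy_number_corrected_abundance(read_counts, copy_numbers):
    """Return relative abundances after dividing reads by 16S copy number.

    read_counts:  dict mapping taxon -> number of 16S amplicon reads
    copy_numbers: dict mapping taxon -> assumed 16S gene copies per genome
    """
    # A taxon with many 16S copies contributes more reads per cell,
    # so its read count is scaled down by its copy number.
    corrected = {
        taxon: read_counts[taxon] / copy_numbers[taxon]
        for taxon in read_counts
    }
    total = sum(corrected.values())
    # Rescale so that the corrected values sum to 1.
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical example: taxon_A dominates the reads only because it is
# assumed to carry seven 16S copies per genome.
reads = {"taxon_A": 700, "taxon_B": 300}
copies = {"taxon_A": 7, "taxon_B": 1}
corrected_abundance = copy_number_corrected_abundance(reads, copies)
```

In this made-up case the uncorrected read proportions (0.7 and 0.3) reverse after correction, which shows why diversity estimates based on raw read abundance deserve caution.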

- Species richness estimators estimate the total number of species present in a community. The *Chao 1* index is commonly used, and is based upon the number of rare classes (i.e. OTUs) found in a sample (Chao, 1984):

  $$S_{est} = S_{obs} + \frac{f_1^2}{2 f_2}$$

  where $S_{est}$ is the estimated number of species, $S_{obs}$ is the observed number of species, $f_1$ is the number of singleton taxa (taxa represented by a single read in that community), and $f_2$ is the number of doubleton taxa. If a sample contains many singletons, it is likely that more undetected OTUs exist, and the *Chao 1* index will estimate greater species richness than it would for a sample without rare OTUs. Besides the *Chao 1* estimator, mothur includes several other species richness estimators and a wrapped version of CatchAll, which calculates 12 different estimators and proposes a best estimate of species richness (Bunge *et al.*, 2012). QIIME also includes the *Chao 1* estimator along with several other species richness estimators.
- Rarefaction curves are used to determine whether sampling depth was sufficient to accurately characterize the bacterial community being studied. To build rarefaction curves, each community is randomly subsampled without replacement at different intervals, and the average number of OTUs at each interval is plotted against the size of the subsample (Gotelli and Colwell, 2001). The point at which the number of OTUs no longer increases with further sampling is the point at which sampling has been sufficient to accurately characterize the community. Mothur and QIIME will both calculate rarefaction for observed and estimated species richness. QIIME will additionally create graphs of rarefaction curves, while mothur outputs results that can be imported into graphing software.
- Community diversity indices combine species richness and abundance into a single value of evenness. Communities that are numerically dominated by one or a few species exhibit low evenness, while communities where abundance is distributed equally amongst species exhibit high evenness (Gotelli, 2008). Two of the most widely used indices are the Shannon (or Shannon-Wiener) index (Shannon, 1948) and Simpson’s index (Simpson, 1949). A recommended index that is not sensitive to sample size is the Probability of an Interspecific Encounter (PIE; Hurlbert, 1971):

  $$PIE = \frac{N}{N-1}\left(1 - \sum_{i=1}^{S} p_i^2\right)$$

  where $N$ is the sample size, $p_i$ is the proportion of the sample that is made up of individuals of species $i$, and $S$ is the number of species in the sample. PIE is bounded between 0 (a community composed of a single species) and 1 (a community composed of an infinite number of equally abundant species), but is not currently included in either mothur or QIIME. Both mothur and QIIME include multiple community diversity indices.
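
The three measures described above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch, not the mothur or QIIME implementation. It assumes each community is given as a vector of read counts per OTU, and the function names, iteration count, and example counts are hypothetical. When no doubletons are present the classic *Chao 1* formula is undefined, so the sketch falls back on the bias-corrected form, which is an assumption on our part.

```python
import numpy as np


def chao1(counts):
    """Chao 1 richness estimate: S_obs + f1^2 / (2 * f2)."""
    counts = np.asarray(counts)
    counts = counts[counts > 0]
    s_obs = counts.size
    f1 = int(np.sum(counts == 1))  # singleton OTUs
    f2 = int(np.sum(counts == 2))  # doubleton OTUs
    if f2 == 0:
        # Assumption: bias-corrected form, defined without doubletons.
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 ** 2 / (2.0 * f2)


def rarefaction_curve(counts, depths, n_iter=100, seed=0):
    """Mean number of OTUs observed at each subsampling depth."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    # Expand the count vector into one OTU label per read.
    reads = np.repeat(np.arange(counts.size), counts)
    means = []
    for depth in depths:
        # Subsample reads without replacement and count distinct OTUs.
        richness = [
            np.unique(rng.choice(reads, size=depth, replace=False)).size
            for _ in range(n_iter)
        ]
        means.append(float(np.mean(richness)))
    return means


def hurlbert_pie(counts):
    """Probability of an Interspecific Encounter (Hurlbert, 1971)."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()  # sample size N; must be greater than 1
    p = counts / n    # proportion of the sample in each species
    return n / (n - 1.0) * (1.0 - np.sum(p ** 2))


# Hypothetical community: 22 reads spread over 7 OTUs.
example_counts = [10, 5, 2, 2, 1, 1, 1]
richness_estimate = chao1(example_counts)
curve = rarefaction_curve(example_counts, depths=[1, 5, 11, 22])
evenness = hurlbert_pie(example_counts)
```

For the example counts there are three singletons and two doubletons, so the *Chao 1* estimate exceeds the seven observed OTUs. Plotting `curve` against the chosen depths gives the rarefaction curve; a curve that is still rising at the full sequencing depth suggests that deeper sequencing would recover additional OTUs.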