4.3. Molecular selection tools

Note: many of the methods mentioned below are outlined in the BEEBOOK paper on molecular research techniques (Evans et al., 2013).

The completion of the honey bee genome project held the promise of fast selection of colonies with desirable traits (Weinstock et al., 2006). Knowing the genes coding for any particular trait would, in theory, allow the selection of queens and drones with desired genotypes for further breeding without evaluation of colony traits. At present, however, much more knowledge is needed before this promise can be delivered. Further complications arise from the complexity of honey bee genetics. It seems that the colonies that perform best do so because of a high level of genetic diversity amongst the workers (Seeley and Tarpy, 2007). A colony is composed of two generations, the queen and her worker offspring, and combines the effects of usually more than ten chromosome sets owing to the multiple matings of the queen. This makes the role that selection for a single trait at the individual level can play questionable, especially when translated into colony performance. In more advanced and complex breeding programmes, genome-wide marker assisted selection may boost the accuracy of genetic improvement in honey bees (Meuwissen et al., 2001). Recent developments in the sequencing of single nucleotide polymorphisms (Harismendy et al., 2009) and bioinformatic approaches to data evaluation (Pérez-Sato et al., 2010) can make breeding programmes for honey bees more reliable. However, such an approach requires considerable resources and expensive laboratory work.

Even before completion of the honey bee genome, scientists started the search for quantitative trait loci (QTL) in honey bees using different kinds of markers:

  • Hunt et al. (1995) used bees preselected for variation in their pollen hoarding behaviour to search for the underlying genetic traits. Using genetic markers derived from a technique called random amplification of polymorphic DNA (RAPD), they identified first two and later a third marker (Page et al., 2000). Each marker held predictive power concerning the preference of a given forager for collecting either nectar or pollen. The RAPD loci observed are not thought to be directly responsible for the variance in the traits; they are merely closely linked to a genetic region that primes the bees’ behaviour in the direction of pollen or nectar collection.
  • Using similar RAPD markers with the addition of DNA microsatellites and a sequence tagged site, Lapidge et al. (2002) detected seven loci linked to hygienic behaviour in honey bees. This finding conflicts with the two loci described by Rothenbuhler (1964; see however Moritz, 1988), but it may result from the use of strains that were less extremely selected than those in the earlier studies.
  • Today RAPD markers are all but forgotten, as is the related method of amplified fragment length polymorphism (AFLP) used by Rüppelt et al. (2004).

A variety of markers with accurate linkage maps exists today for the preliminary screening for QTL:

  • At first, the DNA microsatellites carefully mapped by Solignac et al. (2004) became the marker of choice.
  • Since the genomic information became available (Weinstock et al., 2006), single nucleotide polymorphisms (SNPs) also allow cheap and accurate targeting of QTL. Recently a set of 44,000 markers has become commercially available (Spötter et al., 2011), providing robust coverage of the honey bee genome. A study of “varroa-specific defence behaviour” using this marker set showed that it is important to examine several control populations to avoid SNPs that appear significant merely by chance. In that study, taken at the highest level of significance, more than 151 SNPs differed between the reference sample of “varroa-defence bees” and a set of bees from completely unhygienic colonies, whereas 7 SNPs differed between varroa-defence bees and related workers not engaging in defensive behaviour. When all three groups were compared, only a single SNP remained. This result demonstrates the value of having appropriate samples available.
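
The filtering logic described for the SNP study can be illustrated with a minimal sketch. It assumes that "comparing all three groups" amounts to keeping only those SNPs that are significant in every pairwise comparison; the cited study may have used a different procedure. The function name and all SNP identifiers below are hypothetical and are not taken from Spötter et al. (2011).

```python
def shared_candidates(*comparisons):
    """Return the SNPs that are significant in every comparison.

    Each argument is an iterable of SNP identifiers found significant
    in one comparison (e.g. defence bees vs. one control population).
    """
    sets = [set(c) for c in comparisons]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical identifiers, for illustration only.
vs_unhygienic_colonies = {"snp_001", "snp_002", "snp_003", "snp_004"}
vs_related_non_defenders = {"snp_003", "snp_005"}

candidates = shared_candidates(vs_unhygienic_colonies, vs_related_non_defenders)
print(sorted(candidates))  # ['snp_003']
```

A SNP that differs only from an unrelated control population may reflect population structure rather than the trait, which is why adding a second, closely related control shrinks the candidate list so sharply.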


The current rapid developments in the availability and pricing of DNA sequencing may eventually replace all these linkage-bound methods with a direct, sequence-based search for the genetic variance underlying each trait.

  • A separate methodology to identify marker genes has emerged from the use of microarray techniques. Microarrays consist of a set of known honey bee genes and allow the detection of mRNA levels in specific workers. The microarrays are built from expressed sequence tag (EST) results obtained from the mRNA of bees, which after transformation into cDNA are cloned and can be analysed rather swiftly (Whitfield, 2002). Based on genetic information from Drosophila melanogaster, many of the gene functions are well known. An example of the application of this technique is the study of the reaction of honey bee brood to parasitism by varroa mites (Navajas, 2008). The strength of this technique lies in the immediate detection of differential gene activity in bees with variable traits. It is thus feasible to directly identify the action of genes related to specific traits. The currently available microarrays allow the screening of more than 8000 genes identified from the honey bee brain. Any gene that is unidentified or not included in the microarray will, however, go undetected. This is particularly important for promoter regions that act as switches for coding genes, as these are likely to go unnoticed in such studies.
  • While interactions between coding genes and their regulator genes may go unnoticed by microarray techniques, the use of SNP markers might be particularly suitable for the detection of promoter regions. In humans, two independent SNPs have been shown to generate lactose tolerance in adults (Tishkoff, 2007).

QTL methods are particularly applicable to honey bees because of their rather small genome and high rate of recombination. Furthermore, the haploid stage of the drone allows direct testing of traits expressed at the individual level, although testing remains more complex for colony level traits. If workers can be observed to harbour a significant fraction of a colony’s traits, like those engaging in hygienic behaviour, these too can be employed for this type of study. Due to the multiple matings of the queen with haploid drones, a colony will typically consist of more than 10 subfamilies. Each subfamily, often referred to as a “patriline”, effectively acts as a linkage group sharing the paternal fraction of the genome. Bees within a particular patriline are variable for the remaining queen contributions. This allows the testing of genotype interactions, both at the individual worker level and at the colony level. Finding QTLs or genes affecting complex colony traits, like swarming behaviour, honey production or gentleness, will demand thorough testing and considerable skills at both the molecular and the computational level. The main problem remains to demonstrate, in a considerable set of colonies, that heritable variance exists for the trait of choice. Only once a large sample is available, representing both variation and similarity between the screened colonies, would it seem worthwhile to conduct a molecular genetic screening.

A caveat in the interpretation of genetic marker data results from the vast number of genes screened, whether as genetically mapped markers or in microarray studies. Chance differences in marker diversity between tested bees, or in the activity of genes unrelated to the trait under study, are rather likely given the vast number of comparisons. Hence it is advisable to demand particularly strict statistical testing before accepting a particular marker as involved. One way to reduce this problem is to repeat the study in several independent populations.
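
To make the scale of the problem concrete: if 44,000 markers were each tested at a significance level of 0.05 and none were truly associated with the trait, roughly 2200 would still be expected to appear significant by chance. The sketch below shows two standard corrections for multiple comparisons, the Bonferroni correction and the Benjamini-Hochberg procedure. Neither is prescribed by the text above; they are offered only as common examples of stricter testing, and the p-values are invented for illustration.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of chance hits if no marker has a true effect."""
    return n_tests * alpha


def bonferroni(p_values, alpha=0.05):
    """Reject only tests with p <= alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Control the false discovery rate at level alpha.

    Sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


# Invented p-values for ten hypothetical markers.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

print(sum(p <= 0.05 for p in p_values))    # 5 uncorrected hits
print(sum(bonferroni(p_values)))           # 1
print(sum(benjamini_hochberg(p_values)))   # 2
```

Bonferroni is the more conservative of the two, while Benjamini-Hochberg tolerates a controlled proportion of false discoveries and therefore retains more candidates. Either way, replication in independent populations remains a separate and complementary safeguard.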

While the arrival of molecular markers will allow rapid selection, some words of caution are needed. It may seem straightforward to select for an identified genotype in a separate population, if this genotype has been found to be associated with particularly valuable traits. As a shortcut, it may be equally tempting to cross a set of genes into an unrelated population and, using marker assisted selection, follow their fate in subsequent generations. Organisms resulting from this technique have been termed cis genetically modified organisms, in contrast to trans genetically modified organisms, as the genetic exchange happens via traditional interbreeding and genes are not introduced from completely unrelated species. In theory it could be possible to incorporate a single gene into an unrelated population; however, unless considerable care is taken, this will go hand in hand with a significant genetic bottleneck. Whether consumers, be they beekeepers or honey buyers, will accept such cis techniques as being less problematic than standard trans GM techniques remains an open question. Furthermore, searching for identical genotype variations in unrelated populations holds no guarantee of success, as our knowledge of the complex underlying mechanisms is still rather rudimentary. While the future of honey bee breeding may benefit from more advanced molecular methods, it is still an emerging field.