nif:isString
|
-
Each sample consisted of genomic DNA extracted from the gut of a single worker bee taken from the outer frames within colonies. Based on studies of the relationship between worker age and behavioral traits [12], bees in this location are expected to be guard bees, of ∼16 days of age although other workers might occasionally be included; a study of colonization of the worker gut suggested that colonization occurs by day 9 following emergence from the pupal stage [10]. We sampled 2 localities, consisting of the USDA Agricultural Research Service Bee Labs in Tucson, Arizona and in Beltsville, Maryland, on 4/28/2011 and 4/20/2011 respectively. In each location, 5 bees were sampled from each of 4 colonies, for a total of 40 samples representing individual bees. Bees were preserved in 95% ethanol after collection and prior to dissection. Whole guts from ventriculus to rectum were aseptically dissected from 5 randomly selected workers for each colony. The dissected guts were placed in a sterile 1.5 mL pestle tube with 710 µl buffer AG (200 mM NaCl, 200 mM Tris, 20 mM EDTA, plus 6% SDS) and were homogenized by maceration with scissors and then crushed with a disposable sterile pestle (Bel-Art Products). The homogenate was then added to a sterile bead tube containing 500 µl of phenol/chloroform/isoamyl pH 7.9 (Ambion) along with ∼500 µl 0.1 mm silica zirconia beads (BioSpec Products, Bartlesville, OK). The bead tubes were placed in a BioSpec high speed bead beater, beaten at the maximum setting for 3 min, then spun at 1000 RPM for 2 min. The resulting aqueous phase was extracted with a second phenol/chloroform/isoamyl preparation in a Light Phase Lock Gel tube (5 Prime). The aqueous phase of this extraction was collected and combined with 1/10 volumes sodium acetate pH 5.5 (American Bioanalytical) and an equal volume of isopropyl alcohol (American Bioanalytical). The samples were then allowed to incubate at −20°C overnight and then spun at 14,000 RPM for 30 min in a 4°C microcentrifuge. The pellets were washed with 70% ethanol and dried for 5 min in an unheated vacuum evaporator. The pellets were resuspended in 100 µl TE pH 8 (10 mM Tris pH 8 and 1 mM EDTA) and incubated for 30 min at 37°C with 2 µl RNAse A (Qiagen). These extracts were then further purified with a Qiagen QIAquick column and eluted in 30 uL Buffer EB (Qiagen). The final extracts were quantified using a Qubit dsDNA broad range assay (Invitrogen) and the resulting DNA samples were sent to the Joint Genome Institute (JGI).
At JGI, the V6–V8 regions of the 16S rRNA gene of the samples were amplified in triplicate using universal 16S rRNA primers adapted with 454 FLX Titanium sequences. The forward primer was 926F454 Tit F 5′- CCTATCCCCTGTGTGCCTTGGCAGTCTCAG-aaactYaaaKgaattgacgg-3′ (Lib B adapter is in caps) and the reverse barcoded primer was 1392R454 Tit R 5′- CCATCTCATCCCTGCGTGTCTCCGACTCAG-NNNNN-acgggcggtgtgtRc (Lib A is in caps and the variable barcode region is denoted by N's). Amplicons were sequenced using Roche 454 Titanium sequencing (for details see http://www.jgi.doe.gov/sequencing/protocols/index.html).
The retrieved sequences were processed by excluding sequences not matching the primers, with low quality for 10% or more of the sequence, or with fewer than 200 base pairs. These reads were deposited in the GenBank Sequence Read Archive (accession SRA046735). For analyses, all reads were trimmed to the same aligned 180 bp region. Retained sequences were then grouped into clusters of 100% identity and also into clusters of 97% or greater identity, using the “PyroTagger” pipeline [13]. These were classified into major taxonomic lineages using blastn against greengenes (greengenes.lbl.gov/) for prokaryotic sequences and against silva (www.arb-silva.de/) for eukaryotic sequences. Our primers amplified rRNA from some eukaryotes, and the clusters included many bee sequences (19% of total), some plant or chloroplast sequences presumably from ingested pollen or nectar (0.2% of total), and a few fungal, and microsporidian sequences (totaling fewer than 0.1% of total), in addition to bacterial sequences (81% of total); no archaeal sequences were retrieved. Since we were interested in bacterial diversity, eukaryotic (including chloroplast) clusters were removed. Clusters that represented fewer than 1% of bacterial sequences in every sample were removed; most were present as one or few sequences in a single individual and absent from most samples. Only 11 clusters remained, and these represented 98.5% of all bacterial sequences in the dataset. These clusters were considered as Operational Taxonomic Units (OTUs) in our analysis. The top blastn hits against the GenBank nucleotide database were determined for representative sequences of each of these 11 OTUs. The gut community for each bee was represented as the proportion of sequences derived from each of the 11 OTUs. To determine the sources of small clusters other than these 11, we used representative sequences from every bacterial cluster containing more than 3 sequences as queries in blastn searches against GenBank. Sequences with the same first hits were pooled, yielding only 10 groups. These 10 groups corresponded to the same 11 clusters as in the first analysis (with two pooled together) but also included small clusters closely related to the large clusters. These 10 clusters included 99.98% of the bacterial sequences. For richness and evenness estimates, EstimateS [14] was used. Individuals were treated as sample units for each colony (N = 5 per colony). For richness estimates, ACE (abundance-based coverage estimator) [15] estimates were computed. Evenness of bacterial OTUs within each colony was estimated using Simpson's measure of evenness (E1/D), which is the reciprocal form of Simpson's dominance index (D) [16] divided by the number of species in the sample [17]. An ANOVA was used to test mean differences of OTU evenness between AZ and MD sites [17]. Non-parametric Kolmogorov-Smirnov Z was used to test mean differences of OTU richness between AZ and MD sites [18]. To explore differences among individuals, colonies and sites, we used multivariate analyses of community profiles. OTUs were defined as the 11 sequence clusters comprising 98.5% of the bacterial sequences. In a second analysis, OTUs were defined as the 10 clusters corresponding to diagnostic blastn hits (above) and containing 99.98% of the sequences. The two approaches gave very similar results, and only the first is reported. Bacterial sequence reads per individual bee were standardized to the same sample size before multivariate community analyses were conducted. Standardization was carried out by randomly selecting 948 reads (the smallest sample size) per individual using Perl. For all community multivariate analyses, PCORD (version 4.25) was used [19]. Multi-Response Permutation Procedures MRPP [20] were used to test for differences among a priori groups (bee colonies and sites). Nonmetric multidimensional scaling (NMS) [21], [22] was used to compare bee gut microbiota assemblages among bee individuals. One extreme outlier (bee individual AZ_109_3) was removed, since extreme outliers distort ordination solutions. All criteria, distance measures, and parameters chosen are similar to those in Hansen et al. [23].
Briefly, a two-dimensional solution was chosen based on a combination of low stress, final instability, and P-values. 500 iterations were conducted. Non-parametric indicator species analysis (ISA) [24] was used to identify the OTUs that best described differences between sites, based on two independent measurements of species distribution, specificity and fidelity (i.e., an OTU was specific to a particular group (specificity) and widespread in all samples of that group (fidelity)). Potential indicator values (IndVal) that can result from ISA range from 0–100 where values >25 signify a good indicator [24]. Parameters and criteria are similar to those in Hansen et al. [23].
Strain variation within Snodgrassella and Gilliamella: Because pyrotag sequences were too short for fine-scale discrimination of strains within each phylotype, we obtained Sanger sequence data to assess the extent of strain variation within the Snodgrassella and Gilliamella phylotypes. Reverse primers were designed to complement sequences within the region used to align pyrosequence reads (positions between 1100 to 1300). The Snodgrassella primer (betacd1 5′- TTCGCTACCCTCTGTACCGACCATT - 3′) or Gilliamella primer (gma1cd2 5′-TCGCCTCCCTTTGTATACGCCATT-3′) were paired with 16S rRNA universal primer 27Fshort (5′-GAGTTTGATCCTGGCTCA-3′) in 25 µL reactions of 0.8 U Taq DNA Polymerase (New England BioLabs), 0.25 mM of each dNTP, 1x PCR buffer, and 0.5 µM concentrations of each primer. Fifty ng of DNA from individual bee samples: AZ100.3, AZ107.2, AZ109.2, AZ125.5, MD19.5, MD216.5, MD299.4 and MD365.2 were added to these reactions, and nuclease-free water was used in the negative control. These targeted regions of 16S rRNA sequence were amplified via PCR as follows: 4 min at 94°C; 35×30 s at 94°C; 30 s at 50°C, 1 min at 72°C; 10 min at 72°C. PCR products were visualized on a 1% agarose gel run at 90 V for 40 min and stained with ethidium bromide. Amplified PCR product was cleaned using Agencourt Ampure XP beads (Beckman Coulter), ligated into pGEM-T vector (Promega) and used to transform chemically competent DH5α E. coli (Invitrogen) using the manufacturer's recommended protocol. Twenty-four transformants were selected from each library, and inserts were verified by colony PCR using T7 (5′-TAATACGACTCACTATAGGG-3′) and SP6 (5′-ATTTAGGTGACACTATAG-3′) primers. The following amplification protocol was used: 1×94°C for 2 min; 28×10 s at 94°C, 20 s at 46°C for 20 s, 90 s at 72°C; 1×72°C for 1 min. Three µl of this reaction mix was run on an agarose gel and visualized to ensure incorporation of the approximately 1300 bp insert. DNA was then cleaned with Ampure beads and sent for bidirectional Sanger sequencing at the Yale Science Hill DNA Facility using T7 or SP6 sequencing primers. We checked returned sequences using the RDP Classifier v2.2 [25], and discarded a single sequence from the Snodgrassella set not matching Betaproteobacteria. The remaining sequences were curated using Geneious, version 5.5 [26]. They were trimmed of vector sequence and checked for quality. Poor quality reads (<800 bp) were discarded (1 from the Snodgrassella set and 7 from the Gilliamella set). Sequences were checked for chimeras using the Greengenes Bellerpheron tool (greengenes.lbl.gov/), resulting in elimination of three additional sequences from the Snodgrassella set. Remaining sequences were aligned for each set separately, using MUSCLE within Geneious with a maximum of 8 iterations. Equivocal base calls and gaps were inspected and corrected as necessary. Duplicate sequences were removed, and one of each was used for the alignment (see Table 1) All of the nearly full-length sequences were aligned along with representative sequences from bacteria from other studies of A. mellifera and of related host taxa (e.g. Bombus spp. ), and divergent sequences were used as queries in blastn searches of GenBank nucleotide data [27] to ensure that they had a strong match to the respective relevant phylotype. Three Gammaproteobacterial sequences from bee AZ109.2 had highest matches to Serratia species and were removed from the Gilliamella alignment. Blastn searches and neighbor-joining trees for the Gilliamella phylotype showed that a subset of sequences was derived from the related Gamma2 phylotype; these were excluded from analyses of strain variation. Alignments were edited and parsimony uninformative sites removed using MEGA [28]. Strain variation was assessed using DNAsp package [29] to estimate average pairwise sequence divergence, minimum number of recombination events, and polymorphisms restricted to a single locality or shared between localities. In the case of Gilliamella, a large number of recombination events was evident. A phylogenetic tree based on sequence data assumes clonal or near-clonal replication of the sequence and no recombination between sequences; the relationships of Gilliamella sequences would be better represented by a complex web than a tree. Therefore, no phylogenetic tree was constructed. In the case of Snodgrassella, a Neighbor-Joining tree was built using a Tamura-Nei distance model, within MEGA [28]. Representative database sequences from the Snodgrassella phylotype were included in the analysis. Sequences derived from Bombus formed a more divergent cluster, and these were used as an outgroup to the A. mellifera-derived sequences.
Table data removed from full text. Table identifier and caption: 10.1371/journal.pone.0036393.t001 Summary of Pyrotag Reads from Honey Bee Samples, Including the Representation of Known Bee Phylotypes. Final sequences for these phylotypes were deposited in GenBank with the accession numbers JQ581680-JQ582008 Characterization of 16S rRNA sequences for Gamma3 and Gamma4 phylotypes: To further characterize the novel clusters (Gamma3 and Gamma4), ∼50 ng of DNA from individual bee preps (AZ107.4, AZ109.1, and AZ109.3) was used to perform PCR amplification of near full-length 16S rRNA sequences, using the following universal primers: 27F′-HT (5′-AGRGTTTGATYMTGGCTCAG-3′) and 1492R′-HT (5′-GGYTACCTTGTTACGACTT-3′) [30]. PCR product was cleaned using Agencourt Ampure XP beads (Beckman Coulter), ligated into pGEM-T vector (Promega) and used to transform chemically competent DH5a E. coli (Invitrogen) using the manufacturer's recommended protocol. Transformants were selected from each individual bee's library, and inserts were screened by colony PCR using the T7 primer and Gamma3/Gamma4 specific primers. Lack of primer specificity resulted in no suitable sequences for Gamma3. Gamma4 was targeted by the reverse primer γ4R1 (5′- GTGCTACAATGGCGCATACA -3′) using a PCR protocol with initial denaturation of 94°C for 4 min followed by a 11 touchdown cycles of 30 s denaturation at 94°C, 20 s annealing step declining from 57°C to 46°C, and 1.5 min elongation step at 72°C, then 30 standard cycles with annealing at 46°C and final extension at 72°C for 5 min. Resultant positive amplifications were Sanger-sequenced at the Yale, Science Hill DNA Facility and aligned to the Gamma4 reads from the pyrosequencing dataset. Three nearly full length sequences showing identity or near identity (∼100%) to the Gamma4 reads were deposited in GenBank under the accession numbers JQ582009-JQ582011. Blastn of these Gamma4 sequences reveals similarity (≤96%) to members of the family Enterobacteriaceae.
|