
24 December 2014
By Zannah Salter

Summary of the metagenomic data for samples of the bacterium S. bongori, extracted with four different kits. a) As the starting material becomes more diluted, the proportion of sequenced reads mapping to the S. bongori reference genome decreases and contamination becomes more prominent. b) The profile of the non-Salmonella reads is different for each of the four kits.
Credit: DOI: 10.1186/s12915-014-0087-z
Microorganisms can be found in almost every environment - for example in complex, dense communities in soil or in the human gut, or in sparse numbers in the air or even circulating in your bloodstream.
Communities of bacteria, the microbiota, can be identified by sequencing all their DNA (metagenomics) or by sequencing a target such as the 16S rRNA gene, which is ubiquitous among bacteria.
However, in low-biomass samples, there is very little bacterial DNA to analyse. This makes the results vulnerable to contamination from DNA present in the lab environment or in reagents (the substances used to process samples). We have seen a number of reports of low-biomass environments (such as the skin, blood, or extreme environments) containing startlingly similar taxa that are not necessarily biologically expected - such as rhizosphere (plant root-associated) bacteria in human tissues.
Contaminant DNA
It has been reported over the years that reagents, including PCR polymerase and ultrapure water, may contain bacterial DNA [1, 2]. When we sequenced our negative controls in past microbiota studies, we observed small numbers of environmental bacteria that appeared to correlate with a particular DNA extraction kit batch. We suspected that DNA extraction kits could be another source of DNA contamination, and performed an experiment to test this hypothesis.
Testing the impact of contaminant DNA
Replicating the process at the Wellcome Trust Sanger Institute, University of Birmingham and Imperial College London, we sequenced the 16S rRNA gene content of a pure Salmonella bongori culture. The samples were diluted to span high biomass (approximately 10^8 cells/ml) down to low biomass (10^4 cells/ml).
As it was a pure culture, we should have observed just one species in the results, no matter how much it was diluted. However, what we found was a mixture of 270 different genera of bacteria. As the sample became more dilute, the Salmonella made up a smaller and smaller proportion of the results and the other bacteria appeared more dominant.
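One way to see why dilution has this effect is a toy model in which the contaminant DNA is a fixed background and the number of reads is proportional to DNA input. The sketch below rests on assumptions that are not taken from the study: the background level (BACKGROUND) is a made-up value, the dilution series is assumed to run in ten-fold steps, and the function name is purely illustrative.

```python
def target_fraction(target_cells_per_ml: float, contaminant_equiv_per_ml: float) -> float:
    """Expected fraction of reads from the target organism.

    Assumes reads are proportional to DNA input and that contamination
    is a constant background, independent of the sample.
    """
    return target_cells_per_ml / (target_cells_per_ml + contaminant_equiv_per_ml)


# Hypothetical background: contaminant DNA equivalent to 1e4 cells/ml.
# This value is illustrative only, not a measurement.
BACKGROUND = 1e4

# Dilution series from 1e8 down to 1e4 cells/ml (ten-fold steps are an assumption).
dilutions = [10.0 ** exponent for exponent in range(8, 3, -1)]
fractions = [target_fraction(cells, BACKGROUND) for cells in dilutions]

for cells, fraction in zip(dilutions, fractions):
    print(f"{cells:.0e} cells/ml -> {fraction:.4f} of reads from target")
```

Under these assumptions the target dominates at high biomass, but its share falls steadily as the input approaches the background level, which matches the qualitative trend described above.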
We then sequenced the metagenome of these samples, comparing different types of kit, and found a different profile of bacteria in each one (see the figure above).
Fixing the problem
We concluded that there is a good chance of contamination being introduced from commonly used DNA extraction kits as well as from other reagents or the wider laboratory environment. There are two solutions:
a) attempt to remove the DNA from the reagents before use, although this is difficult to achieve without compromising the effectiveness of certain enzymes and buffers
b) identify and remove the contaminants from the data informatically, after sequencing
Sequencing copious negative controls is useful for identifying contamination, and unusual results - such as certain bacteria being present in an environment where they have never before been observed - should be corroborated by culture or in situ visualisation.
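As a rough illustration of option (b), the sketch below flags every genus that appears in any negative control and drops it from a sample's read counts. This is a deliberately simple approach under stated assumptions, not the method used in the study: the function names, the min_reads threshold, the control labels and all counts are hypothetical, and genus_A and genus_B are placeholders rather than real taxa.

```python
from typing import Dict, Set

Counts = Dict[str, int]  # genus name -> read count


def contaminant_genera(negative_controls: Dict[str, Counts], min_reads: int = 1) -> Set[str]:
    """Return genera seen with at least `min_reads` reads in any negative control."""
    flagged: Set[str] = set()
    for counts in negative_controls.values():
        flagged.update(genus for genus, reads in counts.items() if reads >= min_reads)
    return flagged


def remove_contaminants(sample: Counts, flagged: Set[str]) -> Counts:
    """Drop flagged genera from one sample's read counts."""
    return {genus: reads for genus, reads in sample.items() if genus not in flagged}


# Made-up counts for illustration only.
controls = {
    "blank_kit_batch_1": {"genus_A": 120, "genus_B": 15},
    "blank_kit_batch_2": {"genus_A": 90},
}
sample = {"Salmonella": 5000, "genus_A": 300, "genus_B": 20}

flagged = contaminant_genera(controls)
cleaned = remove_contaminants(sample, flagged)
print(sorted(flagged))
print(cleaned)
```

A filter this blunt would also discard a genuine member of the community if it happened to turn up in a control, so in practice a read threshold such as min_reads, or a comparison of abundance between controls and samples, would need careful thought.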
Zannah Salter has been a Research Assistant with the Pathogen Genomics group at the Wellcome Trust Sanger Institute since 2008. She works on the human microbiota.
References
- Tanner MA et al. (1998). Specific Ribosomal DNA Sequences from Diverse Environmental Settings Correlate with Experimental Contaminants. Applied and Environmental Microbiology. PMCID: PMC106828
- Grahn N et al. (2003). Identification of mixed bacterial DNA contamination in broad-range PCR amplification of 16S rDNA V1 and V3 variable regions by pyrosequencing of cloned amplicons. FEMS Microbiology Letters. DOI: 10.1016/S0378-1097(02)01190-4
- Salter SJ et al. (2014). Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biology. DOI: 10.1186/s12915-014-0087-z