16S rRNA Trimmed Data Set

Raw 16S rRNA sequence reads must be processed before they can be used to infer useful taxonomic information. This page describes the baseline processing and analysis performed by the DACC on HMP 16S rRNA data, specifically Production Phase 1 (PP1) 16S rRNA data from the HMP Center "Healthy Cohort", which is available from the NCBI Sequence Read Archive as Study SRP002395. The data set comprises 7518 preparations from 5034 samples. 16S variable region V3-5 was sequenced for all samples, and variable regions V1-3 and V6-9 were sequenced for subsets of the samples. Eighteen body sub-sites are represented. More information about sampling and sequencing can be found on the Microbiome Analysis page.

Processing began with deconvolution and trimming of 16S sequences downloaded from SRA study SRP002395. The trimmed data set was then fed into a pipeline that ran the following analysis steps:

  • A. 16S reference alignment via the NAST-iEr alignment tool
  • B. Chimera identification via ChimeraSlayer
  • C. Aberrant sequence identification via WigeoN
  • D. Taxonomic binning using the RDP classifier

The tools used in steps A through C are all available as components of the Broad Institute's Microbiome Utilities Portal.
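
As a rough illustration of what the deconvolution and trimming stage does before steps A through D run, the following minimal Python sketch assigns each read to a sample by its leading barcode and then strips the barcode and primer. It is not the DACC implementation: the barcode map, the primer sequence, the minimum length, and the function name are all hypothetical placeholders, and it assumes exact matching with the barcode immediately followed by the primer at the start of the read.

```python
from collections import defaultdict

# All values below are hypothetical placeholders, not taken from the HMP SOP.
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}  # barcode -> sample
PRIMER = "GATTACA"                                   # placeholder primer
MIN_LENGTH = 5                                       # minimum trimmed length


def deconvolve_and_trim(reads, barcodes=BARCODES, primer=PRIMER,
                        min_length=MIN_LENGTH):
    """Group reads by leading barcode, then strip barcode and primer.

    Reads with no exact barcode+primer prefix, or that are shorter than
    min_length after trimming, are dropped.
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            prefix = barcode + primer
            if read.startswith(prefix):
                trimmed = read[len(prefix):]
                if len(trimmed) >= min_length:
                    by_sample[sample].append(trimmed)
                break  # a read belongs to at most one sample
    return dict(by_sample)


if __name__ == "__main__":
    example_reads = [
        "ACGT" + "GATTACA" + "GGGGGAAAAA",  # kept, assigned to sample_A
        "TGCA" + "GATTACA" + "TTT",         # dropped: too short after trimming
        "NNNNGATTACAGGGGGAAAAA",            # dropped: unknown barcode
    ]
    print(deconvolve_and_trim(example_reads))
```

A production pipeline would typically also tolerate sequencing errors in the barcode and apply quality-based trimming; those are omitted here to keep the sketch short.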

See the SOP below for more detail. The same SOP has also been applied to the other two phases of the Healthy Cohort study: Production Phase 2 (PP2), SRP002860, and the Pre-Production Pilot (PPS), SRP002012. Those data will be made available via this page soon.


SRA runs containing a total of about 10,000 reads could not be successfully converted to SFF by the sffdump utility in the NCBI SRA SDK and have been excluded from this initial release.

Ongoing HMP 16S analyses are being performed on a dataset containing reads from both SRA Study IDs SRP002395 (Human Microbiome Project 16S rRNA 454 Clinical Production Phase I) and SRP002012 (Human Microbiome Project 454 Clinical Production Pilot, PPS). The dataset released on this page currently contains reads from SRP002395 only. We are in the process of readying SRP002012 reads for release on this site.

Protocols and Tools