Tools & Protocols | By Research Area

Software and online resources used in, or developed as part of, the HMP, as well as protocols describing new advances and technologies, can be found here as they become available.

If you have a protocol or software package that you would like to post on this site, or would like more information on the currently available content, please use the feedback form located in the upper-right corner of the site.


Microbial Reference Genomes

Downloadable Tools
Core Gene Evaluation Script
Screens for core gene sets as an indicator of the completeness of draft genomes. This download includes a Perl script and the required archaeal and bacterial core gene FASTA and cluster files.
Online Resources
IMG System
A community resource for comparative analysis and annotation of publicly available genomes in a uniquely integrated context
Pathogen Portal
A set of web-based resources provided by the Bioinformatics Resource Centers (BRCs), focusing on organisms considered potential agents of biowarfare or bioterrorism or causing emerging or re-emerging diseases
RAST Annotation Server
A fully-automated service for annotating bacterial and archaeal genomes, leveraging data and procedures established within the SEED framework to provide high quality gene calling and functional annotation
Protocols
Common Gene Annotation Process
Reference Genomes Database
HMP single cell MDA 16S rRNA Sanger sequencing SOP
Strain selection guidelines
Guidelines for Reference Genome strain selection
BEI contamination protocol

HMP Sequencing Center-specific Annotation Protocols

The initial set of 178 bacterial Reference Genomes described in the 2010 publication "A Catalog of Reference Genomes from the Human Microbiome" was annotated using individual sequencing center methodologies:

Consensus Annotation Protocols

Subsequent Reference Genomes have been annotated using a consensus protocol for gene calling & functional annotation:
Provisional Reference Genome Assembly Metrics
A set of quality control metrics run on every HMP Reference Genome to ensure accuracy, completeness and continuity of draft and improved assemblies
Bacterial Core Gene Evaluation
Protocol describing use of the Core Gene Evaluation Script to assess completeness of bacterial draft assemblies
Archaeal Core Gene Evaluation
Protocol describing use of the Core Gene Evaluation Script to assess completeness of archaeal draft assemblies
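
The idea behind core gene screening, using the presence of an expected core gene set as an indicator of draft genome completeness, can be illustrated with a minimal Python sketch. This is not the Core Gene Evaluation Script itself (which is distributed as Perl), and it assumes the search step has already been done: the function name, the cluster IDs and the input format below are all hypothetical and chosen only for illustration.

```python
def core_gene_completeness(core_clusters, clusters_found):
    """Return (fraction_found, missing_clusters) for one draft genome.

    core_clusters:  cluster IDs expected in a complete genome (assumed input).
    clusters_found: cluster IDs with at least one hit in the draft (assumed input).
    """
    core = set(core_clusters)
    if not core:
        raise ValueError("core gene set is empty")
    found = core & set(clusters_found)   # hits outside the core set are ignored
    missing = sorted(core - found)
    return len(found) / len(core), missing


# Hypothetical cluster IDs, for illustration only.
core = ["clusterA", "clusterB", "clusterC", "clusterD"]
hits = ["clusterA", "clusterC", "clusterD", "clusterZ"]
fraction, missing = core_gene_completeness(core, hits)
print(f"{fraction:.0%} of core clusters found; missing: {missing}")
```

A low fraction would suggest an incomplete assembly, although what counts as acceptable is a decision for the protocols listed above, not something this sketch defines.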

Sampling, Sequencing, & Analyses of 16S rRNA

Downloadable Tools
DNAclust
DNAclust is a fast clustering algorithm specifically designed for high-stringency clustering of DNA sequences, e.g. for 16S rRNA analyses or removal of duplicates/near duplicates in high-throughput shotgun datasets.
GINKGO
A GUI software package designed for non-statisticians to perform multivariate analysis
InVUE
A toolkit for rapid development of custom software packages for visualization and analysis of large datasets
Metastats
Metastats is a statistical package for comparing metagenomic datasets. Metastats was specifically designed for comparing clinical data comprising two treatment populations (e.g. sick vs. healthy), each comprising multiple samples; however, the software will also work with a small number of samples. Metastats identifies features of the samples that "explain" the difference between the treatment populations. The features can be OTUs (e.g. inferred from 16S data), taxonomic groups, or other groupings (genes, functional groups, etc.) for which count data are available. Metastats primarily relies on a non-parametric t-test and reverts to Fisher's exact test for sparse features. Additional tests (presence/absence, odds ratios, etc.) are currently being implemented. Metastats is available as a web service, as standalone R and C code, and as part of the Mothur package.
MicrobiomeUtilities
A set of software utilities for processing and analyzing 16S rRNA genes including generating NAST alignments, chimera checking, and assembling paired 16S rRNA reads according to reference sequence homology
Mothur
A platform-independent software package for describing and comparing microbial communities. Mothur incorporates the functionality of a number of computational tools, calculators & visualization tools into a single program
Qiime
'Quantitative Insights Into Microbial Ecology'. Qiime allows a range of community analyses suitable for microbiome data using traditional and high-throughput sequencing methods
R-package: Hypothesis Testing and Power Calculations for Comparing Metagenomic Samples from HMP
This R-package provides several functions to perform formal hypothesis testing on the species abundance distribution of human microbiome data, and to calculate power and sample size requirements for human microbiome experiments.
R-package: Statistical Object Oriented Data Analysis of RDP-based Taxonomic trees from Human Microbiome Data: Modeling, Visualization, and Two-Group Comparison
This R-package introduces Object Oriented Data Analysis (OODA) methods to analyze Human Microbiome taxonomic trees directly, providing tools to model, compare, and visualize populations of taxonomic tree objects.
Simrank
A rapid and sensitive general-purpose k-mer search tool
speciateIT
A package for speciation of 16S sequences
Unifrac
A suite of tools for the comparison of microbial communities using phylogenetic information. It takes as input a single phylogenetic tree that contains sequences derived from at least two different environmental samples and a file describing which sequences came from which sample
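
The Metastats entry above describes a two-group comparison that relies on a non-parametric t-test and falls back to Fisher's exact test for sparse features. The following Python sketch shows one common way such a scheme can be put together, using a label-permutation t-test and a hypergeometric Fisher's exact test. It is not the Metastats code: the function names, the number of permutations and the `sparse_threshold` cut-off are assumptions made for illustration, and each group is assumed to contain at least two samples.

```python
import math
import random


def t_statistic(a, b):
    """Welch two-sample t statistic for two lists of numbers."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    if se == 0:  # no within-group variation
        return 0.0 if mean_a == mean_b else math.copysign(math.inf, mean_a - mean_b)
    return (mean_a - mean_b) / se


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided p-value obtained by shuffling group labels n_perm times."""
    rng = random.Random(seed)
    observed = abs(t_statistic(a, b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(t_statistic(pooled[:len(a)], pooled[len(a):])) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)  # add-one keeps the estimate above zero


def fisher_exact_p_value(count_a, total_a, count_b, total_b):
    """Two-sided Fisher's exact test on pooled counts.

    count_x: counts of the feature in group x; total_x: all counts in group x.
    """
    k = count_a + count_b  # feature counts across both groups
    n = total_a + total_b  # all counts across both groups

    def prob(x):  # hypergeometric probability that x feature counts fall in group a
        return math.comb(total_a, x) * math.comb(total_b, k - x) / math.comb(n, k)

    p_obs = prob(count_a)
    lo, hi = max(0, k - total_b), min(k, total_a)
    tail = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, tail)


def compare_feature(a, b, total_a, total_b, sparse_threshold=10):
    """Pick the test by sparsity; sparse_threshold is a hypothetical cut-off."""
    if sum(a) + sum(b) < sparse_threshold:
        return "fisher", fisher_exact_p_value(sum(a), total_a, sum(b), total_b)
    return "permutation", permutation_p_value(a, b)


# Made-up counts of one feature in five samples per group.
healthy = [12, 15, 11, 14, 13]
sick = [30, 28, 35, 31, 29]
method, p = compare_feature(healthy, sick, total_a=500, total_b=500)
print(method, p)
```

In practice the counts would usually be normalized per sample and the resulting p-values corrected for multiple testing across features; both steps are left out here to keep the sketch short.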
Online Resources
Fast-Unifrac
Provides a suite of tools for the comparison of microbial communities using phylogenetic information
Greengenes
A 16S rRNA gene database and workbench compatible with ARB
RDP
Provides ribosome related data and services to the scientific community, including online data analysis and aligned and annotated Bacterial and Archaeal small-subunit 16S rRNA sequences
SitePainter
SitePainter allows users to visualize the different HMP body sites based on gradients of colors to represent available datasets
Protocols
Manual of Procedures (MOP)
A reference document for current National Institutes of Health (NIH) policies and procedures as they apply to the Human Microbiome Project (HMP) Core Microbiome Sampling study
Core Microbiome Sampling Protocol
16S Data Flow for HMP Sequencing Centers
Guidelines for the HMP Sequencing Centers for submitting 16S rRNA gene data and metadata to the HMP DACC
Provisional 16S 454 Protocol

Sampling, Sequencing & Analysis of Whole Metagenomic Sequence

Downloadable Tools
BMTagger
NCBI's Best Match Tagger for removing human reads from metagenomics datasets. All HMP metagenomic sequence submitted to NCBI's Sequence Read Archive is being human filtered using BMTagger.
DeconSeq
Automatically detects and efficiently removes any type of sequence contamination from metagenomic datasets, including human or other host sequences. The tool uses a modified version of the BWA-SW aligner and can be applied to longer-read datasets (150+bp read length). DeconSeq is available as both standalone and web-based versions.
DNAclust
DNAclust is a fast clustering algorithm specifically designed for high-stringency clustering of DNA sequences, e.g. for 16S rRNA analyses or removal of duplicates/near duplicates in high-throughput shotgun datasets.
FragGeneScan
A short read gene finder
GINKGO
A GUI software package designed for non-statisticians to perform multivariate analysis
InVUE
A toolkit for rapid development of custom software packages for visualization and analysis of large datasets
MetAMOS
MetAMOS is a pipeline for metagenomic assembly. It includes a collection of utilities for performing the assembly and for analyzing assembly output.
Metaphyler
Metaphyler is a software tool for inferring the taxonomic composition of a microbial community from whole-metagenome (WGS) sequencing data. Metaphyler relies on alignments to a curated database of housekeeping genes.
Metapath
Metapath is a statistical package for comparing metagenomic data-sets at the pathway level (using KEGG pathway information). Metapath relies on a graph-theoretic definition of statistical significance in order to identify pathway motifs that differ between samples from two treatment populations.
METAREP
An open source tool to help scientists to view, query, browse, and compare metagenomics annotation data derived from ORFs called on metagenomics reads or assemblies (also available as an Online Resource)
Metastats
Metastats is a statistical package for comparing metagenomic datasets. Metastats was specifically designed for comparing clinical data comprising two treatment populations (e.g. sick vs. healthy), each comprising multiple samples; however, the software will also work with a small number of samples. Metastats identifies features of the samples that "explain" the difference between the treatment populations. The features can be OTUs (e.g. inferred from 16S data), taxonomic groups, or other groupings (genes, functional groups, etc.) for which count data are available. Metastats primarily relies on a non-parametric t-test and reverts to Fisher's exact test for sparse features. Additional tests (presence/absence, odds ratios, etc.) are currently being implemented. Metastats is available as a web service, as standalone R and C code, and as part of the Mothur package.
PRINSEQ
A sequence processing tool that can be used to filter, reformat and trim genomic and metagenomic sequence data. It generates summary statistics of the input in graphical and tabular formats that can be used for quality control steps. PRINSEQ is available as both standalone and web-based versions.
Simrank
A rapid and sensitive general-purpose k-mer search tool
TagCleaner
Automatically detects and efficiently removes tag sequences (e.g. WTA or MID tags) from metagenomic datasets. TagCleaner is available as both standalone and web-based versions.
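
Several tools in this list work on k-mers; Simrank, for example, is described as a general-purpose k-mer search tool. As a rough illustration of what a k-mer search involves, the Python sketch below ranks reference sequences by the fraction of a query's k-mers that they share. It does not reproduce Simrank's algorithm or scoring: the function names, the default `k=7` and the toy sequences are assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(query, target, k=7):
    """Fraction of the query's distinct k-mers that also occur in the target."""
    query_kmers = kmers(query, k)
    if not query_kmers:  # query shorter than k
        return 0.0
    return len(query_kmers & kmers(target, k)) / len(query_kmers)


def rank_targets(query, targets, k=7):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, shared_kmer_fraction(query, seq, k)) for name, seq in targets.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy sequences, for illustration only.
query = "ACGTTGCAAGGCT"
targets = {
    "ref1": "ACGTTGCAAGGCTTA",   # contains the whole query
    "ref2": "GGGGCCCCAAAATTTT",  # shares no 7-mer with the query
}
ranked = rank_targets(query, targets)
for name, score in ranked:
    print(name, score)
```

A production tool would index the reference k-mers once instead of rebuilding each set per query, which is what makes this style of search fast on large databases.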
Online Resources
IMG/M
Provides tools for analyzing the functional capability of microbial communities based on their metagenome sequence, in the context of reference isolate genomes included from the Integrated Microbial Genomes (IMG) system
METAREP
A suite of web-based tools to help scientists to view, query, browse, and compare metagenomics annotation data derived from ORFs called on metagenomics reads or assemblies (also available as a standalone tool)
MG-RAST
A fully-automated service for annotating metagenome samples, providing annotation of sequence fragments, phylogenetic classification, metabolic reconstructions and comparison tools
Protocols
Manual of Procedures (MOP)
A reference document for current National Institutes of Health (NIH) policies and procedures as they apply to the Human Microbiome Project (HMP) Core Microbiome Sampling study
Core Microbiome Sampling Protocol

Other Analysis

Protocols
RNA-Seq Enrichment Protocol