Tools & Protocols | By Type

Software and online resources used in or developed as part of the HMP, as well as protocols describing new advances and technologies, can be found here as they become available.

If you have a protocol or software package that you would like to post on this site, or would like more information on the currently available content, please use the feedback form located in the upper-right corner of the site.

Please click here to view list sorted by Research Area.

Downloadable Tools

Core Gene Evaluation Script
Screening for core gene sets as an indicator of completeness of draft genomes. This download includes a Perl script and the required archaeal and bacterial core gene FASTA and cluster files.
DNAclust
DNAclust is a fast clustering algorithm specifically designed for high-stringency clustering of DNA sequences, e.g. for 16S rRNA analyses or removal of duplicates/near duplicates in high-throughput shotgun datasets.
GINKO
A GUI software package designed for non-statisticians to perform multivariate analysis
HUMAnN
The HMP Unified Metabolic Analysis Network (HUMAnN) is a pipeline for efficiently and accurately determining the presence/absence and abundance of microbial pathways in a community from metagenomic data (WGS). The pipeline converts sequence reads into coverage and abundance tables summarizing the gene families and pathways in one or more microbial communities.
InVUE
A toolkit for rapid development of custom software packages for visualization and analysis of large datasets
LEfSe
LDA Effect Size is an algorithm for high-dimensional biomarker discovery and explanation that identifies metagenomic features (genes, pathways, or taxa) characterizing the differences between two or more biological conditions. It can be applied to taxonomic or functional abundance tables derived from metagenomic (WGS) data or 16S OTU/phylotype data.
Metamos
MetAmos is a pipeline for metagenomic assembly. It includes a collection of utilities for performing the assembly and for analyzing assembly output.
Metapath
Metapath is a statistical package for comparing metagenomic data-sets at the pathway level (using KEGG pathway information). Metapath relies on a graph-theoretic definition of statistical significance in order to identify pathway motifs that differ between samples from two treatment populations.
MetaPhlAn
A computational tool for profiling the composition of microbial communities from metagenomic data (WGS). MetaPhlAn relies on unique clade-specific marker genes identified from reference genomes, allowing very fast computational times, unambiguous taxonomic assignments, and species-level resolution.
Metaphyler
Metaphyler is a software tool for inferring the taxonomic composition of a microbial community from whole-metagenome (WGS) sequencing data. Metaphyler relies on alignments to a curated database of housekeeping genes.
Metastats
Metastats is a statistical package for comparing metagenomic data-sets. Metastats was specifically designed for comparing clinical data comprising two treatment populations (e.g. sick vs. healthy), each comprising multiple samples; however, the software will also work for a small number of samples. Metastats identifies features of the samples that "explain" the difference between the treatment populations. The features can be OTUs (e.g. inferred from 16S data), taxonomic groups, or other groupings (genes, functional groups, etc.) for which count data are available. Metastats primarily relies on a non-parametric t-test and reverts to Fisher's exact test for sparse features. Additional tests (presence/absence, odds ratios, etc.) are currently being implemented. Metastats is available as a web service, as standalone R and C code, and as part of the Mothur package.
MicrobiomeUtilities
A set of software utilities for processing and analyzing 16S rRNA genes, encompassing sequence alignment, chimera detection, OTU binning & sequence assembly
Mothur
A platform-independent software package for describing and comparing microbial communities; Mothur incorporates the functionality of a number of computational tools, calculators & visualization tools into a single program
Qiime
A pipeline for performing microbial community analysis that integrates many standard third party tools and addresses the problem of taking sequencing data from raw sequences to interpretation and database deposition
R-package: Hypothesis Testing and Power Calculations for Comparing Metagenomic Samples from HMP
This R-package provides several functions to perform formal hypothesis testing on the species abundance distribution of human microbiome data, and to calculate power and sample size requirements for human microbiome experiments.
R-package: Statistical Object Oriented Data Analysis of RDP-based Taxonomic trees from Human Microbiome Data: Modeling, Visualization, and Two-Group Comparison
This R-package introduces Object Oriented Data Analysis (OODA) methods to analyze Human Microbiome taxonomic trees directly, providing tools to model, compare, and visualize populations of taxonomic tree objects.
Simrank
A rapid and sensitive general-purpose k-mer search tool
speciateIT
A package for speciation of 16S sequences
Unifrac
A software package designed to differentiate between samples by measuring the phylogenetic distance of taxa, using tree topology and branch lengths to determine whether populations are significantly different and which factors might be important for those differences
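
As a rough illustration of the idea behind the Unifrac entry above (comparing samples by the tree branches their taxa occupy), the following Python sketch computes the commonly described unweighted form of the distance: the fraction of covered branch length that is unique to one of two samples. It is a minimal sketch, not the Unifrac package's own code. The toy tree, the node names, and the function names `covered_branches` and `unweighted_unifrac` are assumptions made for this example, and the significance testing mentioned in the description is not covered.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy tree: child node -> (parent node, branch length).
# The root has no entry of its own. All names here are illustrative only.
TREE: Dict[str, Tuple[str, float]] = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 0.5),
    "C": ("n2", 1.0),
    "D": ("n2", 1.0),
    "n2": ("root", 0.5),
}


def covered_branches(tree: Dict[str, Tuple[str, float]], taxa: Set[str]) -> Set[str]:
    """Return the branches (named by their child node) on the paths from taxa to the root."""
    covered: Set[str] = set()
    for node in taxa:
        # Walk toward the root, stopping early once a branch was already visited.
        while node in tree and node not in covered:
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(tree: Dict[str, Tuple[str, float]],
                       sample_a: Set[str], sample_b: Set[str]) -> float:
    """Fraction of the covered branch length that belongs to only one sample."""
    a = covered_branches(tree, sample_a)
    b = covered_branches(tree, sample_b)
    union = a | b
    if not union:
        return 0.0
    unique = a ^ b  # branches leading to taxa from exactly one sample
    total = sum(tree[n][1] for n in union)
    return sum(tree[n][1] for n in unique) / total


if __name__ == "__main__":
    print(unweighted_unifrac(TREE, {"A", "B"}, {"C", "D"}))
    print(unweighted_unifrac(TREE, {"A", "C"}, {"B", "C"}))
```

Two samples that share no branches get a distance of 1, and identical samples get 0. The parent-map representation is used here only to keep the example short.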

Online Resources

Fast-Unifrac
Provides a suite of tools for the comparison of microbial communities using phylogenetic information
Greengenes
A 16S rRNA gene database and workbench compatible with ARB
IMG System
A community resource for comparative analysis and annotation of publicly available genomes in a uniquely integrated context
IMG/M
Provides tools for analyzing the functional capability of microbial communities based on their metagenome sequence, in the context of reference isolate genomes included from the Integrated Microbial Genomes (IMG) system
MG-RAST
A fully-automated service for annotating metagenome samples, providing annotation of sequence fragments, phylogenetic classification, metabolic reconstructions and comparison tools
Pathogen Portal
A set of web-based resources provided by the Bioinformatics Resource Centers (BRCs), focusing on organisms considered potential agents of biowarfare or bioterrorism or causing emerging or re-emerging diseases
RAST Annotation Server
A fully-automated service for annotating bacterial and archaeal genomes, leveraging data and procedures established within the SEED framework to provide high quality gene calling and functional annotation
RDP
Provides ribosome-related data and services to the scientific community, including online data analysis and aligned and annotated bacterial and archaeal small-subunit 16S rRNA sequences
SitePainter
SitePainter allows users to visualize the different HMP body sites based on gradients of colors to represent available datasets

Protocols

Common Gene Annotation Process
Reference Genomes Database
HMP single cell MDA 16S rRNA Sanger sequencing SOP
Human Sequence Removal
SFF and Library Metadata File Generation
16S rRNA mothur Curation Pipeline
QIIME Community Profiling SOP
HMP WGS Read Processing
HMP Whole-Metagenome Assembly
Body Site Assembly
GO Slim Analysis
Functional Database SOP
HUMAnN SOP
HMP Hybrid Assembly
Metagenomics Annotation SOP
RNA-Seq Enrichment Protocol
Strain selection guidelines
Guidelines for Reference Genome strain selection
BEI contamination protocol
HMP Sequencing Center-specific Annotation Protocols
The initial set of 178 bacterial Reference Genomes described in the 2010 publication "A Catalog of Reference Genomes from the Human Microbiome" were annotated using individual sequencing center methodologies:
Consensus Annotation Protocols
Subsequent Reference Genomes have been annotated using a consensus protocol for gene calling & functional annotation:
Provisional Reference Genome Assembly Metrics
A set of quality control metrics run on every HMP Reference Genome to ensure accuracy, completeness and continuity of draft and improved assemblies
Bacterial Core Gene Evaluation Protocol describing use of the Core Gene Evaluation Script to assess completeness of bacterial draft assemblies
Archaeal Core Gene Evaluation Protocol describing use of the Core Gene Evaluation Script to assess completeness of archaeal draft assemblies
Core Microbiome Sampling Protocol
16S Data Flow for HMP Sequencing Centers
Guidelines for the HMP Sequencing Centers for submitting 16S rRNA gene data and metadata to the HMP DACC
Provisional 16S 454 Protocol
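
The Core Gene Evaluation entries above describe screening a draft assembly for core gene sets as an indicator of completeness. A minimal Python sketch of that general idea is shown below, assuming the core gene clusters detected in a draft genome have already been determined by some upstream search step. It is not the logic of the Perl script distributed on this page. The function name `core_gene_completeness` and the cluster identifiers are hypothetical and used only for illustration.

```python
from typing import Iterable, List, Tuple


def core_gene_completeness(core_clusters: Iterable[str],
                           observed_clusters: Iterable[str]) -> Tuple[float, List[str]]:
    """Return (fraction of core clusters found, sorted list of missing clusters).

    core_clusters: identifiers of the expected core gene clusters.
    observed_clusters: identifiers of clusters detected in the draft genome
    (assumed to come from an upstream search step not shown here).
    """
    core = set(core_clusters)
    if not core:
        raise ValueError("core_clusters must not be empty")
    found = core & set(observed_clusters)
    missing = sorted(core - found)
    return len(found) / len(core), missing


if __name__ == "__main__":
    # Hypothetical cluster identifiers, for illustration only.
    core = ["rpoB", "recA", "gyrB", "rpsC"]
    observed = ["rpoB", "recA", "dnaK"]
    fraction, missing = core_gene_completeness(core, observed)
    print(f"completeness: {fraction:.2f}, missing: {missing}")
```

Hits outside the core set (here `dnaK`) are ignored. A low fraction would suggest an incomplete draft assembly, though any threshold for acceptance is a choice this sketch does not make.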