Tools & Protocols | By Research Area
Software and online resources used in or developed as part of the HMP, as well as protocols describing new advances & technologies, can be found here as they become available.
If you have a protocol or software package that you would like to post on this site, or would like more information on the currently available content, please use the feedback form located in the upper-right corner of the site.
Please click here to view the list sorted by Tools, Online Resources & Protocols.
Microbial Reference Genomes
| Downloadable Tools | |
| Core Gene Evaluation Script Screening for core gene sets as an indicator of completeness of draft genomes. This download includes a Perl script and the required archaeal and bacterial core gene FASTA and cluster files. | |
| Online Resources | |
| IMG System A community resource for comparative analysis and annotation of publicly available genomes in a uniquely integrated context | |
| Pathogen Portal A set of web-based resources provided by the Bioinformatics Resource Centers (BRCs), focusing on organisms considered potential agents of biowarfare or bioterrorism or causing emerging or re-emerging diseases | |
| RAST Annotation Server A fully-automated service for annotating bacterial and archaeal genomes, leveraging data and procedures established within the SEED framework to provide high quality gene calling and functional annotation | |
| Protocols | |
| Common Gene Annotation Process | |
| Reference Genomes Database | |
| HMP single cell MDA 16S rRNA Sanger sequencing SOP | |
| Strain selection guidelines Guidelines for Reference Genome Strain selection | |
| BEI contamination protocol | |
HMP Sequencing Center-specific Annotation Protocols The initial set of 178 Bacterial Reference Genomes described in the 2010 publication, "A Catalog of Reference Genomes from the Human Microbiome", was annotated using individual sequencing center methodologies: |
|
Consensus Annotation Protocols Subsequent Reference Genomes have been annotated using a consensus protocol for gene calling & functional annotation: |
|
| Provisional Reference Genome Assembly Metrics A set of quality control metrics run on every HMP Reference Genome to ensure accuracy, completeness and continuity of draft and improved assemblies | |
| Bacterial Core Gene Evaluation Protocol describing use of the Core Gene Evaluation Script to assess completeness of bacterial draft assemblies | |
| Archaeal Core Gene Evaluation Protocol describing use of the Core Gene Evaluation Script to assess completeness of archaeal draft assemblies | |
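The Core Gene Evaluation entries above describe using core gene sets as an indicator of draft-genome completeness. The following is a minimal Python sketch of that idea only, not the distributed Perl script: it assumes the core gene clusters and the clusters detected in a draft assembly are already available as identifiers, and the function name and example cluster names are hypothetical.

```python
def core_gene_completeness(core_clusters, observed_clusters):
    """Return (fraction of core clusters found, sorted list of missing clusters).

    Both arguments are iterables of cluster identifiers. How clusters are
    detected in a draft assembly (e.g. by sequence search) is outside this
    sketch and is assumed to have happened already.
    """
    core = set(core_clusters)
    if not core:
        raise ValueError("core gene set is empty")
    found = core & set(observed_clusters)
    missing = sorted(core - found)
    return len(found) / len(core), missing


# Hypothetical example: four core clusters, two of them seen in the draft.
fraction, missing = core_gene_completeness(
    ["rpoB", "recA", "gyrB", "dnaK"],
    ["recA", "rpoB", "unrelated_cluster"],
)
```

A low fraction would suggest an incomplete assembly, although the threshold to apply is a judgment the protocols above would define, not this sketch.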
Sampling, Sequencing, & Analyses of 16S rRNA
| Downloadable Tools |
| DNAclust DNAclust is a fast clustering algorithm specifically designed for high-stringency clustering of DNA sequences, e.g. for 16S rRNA analyses or removal of duplicates/near duplicates in high-throughput shotgun datasets. |
| GINKGO A GUI software package designed for non-statisticians to perform multivariate analysis |
| InVUE A toolkit for rapid development of custom software packages for visualization and analysis of large datasets |
| Metastats Metastats is a statistical package for comparing metagenomic data-sets. Metastats was specifically designed for comparing clinical data from two treatment populations (e.g. sick vs. healthy), each comprising multiple samples; however, the software will also work for a small number of samples. Metastats identifies features of the samples that "explain" the difference between the treatment populations. The features can be OTUs (e.g. inferred from 16S data), taxonomic groups, or other groupings (genes, functional groups, etc.) for which count data are available. Metastats primarily relies on a non-parametric t-test and reverts to Fisher's exact test for sparse features. Additional tests (presence/absence, odds ratios, etc.) are currently being implemented. Metastats is available as a web service, as standalone R and C code, and as part of the Mothur package. |
| MicrobiomeUtilities A set of software utilities for processing and analyzing 16S rRNA genes including generating NAST alignments, chimera checking, and assembling paired 16S rRNA reads according to reference sequence homology |
| Mothur A platform-independent software package for describing and comparing microbial communities. Mothur incorporates the functionality of a number of computational tools, calculators & visualization tools into a single program |
| QIIME 'Quantitative Insights Into Microbial Ecology'. QIIME allows a range of community analyses suitable for microbiome data using traditional and high-throughput sequencing methods |
| R-package: Hypothesis Testing and Power Calculations for Comparing Metagenomic Samples from HMP This R-package provides several functions to perform formal hypothesis testing on the species abundance distribution of human microbiome data, and to calculate power and sample size requirements for human microbiome experiments. |
| R-package: Statistical Object Oriented Data Analysis of RDP-based Taxonomic trees from Human Microbiome Data: Modeling, Visualization, and Two-Group Comparison This R-package introduces Object Oriented Data Analysis (OODA) methods to analyze Human Microbiome taxonomic trees directly, providing tools to model, compare, and visualize populations of taxonomic tree objects. |
| Simrank A rapid and sensitive general-purpose k-mer search tool |
| speciateIT A package for speciation of 16S sequences |
| Unifrac A suite of tools for the comparison of microbial communities using phylogenetic information. It takes as input a single phylogenetic tree that contains sequences derived from at least two different environmental samples and a file describing which sequences came from which sample |
| Online Resources |
| Fast-Unifrac Provides a suite of tools for the comparison of microbial communities using phylogenetic information |
| Greengenes A 16S rRNA gene database and workbench compatible with ARB |
| RDP Provides ribosome-related data and services to the scientific community, including online data analysis and aligned and annotated Bacterial and Archaeal small-subunit 16S rRNA sequences |
| SitePainter SitePainter allows users to visualize the different HMP body sites based on gradients of colors to represent available datasets |
| Protocols |
| Manual of Procedures (MOP) A reference document for current National Institutes of Health (NIH) policies and procedures as they apply to the Human Microbiome Project (HMP) Core Microbiome Sampling study |
| Core Microbiome Sampling Protocol |
| 16S Data Flow for HMP Sequencing Centers Guidelines for the HMP Sequencing Centers for submitting 16S rRNA gene data and metadata to the HMP DACC |
| Provisional 16S 454 Protocol |
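The Metastats entry above mentions a non-parametric t-test for comparing one feature across two treatment populations. One common way to build such a test is by permutation; the sketch below illustrates that general approach in Python and should not be read as Metastats' actual implementation. The function names, the use of Welch's statistic, and the example abundances are all assumptions made for illustration, and the Fisher's exact test fallback for sparse features is not shown.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples (each needs at least 2 values)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return 0.0 if se == 0 else (mean(a) - mean(b)) / se


def permutation_p_value(a, b, n_perm=1000, seed=0):
    """Two-sided permutation p-value for a difference between groups a and b.

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose |t| is at least as large as the observed |t|, with the
    usual +1 correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = abs(welch_t(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(welch_t(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical relative abundances of one OTU in two populations.
healthy = [0.30, 0.32, 0.29, 0.31, 0.33]
sick = [0.10, 0.12, 0.09, 0.11, 0.13]
p = permutation_p_value(healthy, sick)
```

In practice one such test would be run per feature, followed by a multiple-testing correction; that step is omitted here.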
Sampling, Sequencing & Analysis of Whole Metagenomic Sequence
| Downloadable Tools |
| BMTagger NCBI's Best Match Tagger for removing human reads from metagenomics datasets. All HMP metagenomic sequence submitted to NCBI's Sequence Read Archive is being human filtered using BMTagger. |
| DeconSeq Automatically detects and efficiently removes any type of sequence contamination from metagenomic datasets, including human or other host sequences. The tool uses a modified version of the BWA-SW aligner and can be applied to longer-read datasets (150+bp read length). DeconSeq is available as both standalone and web-based versions. |
| DNAclust DNAclust is a fast clustering algorithm specifically designed for high-stringency clustering of DNA sequences, e.g. for 16S rRNA analyses or removal of duplicates/near duplicates in high-throughput shotgun datasets. |
| FragGeneScan A short read gene finder |
| GINKGO A GUI software package designed for non-statisticians to perform multivariate analysis |
| InVUE A toolkit for rapid development of custom software packages for visualization and analysis of large datasets |
| Metamos MetAmos is a pipeline for metagenomic assembly. It includes a collection of utilities for performing the assembly and for analyzing assembly output. |
| Metaphyler Metaphyler is a software tool for inferring the taxonomic composition of a microbial community from whole-metagenome (WGS) sequencing data. Metaphyler relies on alignments to a curated database of housekeeping genes. |
| Metapath Metapath is a statistical package for comparing metagenomic data-sets at the pathway level (using KEGG pathway information). Metapath relies on a graph-theoretic definition of statistical significance in order to identify pathway motifs that differ between samples from two treatment populations. |
| METAREP An open source tool to help scientists to view, query, browse, and compare metagenomics annotation data derived from ORFs called on metagenomics reads or assemblies (also available as an Online Resource) |
| Metastats Metastats is a statistical package for comparing metagenomic data-sets. Metastats was specifically designed for comparing clinical data from two treatment populations (e.g. sick vs. healthy), each comprising multiple samples; however, the software will also work for a small number of samples. Metastats identifies features of the samples that "explain" the difference between the treatment populations. The features can be OTUs (e.g. inferred from 16S data), taxonomic groups, or other groupings (genes, functional groups, etc.) for which count data are available. Metastats primarily relies on a non-parametric t-test and reverts to Fisher's exact test for sparse features. Additional tests (presence/absence, odds ratios, etc.) are currently being implemented. Metastats is available as a web service, as standalone R and C code, and as part of the Mothur package. |
| PRINSEQ A sequence processing tool that can be used to filter, reformat and trim genomic and metagenomic sequence data. It generates summary statistics of the input in graphical and tabular formats that can be used for quality control steps. PRINSEQ is available as both standalone and web-based versions. |
| Simrank A rapid and sensitive general-purpose k-mer search tool |
| TagCleaner Automatically detects and efficiently removes tag sequences (e.g. WTA or MID tags) from metagenomic datasets. TagCleaner is available as both standalone and web-based versions. |
| Online Resources |
| IMG/M Provides tools for analyzing the functional capability of microbial communities based on their metagenome sequence, in the context of reference isolate genomes included from the Integrated Microbial Genomes (IMG) system |
| METAREP A suite of web-based tools to help scientists to view, query, browse, and compare metagenomics annotation data derived from ORFs called on metagenomics reads or assemblies (also available as a standalone tool) |
| MG-RAST A fully-automated service for annotating metagenome samples, providing annotation of sequence fragments, phylogenetic classification, metabolic reconstructions and comparison tools |
| Protocols |
| Manual of Procedures (MOP) A reference document for current National Institutes of Health (NIH) policies and procedures as they apply to the Human Microbiome Project (HMP) Core Microbiome Sampling study |
| Core Microbiome Sampling Protocol |
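Several tools listed above, such as Simrank, are described as k-mer search tools. As a rough illustration of what k-mer based searching means, the Python sketch below ranks reference sequences by the fraction of a query's k-mers they share. It is a generic example under stated assumptions and does not reproduce Simrank's algorithm or scoring; the function names and sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers (length-k substrings) in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_shared_kmers(query, references, k=7):
    """Rank references by the fraction of query k-mers each one contains.

    references maps a name to a sequence. Returns a list of (score, name)
    pairs, best score first, with ties broken by name.
    """
    q = kmers(query, k)
    if not q:
        raise ValueError("query is shorter than k")
    scores = [
        (len(q & kmers(seq, k)) / len(q), name)
        for name, seq in references.items()
    ]
    return sorted(scores, key=lambda item: (-item[0], item[1]))


# Hypothetical toy references and query.
refs = {"ref_a": "ACGTACGTAC", "ref_b": "TTTTTTTTTT"}
ranking = rank_by_shared_kmers("ACGTACG", refs, k=4)
```

Real tools index the reference k-mers once instead of recomputing them per query, which is what makes this kind of search fast on large databases.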
Other Analysis
| Protocols |
| RNA-Seq Enrichment Protocol |
