HMGOI

To profile genes of interest, the HMP Genes of Interest working group compiled sequence databases for transporters, antibiotic resistance genes, virulence factors, carbohydrate-active enzymes and other functional categories (see HMFUNC). Reads were searched against these databases with BlastX, and the results were processed using the HUMAnN pipeline, developed by the HMP Metabolic Reconstruction Working Group. See the SOP or download page below for more information on using HUMAnN.

HUMAnN was run using Illumina reads from 743 samples as input. For each sample, we provide an abundance file indicating the abundance of each gene (i.e., how many copies are present in the community). We also provide, for each functional category, a concatenated summary file containing the output for all samples.

In the individual Sample Data files below, the gene ID (GID) format for each database is as follows:

  • Antibiotic Resistance Genes - various formats, e.g. AAA23410, NP_286204, P0AE07, YP_001461648, ZP_04651080
  • KEGG EC/KO - begins with organism name, e.g. Parabacteroides_distasonis_ATCC_8503_6|640764945, Staphylococcus_aureus_RF122_21|637817257
  • MetaCyc/ENZYME - sp|XXXXXX|XXXXX_XXXXX, tr|XXXXXX|XXXXX_XXXXX, O#####, P#####, or Q#####, e.g. sp|Q9XTI0|3HIDH_CAEE, tr|Q9R9I4|Q9R9I4_BACSU, O25739, P95904, Q59456
  • Proteases - MER######, e.g. MER140051
  • Transporters - gnl|TC-DB|xxxxxxx, e.g. gnl|TC-DB|A0CS82
  • Virulence Factors - VFG####, e.g. VFG0676
  • Carbohydrate-active enzymes (CAZY) - modo#######, e.g. modo137431, modo21899
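
The GID formats listed above can be told apart with a few regular expressions. The following is a minimal sketch, not part of HUMAnN or any HMP tool: the function name classify_gid, the pattern table and the fallback label are hypothetical, and the patterns are inferred only from the format descriptions and examples in the list. Because Antibiotic Resistance Gene IDs come in various formats, they are not matched explicitly and fall through to the fallback label. The file a GID came from remains the more reliable indicator of its database.

```python
import re

# Patterns inferred from the GID format list above (an assumption, not an
# official specification). Order matters: the first matching pattern wins.
GID_PATTERNS = [
    # sp|ACC|NAME, tr|ACC|NAME, or O/P/Q followed by five digits
    ("MetaCyc/ENZYME", re.compile(r"^(?:(?:sp|tr)\|[A-Z0-9]+\|\w+|[OPQ]\d{5})$")),
    ("Proteases", re.compile(r"^MER\d+$")),
    ("Transporters", re.compile(r"^gnl\|TC-DB\|\S+$")),
    ("Virulence Factors", re.compile(r"^VFG\d+$")),
    ("CAZY", re.compile(r"^modo\d+$")),
    # organism name, a single pipe, then a numeric identifier
    ("KEGG EC/KO", re.compile(r"^[A-Za-z][\w.-]*\|\d+$")),
]

# Antibiotic Resistance Gene IDs have no single format, so anything that
# matches none of the patterns above gets this hypothetical label.
FALLBACK = "unrecognized (possibly Antibiotic Resistance Genes)"


def classify_gid(gid):
    """Return the database name whose GID pattern matches, else FALLBACK."""
    gid = gid.strip()
    for name, pattern in GID_PATTERNS:
        if pattern.match(gid):
            return name
    return FALLBACK


if __name__ == "__main__":
    for example in ["MER140051", "gnl|TC-DB|A0CS82", "VFG0676", "modo21899",
                    "sp|Q9XTI0|3HIDH_CAEE", "O25739", "NP_286204",
                    "Staphylococcus_aureus_RF122_21|637817257"]:
        print(example, "->", classify_gid(example))
```

Note that an ID such as P0AE07, listed above as an antibiotic resistance example, resembles the O/P/Q accession style but contains letters after the first character, so this sketch leaves it in the fallback group rather than assigning it to MetaCyc/ENZYME.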