ConStrains identifies microbial strains in metagenomic datasets.

Nat Biotechnol
Abstract

An important fraction of microbial diversity is harbored in strain individuality, so identification of conspecific bacterial strains is imperative for improved understanding of microbial community functions. Limitations in bioinformatics and sequencing technologies have to date precluded strain identification owing to difficulties in phasing short reads to faithfully recover the original strain-level genotypes, which have highly similar sequences. We present ConStrains, an open-source algorithm that identifies conspecific strains from metagenomic sequence data and reconstructs the phylogeny of these strains in microbial communities. The algorithm uses single-nucleotide polymorphism (SNP) patterns in a set of universal genes to infer within-species structures that represent strains. Applying ConStrains to simulated and host-derived datasets provides insights into microbial community dynamics.

Year of Publication
2015
Journal
Nat Biotechnol
Volume
33
Issue
10
Pages
1045-52
Date Published
2015 Oct
ISSN
1546-1696
DOI
10.1038/nbt.3319
PubMed ID
26344404
PubMed Central ID
PMC4676274
Grant list
U54 DK102557 / DK / NIDDK NIH HHS / United States
P01 DK078669 / DK / NIDDK NIH HHS / United States
P30 DK043351 / DK / NIDDK NIH HHS / United States
R01 DK092405 / DK / NIDDK NIH HHS / United States
R01 HG004872 / HG / NHGRI NIH HHS / United States
Howard Hughes Medical Institute / United States