TY - JOUR
T1 - DiSCo: a sequence-based type-specific predictor of Dsr-dependent dissimilatory sulphur metabolism in microbial data
AU - Neukirchen, Sinje
AU - Sousa, Filipa L
N1 - Funding Information:
This project has received funding from the Wiener Wissenschafts-, Forschungs- und Technologiefonds (grant agreement VRG15-007) to FLS.
Publisher Copyright:
© 2021 The Authors.
PY - 2021/7
Y1 - 2021/7
N2 - Current methods in comparative genomic analyses for metabolic potential prediction of proteins involved in, or associated with, the Dsr (dissimilatory sulphite reductase)-dependent dissimilatory sulphur metabolism are both time-intensive and computationally challenging, especially when considering metagenomic data. We developed DiSCo, a Dsr-dependent dissimilatory sulphur metabolism classification tool, which automatically identifies and classifies the protein type from sequence data. It takes user-supplied protein sequences and lists the identified proteins and their classification in terms of protein family and predicted type. It can also extract the sequence data from user input to serve as a basis for additional downstream analyses. DiSCo provides fast metabolic functional prediction of proteins involved in Dsr-dependent dissimilatory sulphur metabolism with high levels of accuracy. We ran DiSCo against a dataset composed of over 190 thousand (meta)genomic records and efficiently mapped Dsr-dependent dissimilatory sulphur proteins in 1798 lineages across both prokaryotic domains. This allowed the identification of new micro-organisms belonging to Thaumarchaeota and Spirochaetes lineages with the metabolic potential to use the Dsr pathway for energy conservation. DiSCo is implemented in Perl 5 and freely available under the GNU GPLv3 at https://github.com/Genome-Evolution-and-Ecology-Group-GEEG/DiSCo.
KW - OXIDATION
KW - GEN-NOV
KW - PROTEIN
KW - dissimilatory sulphate reduction
KW - ARCHAEOGLOBUS-FULGIDUS
KW - BACTERIUM ALLOCHROMATIUM-VINOSUM
KW - CRYSTAL-STRUCTURE
KW - comparative genomics
KW - genotype-phenotype association
KW - SULFATE-REDUCTION
KW - dissimilatory sulphur oxidation
KW - microbial physiology
KW - PURIFICATION
KW - SULFITE REDUCTASE
KW - DSRMKJOP COMPLEX
UR - http://www.scopus.com/inward/record.url?scp=85111331210&partnerID=8YFLogxK
U2 - 10.1099/mgen.0.000603
DO - 10.1099/mgen.0.000603
M3 - Article
C2 - 34241589
SN - 2057-5858
VL - 7
JO - Microbial Genomics
JF - Microbial Genomics
IS - 7
M1 - 000603
ER -