TY - JOUR
T1 - Logistic PCA explains differences between genome-scale metabolic models in terms of metabolic pathways
AU - Zehetner, Leopold
AU - Széliová, Diana
AU - Kraus, Barbara
AU - Hernandez Bort, Juan A.
AU - Zanghellini, Jürgen
N1 - Publisher Copyright: © 2024 Zehetner et al. Accession Number: WOS:001253501400003. PubMed ID: 38913731.
PY - 2024/6
Y1 - 2024/6
N2 - Genome-scale metabolic models (GSMMs) offer a holistic view of biochemical reaction networks, enabling in-depth analyses of metabolism across species and tissues in multiple conditions. However, comparing GSMMs against each other poses challenges, as current dimensionality reduction algorithms or clustering methods lack mechanistic interpretability and often rely on subjective assumptions. Here, we propose a new approach utilizing logistic principal component analysis (LPCA) that efficiently clusters GSMMs while singling out mechanistic differences in terms of reactions and pathways that drive the categorization. We applied LPCA to multiple diverse datasets, including GSMMs of 222 Escherichia strains, 343 budding yeasts (Saccharomycotina), 80 human tissues, and 2943 Firmicutes strains. Our findings demonstrate LPCA’s effectiveness in preserving microbial phylogenetic relationships and discerning human tissue-specific metabolic profiles, exhibiting comparable performance to traditional methods like t-distributed stochastic neighbor embedding (t-SNE) and Jaccard coefficients. Moreover, the subsystems and associated reactions identified by LPCA align with existing knowledge, underscoring its reliability in dissecting GSMMs and uncovering the underlying drivers of separation.
AB - Genome-scale metabolic models (GSMMs) offer a holistic view of biochemical reaction networks, enabling in-depth analyses of metabolism across species and tissues in multiple conditions. However, comparing GSMMs against each other poses challenges, as current dimensionality reduction algorithms or clustering methods lack mechanistic interpretability and often rely on subjective assumptions. Here, we propose a new approach utilizing logistic principal component analysis (LPCA) that efficiently clusters GSMMs while singling out mechanistic differences in terms of reactions and pathways that drive the categorization. We applied LPCA to multiple diverse datasets, including GSMMs of 222 Escherichia strains, 343 budding yeasts (Saccharomycotina), 80 human tissues, and 2943 Firmicutes strains. Our findings demonstrate LPCA’s effectiveness in preserving microbial phylogenetic relationships and discerning human tissue-specific metabolic profiles, exhibiting comparable performance to traditional methods like t-distributed stochastic neighbor embedding (t-SNE) and Jaccard coefficients. Moreover, the subsystems and associated reactions identified by LPCA align with existing knowledge, underscoring its reliability in dissecting GSMMs and uncovering the underlying drivers of separation.
UR - http://www.scopus.com/inward/record.url?scp=85196781835&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1012236
DO - 10.1371/journal.pcbi.1012236
M3 - Article
AN - SCOPUS:85196781835
SN - 1553-734X
VL - 20
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 6
M1 - e1012236
ER -