TY - JOUR
T1 - Interpretable machine learning methods for predictions in systems biology from omics data
AU - Sidak, David
AU - Schwarzerová, Jana
AU - Weckwerth, Wolfram
AU - Waldherr, Steffen
N1 - Funding Information:
JS has been supported by grant FEKT-K-21-6878 realised within the project Quality Internal Grants of BUT (KInG BUT), Reg. No. CZ.02.2.69/0.0/0.0/19_073/0016948, which is financed from the OP RDE.
Publisher Copyright:
Copyright © 2022 Sidak, Schwarzerová, Weckwerth and Waldherr.
PY - 2022/10/17
Y1 - 2022/10/17
AB - Machine learning has become a powerful tool for systems biologists, from diagnosing cancer to optimizing kinetic models and predicting the state, growth dynamics, or type of a cell. Potential predictions from complex biological data sets obtained by “omics” experiments seem endless, but are often not the main objective of biological research. Often we want to understand the molecular mechanisms of a disease to develop new therapies, or we need to justify a crucial decision that is derived from a prediction. In order to gain such knowledge from data, machine learning models need to be extended. A recent trend to achieve this is to design “interpretable” models. However, the notions around interpretability are sometimes ambiguous, and a universal recipe for building well-interpretable models is missing. With this work, we want to familiarize systems biologists with the concept of model interpretability in machine learning. We consider data sets, data preparation, machine learning methods, and software tools relevant to omics research in systems biology. Finally, we try to answer the question: “What is interpretability?” We introduce views from the interpretable machine learning community and propose a scheme for categorizing studies on omics data. We then apply these tools to review and categorize recent studies where predictive machine learning models have been constructed from non-sequential omics data.
KW - deep learning
KW - explainable artificial intelligence
KW - interpretable machine learning
KW - metabolomics
KW - multi-omics
KW - proteomics
KW - transcriptomics
UR - http://www.scopus.com/inward/record.url?scp=85141880659&partnerID=8YFLogxK
U2 - 10.3389/fmolb.2022.926623
DO - 10.3389/fmolb.2022.926623
M3 - Review
AN - SCOPUS:85141880659
SN - 2296-889X
VL - 9
JO - Frontiers in Molecular Biosciences
JF - Frontiers in Molecular Biosciences
M1 - 926623
ER -