Using Jupyter Notebooks for re-training machine learning models

Publications: Contribution to journal › Article › Peer Reviewed

Abstract

Machine learning (ML) models require an extensive, user-driven selection of molecular descriptors to learn from chemical structures and predict actives and inactives with high reliability. In addition, privacy concerns often restrict access to sufficient data, leading to models that cover only a narrow chemical space. Therefore, we propose a framework of re-trainable models that can be transferred from one local instance to another and that require a less extensive descriptor selection. The models are shared via a Jupyter Notebook, allowing a broader chemical space to be evaluated and incorporated while keeping most of the tunable parameters pre-defined. This enables the models to be updated in a decentralized, facile, and fast manner. The method was evaluated with six transporter datasets (BCRP, BSEP, OATP1B1, OATP1B3, MRP3, P-gp), which revealed the general applicability of this approach.
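The re-training idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learner (scikit-learn's SGDClassifier with partial_fit), the synthetic bit-vector features standing in for molecular descriptors, and the helper names make_site_data, train_initial, and retrain are all assumptions chosen for illustration. The sketch shows a model trained at one site, serialized, and then updated at a second site on data that never leaves that site.

```python
import pickle

import numpy as np
from sklearn.linear_model import SGDClassifier


def make_site_data(seed, n_samples=200, n_bits=64):
    """Hypothetical stand-in for one site's private dataset.

    Returns random bit vectors (a placeholder for molecular descriptors)
    and a synthetic active (1) / inactive (0) label.
    """
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_samples, n_bits)).astype(float)
    y = (X[:, :8].sum(axis=1) > 4).astype(int)
    return X, y


def train_initial(X, y):
    """Fit the first version of the model at the originating site."""
    model = SGDClassifier(random_state=0)
    # classes must be given on the first partial_fit call
    model.partial_fit(X, y, classes=np.array([0, 1]))
    return model


def retrain(serialized_model, X_local, y_local):
    """Update a received model with local data only."""
    model = pickle.loads(serialized_model)
    model.partial_fit(X_local, y_local)
    return model


# Site A trains and shares the model, not the data.
X_a, y_a = make_site_data(seed=0)
model_a = train_initial(X_a, y_a)
payload = pickle.dumps(model_a)

# Site B re-trains the shared model on its own compounds.
X_b, y_b = make_site_data(seed=1)
model_b = retrain(payload, X_b, y_b)
predictions = model_b.predict(X_b)
```

Only the serialized model crosses the site boundary in this sketch, which is the property the abstract calls decentralized updating. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.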
Original language: English
Article number: 54
Number of pages: 9
Journal: Journal of Cheminformatics
Volume: 14
Issue number: 1
DOIs
Publication status: Published - 13 Aug 2022

Austrian Fields of Science 2012

  • 301207 Pharmaceutical chemistry

Keywords

  • Classification models
  • Transporter proteins
  • Decentralization
  • Re-training
  • Jupyter Notebook
  • BILE SALT EXPORT PUMP
  • ANION-TRANSPORTING POLYPEPTIDES
  • P-GP
  • DRUG-INTERACTIONS
  • LIVER-INJURY
  • INHIBITORS
  • CLASSIFICATION
