Simple User-Friendly Reaction Format

  • David Benjamin Konrad
  • David Nippa
  • Alex Müller
  • Kenneth Atz
  • Uwe Grether
  • Rainer Martin
  • Gisbert Schneider

Publications: Contribution to journal, Article, Peer Reviewed

Abstract

Utilizing the growing wealth of chemical reaction data can boost synthesis planning and increase success rates. Yet, the effectiveness of machine learning tools for retrosynthesis planning and forward reaction prediction relies on accessible, well-curated data presented in a structured format. Although some public and licensed reaction databases exist, they often lack essential information about reaction conditions. To address this issue and promote the principles of findable, accessible, interoperable, and reusable (FAIR) data reporting and sharing, we introduce the Simple User-Friendly Reaction Format (SURF). SURF standardizes the documentation of reaction data through a structured tabular format, requiring only a basic understanding of spreadsheets. This format enables chemists to record the synthesis of molecules in a format that is understandable by both humans and machines, which facilitates seamless sharing and integration directly into machine learning pipelines. SURF files are designed to be interoperable, easily imported into relational databases, and convertible into other formats. This complements existing initiatives like the Open Reaction Database (ORD) and Unified Data Model (UDM). At Roche, SURF plays a crucial role in democratizing FAIR reaction data sharing and expediting the chemical synthesis process.
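To illustrate the claim that a tabular reaction file can be read by machines and imported into a relational database, here is a minimal sketch using only the Python standard library. It assumes a tab-separated export. The column names (`rxn_id`, `reactant_smiles`, `product_smiles`, `temperature_c`, `yield_percent`) and the two sample rows are hypothetical placeholders chosen for illustration; they are not taken from the SURF specification, which should be consulted for the actual headers.

```python
import csv
import io
import sqlite3

# Hypothetical header and made-up rows, for illustration only.
# These are NOT the column names defined by the SURF specification.
SAMPLE = (
    "rxn_id\treactant_smiles\tproduct_smiles\ttemperature_c\tyield_percent\n"
    "rxn-001\tCCO\tCC=O\t25\t80\n"
    "rxn-002\tCCBr\tCCO\t60\t65\n"
)


def load_rows(text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def to_sqlite(rows):
    """Load parsed rows into an in-memory SQLite table, one TEXT column per field."""
    conn = sqlite3.connect(":memory:")
    columns = list(rows[0].keys())
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f"CREATE TABLE reactions ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO reactions VALUES ({placeholders})",
        [tuple(r[c] for c in columns) for r in rows],
    )
    return conn


rows = load_rows(SAMPLE)
conn = to_sqlite(rows)

# Example query: reactions whose recorded yield is at least 70 percent.
high_yield = conn.execute(
    "SELECT rxn_id FROM reactions WHERE CAST(yield_percent AS REAL) >= 70"
).fetchall()
```

Every value is stored as text and cast at query time, which keeps the sketch independent of any particular schema; a real import would likely assign column types according to the format's documentation.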

Original language: English
Article number: e202400361
Journal: Molecular Informatics
Volume: 44
Issue number: 1
Publication status: Published - 23 Jan 2025
Externally published: Yes

Austrian Fields of Science 2012

  • 102019 Machine learning
  • 102033 Data mining
  • 102035 Data science
  • 104015 Organic chemistry

Keywords

  • chemical reactions
  • machine learning
  • FAIR data
