Cloud-based Approach on Genetic Data Imputation Parameters' Optimization

Publications: Contribution to book; Contribution to proceedings; Peer Reviewed

Abstract

Imputing genetic data is cost- and time-intensive, primarily because of the high complexity of the methods involved and the substantial volume of data processed. A thorough performance evaluation of imputation algorithms such as Beagle, AlphaPlantImpute, LinkImputeR, MACH and others shows that while some algorithms are highly accurate, they are often computationally expensive. These widely used algorithms have multiple input parameters that affect the quality and accuracy of the imputation. Traditional parameter optimization techniques from machine learning, such as grid search and randomized search, become inefficient in high-dimensional parameter spaces, leading to prohibitive computational costs, especially in large-scale applications. Our study proposes a cloud-based approach to input parameter optimization that uses Bayesian optimization with a consecutive Domain Reduction Transformer (DRT). The described algorithm and the developed library allow users to find the optimal input parameters for data imputation in a more flexible way.
Original language: English
Title of host publication: Proceedings of the 7th International Conference on Informatics & Data-Driven Medicine (IDDM 2024)
Subtitle of host publication: IDDM International Conference
Editors: Nataliia Shakhovska, Jianbo Jiao, Ivan Izonin, Stephane Chretien
Place of Publication: Birmingham
Pages: 279-286
Number of pages: 8
Publication status: Published - 7 Jan 2025
Event: IDDM 2024 International Conference on Informatics & Data-Driven Medicine 2024 - Birmingham, United Kingdom
Duration: 14 Nov 2024 - 16 Nov 2024
https://science.lpnu.ua/iddm-2024

Publication series

Series: CEUR Workshop Proceedings
Volume: 3892
ISSN: 1613-0073

Conference

Conference: IDDM 2024 International Conference on Informatics & Data-Driven Medicine 2024
Abbreviated title: IDDM 2024
Country/Territory: United Kingdom
City: Birmingham
Period: 14/11/24 - 16/11/24

Austrian Fields of Science 2012

  • 502050 Business informatics
  • 102038 Cloud computing
  • 101015 Operations research

Keywords

  • Bayesian optimization
  • Beagle
  • bioinformatics
  • cloud technologies
  • data imputation
  • distributed calculations
  • parameters optimization
