Identification of SNP interactions using data-parallel primitives on GPUs

Muzaffer Can Altinigneli, Bettina Konte, Dan Rujescu, Christian Böhm, Claudia Plant

Publications: Contribution to book, Contribution to conference proceedings, Peer reviewed


A major goal of a Genome-Wide Association Study (GWAS) is to find associations between genetic variations, such as Single-Nucleotide Polymorphisms (SNPs), and the risk of developing a complex disease, such as cancer or schizophrenia. Logic Feature Selection (logicFS) is a technique for searching for interactions between SNPs that may increase the risk of developing a particular disease. Composed of several hundred processors, the Graphics Processing Unit (GPU) has become a very attractive platform for computationally demanding tasks on massive data. Its special hierarchy of processors and fast memory units allows very powerful and efficient parallelization but also demands novel parallel algorithms. In this paper, we formulate the LogicFS-GPU algorithm, which is particularly suited to data-parallel architectures such as GPUs. For this purpose, we employ low-level (device-level) and high-level data-parallel primitives, e.g. map, compaction, parallel prefix sum (scan) and parallel reduction. The primary idea of our algorithm is to let the parallel threads cooperatively develop their own private, high-quality binary interaction models to predict the affection status of subjects. We demonstrate (1) how to formulate the parallel LogicFS-GPU algorithm so that it exploits most of the potential parallelism hidden in the base logicFS algorithm and (2) how to utilize the special memory and processor architecture of a modern GPU in order to share this information among threads in an optimal way. As a perspective, LogicFS-GPU is not limited to examining SNP interactions but can also be applied to any problem in which interactions among multivariate binary predictors are to be associated with observations. Furthermore, the target architecture of LogicFS-GPU is not restricted to GPUs, and it may be possible to port our formulation to any other data-parallel architecture.
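
The four primitives named in the abstract can be illustrated with a small sequential emulation. The sketch below is not the LogicFS-GPU implementation: the function names (`pmap`, `exclusive_scan`, `compact`, `reduce_sum`), the toy genotype data and the candidate interaction "SNP0 AND NOT SNP2" are all assumptions made for illustration. It only shows how map, scan, compaction and reduction could combine to score one binary interaction model against the affection status of subjects.

```python
from itertools import accumulate


def pmap(fn, xs):
    """map: apply fn to every element independently (one thread per element on a GPU)."""
    return [fn(x) for x in xs]


def exclusive_scan(xs):
    """parallel prefix sum (scan), exclusive variant: out[i] = sum(xs[:i])."""
    if not xs:
        return []
    return [0] + list(accumulate(xs))[:-1]


def compact(xs, flags):
    """compaction: keep xs[i] where flags[i] == 1; the scan gives each kept item its output slot."""
    if not flags:
        return []
    pos = exclusive_scan(flags)
    out = [None] * (pos[-1] + flags[-1])
    for i, (x, f) in enumerate(zip(xs, flags)):
        if f:
            out[pos[i]] = x
    return out


def reduce_sum(xs):
    """parallel reduction: pairwise (tree-shaped) sum, as threads would combine partial results."""
    xs = list(xs)
    if not xs:
        return 0
    while len(xs) > 1:
        if len(xs) % 2:
            xs.append(0)  # pad so that every element has a partner
        xs = [xs[i] + xs[i + 1] for i in range(0, len(xs), 2)]
    return xs[0]


# Hypothetical toy data: one row per subject, one binary indicator per SNP.
genotypes = [(1, 0, 0), (1, 1, 1), (0, 1, 0), (1, 0, 0), (0, 0, 1), (1, 1, 0)]
status = [1, 0, 0, 1, 0, 0]  # 1 = affected, 0 = unaffected


def model(g):
    """Hypothetical candidate interaction: SNP0 AND NOT SNP2."""
    return int(g[0] == 1 and g[2] == 0)


predictions = pmap(model, genotypes)                          # map
agree = pmap(lambda ps: int(ps[0] == ps[1]), list(zip(predictions, status)))
correct = reduce_sum(agree)                                   # reduction -> 5
misclassified = compact(list(range(len(status))),
                        [1 - a for a in agree])               # scan + compaction -> [5]
```

On a GPU each of these steps would run across many threads at once; the list comprehensions and loops here only stand in for that parallel execution.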

Title: 2014 IEEE International Conference on Big Data
Editors: Wo Chang, Jun Huan, Nick Cercone, Saumyadipta Pyne, Vasant Honavar, Jimmy Lin, Xiaohua Tony Hu, Charu Aggarwal, Bamshad Mobasher, Jian Pei, Raghunath Nambiar
ISBN (electronic): 9781479956654
Publication status: Published - 1 Oct. 2014

ÖFOS 2012

  • 102033 Data Mining