Accelerating Bioinformatics Applications on CUDA-enabled Multi-GPU Systems

Author: Robin Kobus
Release: 2023


A wide range of bioinformatics applications have to deal with a continuously growing amount of data generated by high-throughput sequencing techniques. Exclusively CPU-based workstations fail to keep up with the task. Instead of employing dozens of CPU cluster nodes to increase the computational power, massively parallel accelerators like modern CUDA-enabled GPUs can be used to achieve higher throughput and reduce execution times. However, the memory capacity of such devices is often limited. Efficient parallelization and data distribution are essential to accelerate performance-critical components of bioinformatics pipelines like read classification and read mapping. In this thesis we analyze and optimize tasks common to many GPU-based applications in the context of bioinformatics. We study sequence processing, construction and querying of k-mer-based hash tables, segmented sort, and multi-GPU communication. With these methods we accelerate suffix array construction and metagenomic read classification on CUDA-enabled GPUs by overcoming the aforementioned challenges. By leveraging multiple GPUs, we extend the limited memory available on a single GPU to allow for the construction of larger indices. Our communication library, called Gossip, introduces optimized scatter, gather and all-to-all patterns for multi-GPU systems. Gossip's all-to-all communication pattern is successfully applied to suffix array construction, accelerating it to run in 3.44 s for a full-length human genome on an 8-GPU server, which is faster than the previously reported 4.8 s achieved by employing 1600 cores on 100 nodes of a CPU-based HPC cluster.
Furthermore, we introduce MetaCache-GPU, an ultra-fast metagenomic short read classifier specifically tailored to the characteristics of CUDA-enabled accelerators. Our approach employs a novel hash table variant featuring efficient minhash fingerprinting of reads for locality-sensitive hashing and their rapid insertion using warp-aggregated operations. Our performance evaluation shows that MetaCache-GPU is able to build large reference databases in a matter of seconds, enabling instantaneous operability, while popular CPU-based tools such as Kraken2 require over an hour for index construction on the same data. In light of an ever-growing number of reference genomes, MetaCache-GPU is the first metagenomic classifier that makes analysis pipelines with on-demand composition of large-scale reference genome sets practical. Although many sub-problems in this thesis are optimized in a specific application context, they also apply to other bioinformatics problems like k-mer counting, sequence alignment and assembly, which would benefit from GPU acceleration. In addition to the insights from this work, we make our source code publicly available to allow for easier adaptation of our methods to related problems.
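
To make the idea of minhash fingerprinting with a k-mer hash table more concrete, here is a minimal CPU-side sketch in Python. It is not the MetaCache-GPU implementation: the function names (`kmers`, `hash_kmer`, `minhash_sketch`, `build_index`, `classify`), the BLAKE2b-based hash, the parameter values, and the simple voting scheme are all assumptions chosen for illustration. It also sketches each reference as a whole rather than in windows, and omits the warp-aggregated GPU insertion described in the abstract.

```python
import hashlib
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k (empty if len(seq) < k)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Hypothetical 64-bit k-mer hash: an 8-byte BLAKE2b digest as an integer."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_sketch(seq, k, s):
    """Return the s smallest distinct k-mer hash values of seq, in ascending order."""
    hashes = {hash_kmer(km) for km in kmers(seq, k)}
    return sorted(hashes)[:s]


def build_index(references, k, s):
    """Map each sketch feature to the names of the references that contain it."""
    index = defaultdict(list)
    for name, seq in references.items():
        for feature in minhash_sketch(seq, k, s):
            index[feature].append(name)
    return index


def classify(read, index, k, s):
    """Return the reference sharing the most sketch features with the read, or None."""
    votes = Counter()
    for feature in minhash_sketch(read, k, s):
        votes.update(index.get(feature, ()))
    return votes.most_common(1)[0][0] if votes else None
```

A call such as `classify(read, build_index(references, k=16, s=8), k=16, s=8)` would look up the read's sketch features in the index and return the best-matching reference name. Similar sequences tend to share their smallest k-mer hashes, which is what makes such a sketch usable as a locality-sensitive fingerprint.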


Implementing and Accelerating HMMER3 Protein Sequence Search on CUDA-Enabled GPU.
Language: en
Authors: Lin Cheng
Type: BOOK - Published: 2014

High-performance Processing of Next-generation Sequencing Data on CUDA-enabled GPUs
Language: en
Authors: Felix Kallenborn
Type: BOOK - Published: 2024

With the technological advances in the field of genomics and sequencing, the processing of vast amounts of generated data becomes more and more challenging. Now
Computational Science and Its Applications - ICCSA 2014
Language: en
Pages: 842
Authors: Beniamino Murgante
Categories: Computers
Type: BOOK - Published: 2014-07-02 - Publisher: Springer

The six-volume set LNCS 8579-8584 constitutes the refereed proceedings of the 14th International Conference on Computational Science and Its Applications, ICCSA
Proceedings of ICRIC 2019
Language: en
Pages: 897
Authors: Pradeep Kumar Singh
Categories: Technology & Engineering
Type: BOOK - Published: 2019-11-21 - Publisher: Springer Nature

This book presents high-quality, original contributions (both theoretical and experimental) on software engineering, cloud computing, computer networks & intern