High-performance Processing of Next-generation Sequencing Data on CUDA-enabled GPUs

Author: Felix Kallenborn
Release: 2024

With technological advances in genomics and sequencing, processing the vast amounts of generated data is becoming increasingly challenging. Software for processing large-scale datasets of sequencing reads may take hours to days to complete, even on high-end workstations, which motivates new approaches for faster, high-performance applications. In contrast to traditional CPU-based software, algorithms that exploit the massively parallel many-core architecture and fast memory of GPUs can potentially deliver the desired performance in many fields.

In this thesis, we introduce two novel GPU-accelerated applications, CARE and CAREx, for two common steps in sequence processing pipelines: error correction and read extension of Illumina Next Generation Sequencing (NGS) data. Both aim to improve the results of downstream data analysis. To the best of our knowledge, CARE and CAREx are the first modern GPU-accelerated solutions for the respective problems. A key component of our algorithms is the identification of similar DNA sequences within a dataset. For this purpose, we developed a minhashing-based index data structure for large-scale read datasets. In conjunction with our fast bit-parallel shifted Hamming distance computations, this allows for the efficient identification of similar reads. The resulting set of similar sequences is subsequently arranged into a gap-free multiple sequence alignment to solve the problem at hand.

Sequencing machines introduce both systematic and random errors. CARE, the Context-Aware Read Error corrector, accurately removes errors introduced by NGS machines during the initial sequencing of a biological sample. With the help of a pre-trained Random Forest, CARE generates two orders of magnitude fewer false positives than its competitors while achieving a similar number of true positives.

Read extension is the process of elongating DNA sequences. Longer sequences help resolve more and larger structures within a genome. CAREx, the Context-Aware Read Extender, produces longer sequences, so-called pseudo-long reads, by connecting the two reads of read pairs that were sequenced in close proximity. Evaluation shows that CAREx produces significantly more highly accurate pseudo-long reads than the state of the art.

With algorithms tailored towards high-performance GPU computations, both CARE and CAREx run significantly faster than their CPU-based competitors while also producing more accurate results. Processing a large human dataset with 30x coverage with CARE requires less than 30 minutes on a single A100 GPU. This time can be further reduced to 10 minutes on multi-GPU systems. In contrast, CPU-based tools like Musket and BFC take 3 hours and 1.5 hours, respectively. Read extension of a human dataset with CAREx takes 3.3 hours on a single GPU, whereas Konnector2 requires over a day. This shows that large-scale sequence processing can greatly benefit from GPUs, and that multiple sequence alignment-based algorithms should be considered despite their increased complexity because they provide high accuracy. While our general building blocks have been tailored to error correction and read extension, they could also prove useful in other GPU-accelerated applications that process sequence data.
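The bit-parallel shifted Hamming distance mentioned in the abstract can be illustrated with a small CPU-side sketch. This is not the thesis implementation: the 2-bit encoding, the function names (`pack`, `hamming_packed`, `best_shifted_hamming`), and the parameters `max_shift` and `min_overlap` are all assumptions made for illustration. The idea shown is that bases packed two bits each can be compared with a single XOR and a population count, and that the comparison is repeated for several relative offsets (shifts) between two reads.

```python
# Illustrative sketch only; names and parameters are hypothetical, not from CARE.
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a DNA string into an int; base i occupies bits 2i and 2i+1."""
    value = 0
    for i, base in enumerate(seq):
        value |= ENCODE[base] << (2 * i)
    return value


def hamming_packed(a, b, length):
    """Count mismatching bases among the first `length` packed bases."""
    if length <= 0:
        return 0
    low_bits = int("01" * length, 2)  # selects the low bit of every 2-bit base
    diff = a ^ b                      # non-zero 2-bit groups mark mismatches
    # Fold the high bit of each group onto its low bit, then count set bits.
    return bin((diff | (diff >> 1)) & low_bits).count("1")


def best_shifted_hamming(anchor, candidate, max_shift, min_overlap):
    """Try every shift in [-max_shift, max_shift]; return (shift, distance, overlap).

    A shift s means candidate base j is compared with anchor base j + s.
    Returns None if no shift yields at least `min_overlap` overlapping bases.
    """
    a, b = pack(anchor), pack(candidate)
    best = None
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            overlap = min(len(anchor) - shift, len(candidate))
            a_part, b_part = a >> (2 * shift), b
        else:
            overlap = min(len(anchor), len(candidate) + shift)
            a_part, b_part = a, b >> (2 * -shift)
        if overlap < min_overlap:
            continue
        dist = hamming_packed(a_part, b_part, overlap)
        if best is None or dist < best[1]:
            best = (shift, dist, overlap)
    return best


print(best_shifted_hamming("ACGTACGTAC", "GTACGTACGG", max_shift=3, min_overlap=6))
# -> (2, 0, 8)
```

The sketch picks the shift with the fewest raw mismatches and handles only the bases A, C, G, and T. A production tool would likely weigh mismatches against overlap length and process fixed-width machine words in parallel on the GPU.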

Related Books

Language: en
Pages: 3421

Encyclopedia of Bioinformatics and Computational Biology

Categories: Medical

Type: BOOK - Published: 2018-08-21 - Publisher: Elsevier

Encyclopedia of Bioinformatics and Computational Biology: ABC of Bioinformatics, Three Volume Set combines elements of computer science, information technology,

Language: en
Pages: 739

CUDA for Engineers

Authors: Duane Storti

Categories: Computers

Type: BOOK - Published: 2015-11-02 - Publisher: Addison-Wesley Professional

CUDA for Engineers gives you direct, hands-on engagement with personal, high-performance parallel computing, enabling you to do computations on a gaming-level P

Language: en

Accelerating Bioinformatics Applications on CUDA-enabled Multi-GPU Systems

Authors: Robin Kobus

Type: BOOK - Published: 2023 - Publisher:

A wide range of bioinformatics applications have to deal with a continuously growing amount of data generated by high-throughput sequencing techniques. Exclusiv

Language: en
Pages: 462

Computational Methods for Next Generation Sequencing Data Analysis

Authors: Ion Mandoiu

Categories: Computers

Type: BOOK - Published: 2016-09-12 - Publisher: John Wiley & Sons

Introduces readers to core algorithmic techniques for next-generation sequencing (NGS) data analysis and discusses a wide range of computational techniques and
