Computational Methods for Predicting and Validating the Causes of Mendelian Disease

Author: Orion Josephson Buske
Release: 2016



We still do not know the genetic basis of roughly half of the estimated 7,000 Mendelian diseases. For some diseases, the responsible variants will not be discovered with standard genetic sequencing. If the variant falls outside of the exome, occurs in a repetitive region, or is a larger structural change, it is unlikely to be found by whole exome sequencing. For other diseases, the variant might be found, but the association with the disease has not yet been discovered. In these cases, making such a discovery requires several steps. Computational approaches are necessary to accurately prioritize harmful variants based on available knowledge. Then, additional information is needed to substantiate the association, such as functional tests, animal models, or identifying unrelated families with the same variant.
This thesis presents several contributions to help researchers determine the genetic basis of unsolved Mendelian diseases: First, a method was developed that improves variant prioritization for a class of variants that are usually ignored by analysis pipelines: synonymous variants. After curating known examples from the literature, machine learning methods were trained to prioritize these variants based on a set of designed features.
Second, finding additional families is a substantial hurdle in rare disease research. By collecting detailed phenotype information, computational methods can be used to find patients with a similar presentation. This leads to improved variant prioritization by combining sequencing data from several similar patients, without ever needing to explicitly define cohorts. The power of matchmaking methods grows exponentially with the size of the database, but simulations suggest that several hundred thousand cases are needed to identify the genetic basis of most Mendelian diseases.
Finally, these matchmaking algorithms are implemented in a web portal, PhenomeCentral, which is used by several consortia and hundreds of clinicians and researchers. While this platform is a repository of several thousand undiagnosed cases, matchmaking between platforms is critical to achieve the numbers of cases predicted to be necessary. Towards this end, the Matchmaker Exchange (MME) was established and an API developed. Case profiles are exchanged within a secure federated network to reduce the time for researchers to validate genetic hypotheses.
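As a rough illustration of the phenotype-based matchmaking idea described above, the following Python sketch ranks database patients by phenotypic similarity to a query patient and then scores candidate genes by how many similar patients share them. It is not the method from the thesis: the toy ontology, the Jaccard measure over ancestor-expanded term sets, and the names `PARENTS`, `similarity`, `rank_matches`, and `score_genes` are all assumptions made for this example, and the term and gene labels are placeholders, not real HPO or gene identifiers.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology (child term -> parent terms); placeholder labels only.
PARENTS: Dict[str, Set[str]] = {
    "seizure": {"neuro"},
    "ataxia": {"neuro"},
    "microcephaly": {"head"},
    "neuro": {"root"},
    "head": {"root"},
    "root": set(),
}


def with_ancestors(terms: Set[str]) -> Set[str]:
    """Expand a set of terms so that it also contains all of their ancestors."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS.get(term, ()))
    return closed


def similarity(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity over ancestor-expanded term sets (an assumed, simple measure)."""
    ea, eb = with_ancestors(a), with_ancestors(b)
    union = ea | eb
    return len(ea & eb) / len(union) if union else 0.0


def rank_matches(query: Set[str], database: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Return (patient_id, score) pairs, most similar first; ties broken by id."""
    scores = [(pid, similarity(query, terms)) for pid, terms in database.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


def score_genes(
    query: Set[str],
    query_genes: Set[str],
    database: Dict[str, Set[str]],
    candidate_genes: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Score each of the query's candidate genes by the summed phenotypic
    similarity of database patients who also carry a candidate in that gene."""
    scores: Dict[str, float] = {}
    for pid, terms in database.items():
        sim = similarity(query, terms)
        for gene in query_genes & candidate_genes.get(pid, set()):
            scores[gene] = scores.get(gene, 0.0) + sim
    return scores


# Example data (entirely made up for illustration).
database = {"P1": {"seizure", "microcephaly"}, "P2": {"ataxia"}, "P3": {"microcephaly"}}
candidates = {"P1": {"GENE_A"}, "P2": {"GENE_B"}, "P3": {"GENE_A"}}
query_terms = {"seizure", "microcephaly"}

ranking = rank_matches(query_terms, database)
gene_scores = score_genes(query_terms, {"GENE_A", "GENE_B"}, database, candidates)
```

In this toy data, the gene shared with the two most phenotypically similar patients receives the highest score, which mirrors the idea of pooling evidence across similar patients without defining a cohort in advance.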