The new sequencing technologies are revolutionizing genomics: they promise low-cost, high-throughput sequencing (HTS) of both new species and different individuals, enabling better analysis of the patterns of genetic variation. These “next-generation” platforms began contributing to our understanding of human genome diversity with the 1000 Genomes Project, which employs HTS methods to produce the most detailed map of human variation. Other large-scale sequencing projects are being initiated to assess characteristics of human genome diversity, to find genetic causes of disease, and to infer the evolutionary history of species. Although we can now generate data at a rate previously unimaginable, analysis of the data is proceeding at a slower pace, because currently available algorithms for analyzing HTS data show different biases against different classes of variation. There is a need to forge an alliance between computer science and genomics to devise better methods for using the massive amount of sequence data. Here we propose to develop novel algorithms to comprehensively and quickly discover all forms of genomic variants, including point mutations, indel polymorphisms, and structural variation, while resolving inconsistencies among different variants to accurately identify normal and disease-causing variation. The proposed project, when completed, will help make better use of the data generated by HTS platforms by enabling complete and accurate analysis of genomic variants in newly sequenced human individuals and non-human organisms. Better and more timely analysis also opens the way to discovering more variants that might be medically relevant. Moreover, accurate and complete characterization of genomic variants within the most complex regions of the human genome may help solve the problem of “missing heritability” in complex disease, which is not readily addressed by conventional genome-wide association studies.