Content archived on 2024-05-30

Algorithms for Analysis of Genes and Genomes

Final Report Summary - ALGGENOMES (Algorithms for Analysis of Genes and Genomes)

FP7-people-IRG 224285 ALGGENOMES

Algorithms for analysis of genes and genomes

Tomas Vinar, PhD

Faculty of mathematics, physics, and informatics, Comenius university,

Mlynska Dolina, 842 48 Bratislava, Slovakia

E-mail: vinar@fmph.uniba.sk

Project website: http://compbio.fmph.uniba.sk/

Publishable summary, 15 January 2009 – 14 January 2013

ALGGENOMES is a Marie Curie project supporting reintegration of Dr T. Vinar at the faculty of mathematics, physics and informatics of Comenius university in Bratislava, Slovakia. Dr Vinar has spent almost ten years of his research career in Canada and the United States (US) after which he decided to return to Slovakia in order to start a bioinformatics research and education program there. The main goals of the project, as outlined in the grant agreement, are as follows:
A. Development of algorithms for bioinformatics. Design of new algorithms and probabilistic models for a variety of bioinformatics problems in sequence analysis and gene evolution, their implementation, and application of the resulting tools to the analysis of real biological datasets.
B. Analysis of yeast mitochondrial genomes. A collaboration with the research group of prof Nosek at the faculty of natural sciences to study the evolution of mitochondrial genomes of pathogenic yeasts, with a focus on rearrangements.
C. Supporting activities. Setup and development of a computational environment necessary for bioinformatics research, recruitment and supervision of students, and teaching activities supporting the research in this proposal.

We have developed new algorithms for comparative genomic analysis of complex duplicated regions (goal A): an algorithm for reconstruction of evolutionary histories of gene clusters (Vinar et al., 2010), an artificial simulation framework, and an algorithm for automated segmentation of gene clusters (Brejova et al., 2011a).

We have applied these algorithms to analyse the evolutionary history of the alpha defensin gene cluster in primate genomes (orangutan genome sequencing consortium, 2011), and we have also investigated other theoretical and algorithmic problems stemming from this research (Brejova et al., 2011b; Kovac et al., 2012). Together with our collaborators from Penn State University and the National Human Genome Research Institute, we have become members of an ongoing collaboration for sequencing and analysis of biomedically important complex gene clusters. We are working on improved algorithms for gene cluster analysis through more efficient MCMC sampling, and on making our prototype software tools available to a wider community.

In collaboration with Dr Luptak at the University of California, Irvine, we have developed a new sequence analysis algorithm and software for RNA motif search (Jimenez et al., 2012), which is currently being applied in biochemical research on ribozymes. We have used our experience in RNA motif search to develop a similar framework for contact-rich protein domain search (Macko et al., 2013). We have also studied theoretical and practical problems relevant to annotation of alternative splicing (Kovac et al., 2009) and gene finding in novel genomes (Brejova et al., 2009). Finally, we have continued developing software for identification of gene orthologs and methodology for studying positive selection, which we have applied in several international projects (panda genome sequencing and analysis consortium, 2010; orangutan genome sequencing consortium, 2011; The western painted turtle genome consortium, 2013; The marmoset genome sequencing and analysis consortium, 2013).

In collaboration with the laboratory of comparative and functional genomics of eukaryotic organelles (prof Nosek, goal B) and with Dr Brejova at the department of computer science, we have analysed eight newly sequenced mitochondrial genomes of pathogenic yeasts, with a focus on their phylogeny and rearrangement history (Valach et al., 2011). To this end, we have developed a novel algorithm and software for analysis of rearrangement histories based on the double-cut-and-join (DCJ) rearrangement model (Kovac et al., 2011a), and we have also studied several theoretical problems in this area (Kovac et al., 2010, 2011b; Jahn et al., 2012).
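To give a flavour of the double-cut-and-join model mentioned above, the following minimal sketch computes the standard DCJ distance between two genomes. It is not the ancestral reconstruction algorithm of Kovac et al. (2011a). It assumes that both genomes consist only of circular chromosomes and contain the same genes exactly once, in which case the distance is the number of genes minus the number of cycles in the adjacency graph. The function names and the representation of a genome as a list of lists of signed integers are illustrative choices, not taken from the project software.

```python
def adjacencies(genome):
    """Map every gene extremity to the extremity adjacent to it.

    A genome is a list of circular chromosomes; a chromosome is a list of
    non-zero signed integers. Gene g has a tail (g, 't') and a head (g, 'h');
    a negative sign means the gene is read from head to tail.
    """
    partner = {}
    for chrom in genome:
        for i, gene in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]  # wrap around: circular
            right = (abs(gene), 'h' if gene > 0 else 't')
            left = (abs(nxt), 't' if nxt > 0 else 'h')
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance(genome_a, genome_b):
    """DCJ distance for circular genomes with identical gene content."""
    pa, pb = adjacencies(genome_a), adjacencies(genome_b)
    if pa.keys() != pb.keys():
        raise ValueError("genomes must contain the same genes")
    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Walk one cycle, alternating adjacencies of A and of B.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    genes = len(pa) // 2  # each gene contributes two extremities
    return genes - cycles


if __name__ == "__main__":
    print(dcj_distance([[1, 2, 3]], [[1, -2, 3]]))       # one inversion
    print(dcj_distance([[1, 2, 3, 4]], [[1, 2], [3, 4]]))  # one fission
```

Linear chromosomes, which also occur in yeast mitochondria, need additional handling of chromosome ends and are deliberately left out of this sketch.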

Within the supporting activities (goal C), we have successfully established a computational biology research group at the faculty of mathematics, physics, and informatics that comprises two principal investigators (Dr Vinar and Dr Brejova) and 16 students at all levels of studies (bachelor's, master's, doctoral).

Dr Vinar currently supervises research projects of three doctoral students (Jakub Kovac, Martin Macko, Martin Kravec) and five master's students; two bachelor's and five master's theses have been completed within the scope of this project so far. The research group maintains stable research collaborations with scientists from Austria, China, Germany, Canada, and the United States (US). With a contribution from a separate grant to Dr Brejova, we have built a research computing cluster that supports the activities of the research group.

To support our research activities, we maintain a weekly seminar on recent topics in computational biology, as well as regular lab meetings.

In collaboration with Dr Brejova (dept of computer science), prof Nosek, and prof Tomaska (faculty of natural sciences), we have developed two courses covering the area of computational biology ('Methods in bioinformatics' and 'Genomics'), and we have started preparations for establishing a bioinformatics degree program. We organise common seminars and summer schools. These educational activities (goal C) will ensure sustainability of research in computational biology at our institution.

References

Brejova, B., Burger, M., and Vinar, T. (2011a). Automated segmentation of DNA sequences with complex evolutionary histories. In Przytycka, T. M. and Sagot, M.-F., editors, algorithms in bioinformatics, 11th international workshop (WABI), volume 6833 of lecture notes in computer science, pages 1–13, Saarbrücken, Germany. Springer.

Brejova, B., Landau, G. M., and Vinar, T. (2011b). Fast computation of a string duplication history under no-breakpoint-reuse. In Grossi, R., Sebastiani, F., and Silvestri, F., editors, string processing and information retrieval (SPIRE), volume 7024 of lecture notes in computer science, pages 144–155, Pisa, Italy. Springer.

Brejova, B., Vinar, T., Chen, Y., Wang, S., Zhao, G., Brown, D. G., Li, M., and Zhou, Y. (2009). Finding genes in Schistosoma japonicum: annotating novel genomes with help of extrinsic evidence. Nucleic acids research, 37(7):e52.

Jahn, K., Zheng, C., Kovac, J., and Sankoff, D. (2012). A consolidation algorithm for genomes fractionated after higher order polyploidisation. BMC bioinformatics, 13(Suppl 19):S8.

Jimenez, R. M., Rampasek, L., Brejova, B., Vinar, T., and Luptak, A. (2012). Discovery of RNA motifs using a computational pipeline that allows insertions in paired regions and filtering of candidate sequences. In Hartig, J. S., editor, Ribozymes: methods and protocols, volume 848 of methods in molecular biology, chapter 10, pages 145–158. Springer.

Kovac, J., Braga, M. D. V., and Stoye, J. (2010). The problem of chromosome reincorporation in DCJ sorting and halving. In Tannier, E., editor, comparative genomics - International workshop, Recomb-CG, volume 6398 of lecture notes in computer science, pages 13–24, Ottawa, Canada. Springer.

Kovac, J., Brejova, B., and Vinar, T. (2011a). A practical algorithm for ancestral rearrangement reconstruction. In Przytycka, T. M. and Sagot, M.-F., editors, algorithms in bioinformatics, 11th international workshop (WABI), volume 6833 of lecture notes in computer science, pages 163–174, Saarbrücken, Germany. Springer.

Kovac, J., Vinar, T., and Brejova, B. (2009). Predicting gene structures from multiple RT-PCR tests. In algorithms in bioinformatics (WABI), volume 5724 of lecture notes in bioinformatics, pages 181–193. Springer.

Kovac, J., Warren, R., Braga, M. D. V., and Stoye, J. (2011b). Restricted DCJ model: rearrangement problems with chromosome reincorporation. Journal of computational biology, 18(9):1231–1231.

Kovac, P., Brejova, B., and Vinar, T. (2012). Aligning sequences with repetitive motifs. In information technologies - Applications and theory (ITAT), pages 41–48. Best paper award.

Macko, M., Kralik, M., Brejova, B., and Vinar, T. (2013). OB-fold recognition combining sequence and structural motifs. Submitted, under review.

orangutan genome sequencing consortium (2011). Comparative and demographic analysis of orangutan genomes. Nature, 469(7331):529–533.

panda genome sequencing and analysis consortium (2010). The sequence and de novo assembly of the giant panda genome. Nature, 463(7279):269–392.

The marmoset genome sequencing and analysis consortium (2013). The genome of the common marmoset: a comparative analysis of an extraordinary south American primate. Submitted, under review.

The western painted turtle genome consortium (2013). The western painted turtle genome: The evolution of extreme physiological adaptations in a slowly evolving lineage. Submitted, under review.

Valach, M., Farkas, Z., Fricova, D., Kovac, J., Brejova, B., Vinar, T., Pfeiffer, I., Kucsera, J., Tomaska, L., Lang, B. F., and Nosek, J. (2011). Evolution of linear chromosomes and multipartite genomes in yeast mitochondria. Nucleic acids research, 39(10):4202–4219.

Vinar, T., Brejova, B., Song, G., and Siepel, A. C. (2010). Reconstructing histories of complex gene clusters on a phylogeny. Journal of computational biology, 17(9):1267–1279.