ADVANCED METHODS FOR EVOLUTIONARY SEQUENCE ANALYSIS

Final Report Summary - AMESA (ADVANCED METHODS FOR EVOLUTIONARY SEQUENCE ANALYSIS)

Sequence alignment is widely used in molecular biology. Despite its age, the challenge is still not fully resolved: no method can suit all tasks, new approaches are needed as the questions evolve, and even traditional methods can still be improved. Although many alignment tasks are related, some are based on incompatible principles, and their need for distinct tools is not always understood. Evolutionary sequence alignment, the focus of the AMESA project, is a prerequisite for all comparative analyses and is needed, for example, in agricultural and medical research.

AMESA aimed at developing new methods for evolutionary and comparative sequence analysis using a novel approach of modelling data and considering evolutionary information. These methods targeted two current trends in sequence analysis, the increasing size of datasets and the specific properties of data produced on next-generation sequencing (NGS) platforms, from several different angles. The main outcomes of the project are two analysis tools: PAGAN (Löytynoja et al., 2012), a multiple sequence alignment program for phylogeny-aware de novo alignment and alignment extension; and Wasabi (Veidenberg et al., 2016), a graphical environment for evolutionary sequence analysis and data sharing. Of these, PAGAN is ultimately meant to replace my earlier method PRANK (Löytynoja & Goldman, 2005, 2008) as the method of choice for evolutionary analyses. Wasabi, on the other hand, is an easy-to-use graphical interface to PAGAN and other analysis methods, and provides an intuitive environment for sequence analysis tasks and data sharing.

PAGAN represents algorithmic development and forms the foundation of alignment work in my research group. We have so far developed two analysis programs built on top of PAGAN: Seance (Medlar et al., 2014) for reference-based phylogenetic analysis of marker gene data such as 18S rRNA; and Glutton (Medlar et al., submitted) for processing and multiple sequence alignment of transcriptome data from non-model organisms. We continue working on PAGAN and will, for example, extend it for the alignment of fragmented genomic data produced by de novo assembly programs.

While algorithmic development is highly valuable, the impact of Wasabi is more visible. Wasabi originally started as an interface to the alignment programs PRANK and PAGAN, but we soon realised that its network transparency and data visualisation capabilities can be exploited in other contexts. Being a cloud-based app built on common web standards, Wasabi can be accessed from any web-connected device that has a modern web browser, including smartphones and tablets. As a centralised service, Wasabi inherently supports data sharing and collaboration. Finally, Wasabi is designed to help generate reproducible results and can easily be extended with additional analysis tools. We continue working on Wasabi and plan to extend the current analysis platform into a generic environment for sequence analysis, data visualisation and sharing.

Wasabi can be found at http://wasabiapp.org. All other software and the details of the AMESA project are available via http://loytynojalab.biocenter.helsinki.fi.