One of the fundamental principles in comparative genomics is that evolutionary conservation implies functional importance. This principle has been used extensively in the life sciences as a tool to detect biological functionality. In the vast majority of studies, evolutionary conservation has been sought in three biological “entities”: sequence (actual nucleotide or amino-acid sequences, as well as sequence properties such as codon usage bias, GC content, and amino-acid bias); structure (such as RNA, DNA and protein secondary and tertiary structures); and genome architecture (such as synteny, gene order, and relative gene orientation). These strategies have worked very well, but evidence is rapidly accumulating that “function without detectable conservation” is common. Many factors can potentially contribute to such “undetected functionality”, including functions that do not depend on sequence or structure (for instance, nonsense-mediated decay is affected by the mere location of the introns), functions that depend on very short and redundant sequence motifs, co-evolution of factor and target, and migration of functional sequences along the chromosome. We propose a novel strategy to identify part of this “undetected functionality”. We study how to quantify conservation in a fourth type of biological “entity”, which we dub gene architecture and which is derived from the exon-intron structure of the gene. To this end, we are building a eukaryotic gene architecture database and developing an algorithm to quantitatively evaluate the level of architectonic conservation in the different regions of the gene. Preliminary analysis reveals that conservation of gene architecture is plentiful, and that in many cases it can be detected over very long evolutionary times (more than a billion years). This allows us to identify genic elements, mostly introns and splice junctions, that are likely to be of functional importance.