The aim of this project is to develop open source tools for identifying potential cis-regulatory modules in silico, together with the transcription factors (TFs) that bind to them. A publicly available genomic resource describing TFs across chordate genomes, including mammals, fish and ascidians, will be developed. Based on this resource, large-scale sequence comparisons of orthologous non-coding regions of TF genes will be performed. Moreover, a comprehensive in vitro study of the DNA-binding affinity of TF proteins will be used to build novel models of TF binding sites. Multidisciplinary approaches will be developed to identify the cis-regulatory regions driving TF gene expression. The basic steps undertaken will be:

1. Genome-wide census and phylogenetic analysis of all TFs in sequenced chordate genomes.
2. Genome-wide identification of Multi-Species Conserved Sequences (MCSs) within orthologous regulatory regions of chordate TFs.
3. Characterization of the activity of the identified MCSs in Ciona and zebrafish embryos, in mammalian cells, and in transgenic mice.
4. Training of a novel algorithm to predict and characterize MCSs active in various model systems.
5. Comprehensive determination of the in vitro DNA-binding specificity of all transcription factors in a chordate genome.
6. Building novel bioinformatics models of TF binding sites based on complex grammars such as Hidden Markov Models and Stochastic Context-Free Grammars.
7. Integration of the above data into a publicly available, open source bioinformatics tool that can be used via the project website or downloaded for large-scale projects.

This project is a large-scale multidisciplinary effort to decipher the grammar of chordate regulatory sequences. It will have a strong impact by building tools and resources that enable more sophisticated hypotheses about regulatory networks, especially those of TFs involved in fundamental biological processes.