
Domain integrity verification of alternative splicing using statistical methods and molecular modeling


The human genome has only about 24,000 protein-coding genes, considerably fewer than previously suspected. Much of the complexity of higher organisms can be attributed to alternative splicing (AS), and more than 70% of human genes are alternatively spliced. However, the number of functional proteins in humans, mice or Drosophila remains highly uncertain, because the putative proteins are inferred from nucleic acid sequence information alone.
Several AS databases contain putative alternative splices assembled from the available EST- and cDNA-based evidence. We plan to screen these AS isoforms to determine whether they contain full-length domains and whether they can fold into a stable 3D structure. In the first step, we will filter the isoforms against our domain database, which contains only those Pfam-derived domains that are always full-length (90% of the available Pfam families). In the next step, we will mine the annotations of those human Swissprot proteins that have both AS and 3D-structure information, supplemented with orthologous proteins from other organisms and tissue-specific information, to create a test set in which we can determine with a high level of confidence whether the observed splice variants exist in nature. We will use this test set to build probabilistic models that estimate the probability that a given AS isoform can exist as a stable isoform. The test set will also be used to evaluate the AS isoform models by structural calculations focusing on the newly discovered connection between alternative splicing and protein disorder. Some calculations will be performed at the High Performance Computing Centre of the University of Szeged in collaboration with theoretical physicists. We hope that the added value of this multidisciplinary approach will be both a better understanding of alternative splicing and the refinement of the force fields and local effects used in the molecular modeling of disordered proteins.
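The first filtering step can be illustrated with a minimal Python sketch. It is not the project's implementation: the `DomainHit` record, the `has_intact_domains` function, the placeholder family names and the 0.95 coverage cut-off are all assumptions made for illustration. The idea is that an isoform is rejected when a domain from an always-full-length family is matched over only part of the family model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One domain match on an isoform (hypothetical record layout)."""
    family: str        # family identifier (placeholder names are used below)
    model_start: int   # first matched position of the family model, 1-based
    model_end: int     # last matched position of the family model, inclusive
    model_length: int  # total length of the family model


def hit_coverage(hit: DomainHit) -> float:
    """Fraction of the family model covered by this match."""
    return (hit.model_end - hit.model_start + 1) / hit.model_length


def has_intact_domains(hits, always_full_length, min_coverage=0.95):
    """Return True if no always-full-length domain is truncated.

    `min_coverage` is an arbitrary illustrative threshold, not a value
    taken from the project description.
    """
    for hit in hits:
        if hit.family in always_full_length and hit_coverage(hit) < min_coverage:
            return False
    return True


# Placeholder family names; a real run would use Pfam accessions.
always_full_length = {"FAM_A"}

canonical = [DomainHit("FAM_A", 1, 100, 100)]
spliced = [DomainHit("FAM_A", 1, 60, 100)]   # domain cut by the splice event
tolerant = [DomainHit("FAM_B", 1, 60, 100)]  # family not in the strict set

print(has_intact_domains(canonical, always_full_length))
print(has_intact_domains(spliced, always_full_length))
print(has_intact_domains(tolerant, always_full_length))
```

Isoforms that pass such a filter would then go on to the probabilistic and structural evaluation described above; families outside the always-full-length set are deliberately not penalised in this sketch.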

Karolina ut 29.