Objective
Availability of a tool for the solution of practical protein design problems in molecular biology, molecular medicine and biotechnology, with long-term economic benefits in these fields.
Data on known protein structures and amino acid sequences have proven very useful for deriving empirical rules for protein folding and design. With the growing volume of these data, more sophisticated systems for storing and handling knowledge on macromolecular structures are urgently needed. Progress should come from better ways of exploiting sequence homology to infer structural information for the more than 10,000 proteins for which only sequence data are available.
Research was carried out to develop a database of protein knowledge containing structure and sequence information, and to extend that database with inferred 3-dimensional structures of proteins for which only sequence data are available.
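As a rough illustration of what a relational store combining sequence and structure information might look like, the sketch below uses Python's built-in sqlite3 module. The table names, column names and toy records are hypothetical and chosen only for this example; they are not taken from the project's actual schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE protein (
            id       INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            sequence TEXT NOT NULL
        );
        CREATE TABLE structure (
            protein_id INTEGER NOT NULL REFERENCES protein(id),
            -- 'experimental' for known structures, 'inferred' for homology-based ones
            origin     TEXT NOT NULL CHECK (origin IN ('experimental', 'inferred')),
            family     TEXT
        );
        """
    )
    # Toy records, invented for illustration only.
    conn.executemany(
        "INSERT INTO protein (id, name, sequence) VALUES (?, ?, ?)",
        [
            (1, "toy_A", "MKTAYIAKQR"),
            (2, "toy_B", "MKTAYLAKQR"),
            (3, "toy_C", "GSHMLEDPVA"),
        ],
    )
    conn.executemany(
        "INSERT INTO structure (protein_id, origin, family) VALUES (?, ?, ?)",
        [(1, "experimental", "fam1"), (2, "inferred", "fam1")],
    )
    conn.commit()
    return conn


def sequence_only_proteins(conn):
    """Return names of proteins that have no structure record of any origin."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM protein AS p
        LEFT JOIN structure AS s ON s.protein_id = p.id
        WHERE s.protein_id IS NULL
        ORDER BY p.name
        """
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping the origin of each structure record in its own column is one way to let experimental and inferred structures share a table while remaining distinguishable in queries.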
SESAM, a high-performance relational database for protein structure and sequence, capable of holding data from public and private sources, was developed. It features powerful procedures for validating and cleaning up input data, and rapid data retrieval. It has been interfaced with a graphics package, BRUGEL, and specialized user-friendly interfaces have been implemented. The database of known protein structures was extended to include inferred 3-dimensional structures, grouped into structural families, by exploiting the correlation between structural homology and sequence similarity when the latter exceeds a certain threshold. A limited number of short sequence patterns that characterise local structure motifs in proteins with high accuracy can be found. However, these patterns do not improve protein structure prediction methods, owing to the limited size of the structural database and to the influence of spatial interactions between residues that are distant in the sequence. Object-oriented methods and logic programming (Prolog) yield important benefits by speeding up the design, development and debugging stages.
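The idea of assigning a sequence-only protein to a structural family when its similarity to a protein of known structure exceeds a threshold can be sketched as follows. This is a minimal illustration, not the project's method: it assumes sequences are already aligned to equal length, uses plain percent identity as the similarity measure, and the 30% default threshold is a placeholder, since the text does not state the value used. The function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity of two pre-aligned, equal-length sequences.

    Gap characters ('-') never count as matches.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def infer_family(query, known, threshold=30.0):
    """Return the family of the most similar known sequence, or None.

    `known` is a list of (sequence, family) pairs for proteins of known
    structure. The threshold value is an assumption made for this sketch.
    """
    best_family, best_score = None, 0.0
    for sequence, family in known:
        if len(sequence) != len(query):
            continue  # this sketch only compares pre-aligned sequences
        score = percent_identity(query, sequence)
        if score > best_score:
            best_family, best_score = family, score
    return best_family if best_score >= threshold else None


# Toy data, invented for illustration only.
known_structures = [("MKTAYIAKQR", "fam1"), ("GSHMLEDPVA", "fam2")]
assignment = infer_family("MKTAYLAKQR", known_structures)
```

A real system would align sequences first and use a length-dependent threshold or a substitution-matrix score; percent identity is used here only to keep the example short.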
Development of a system for protein structure prediction, merging statistical and information analysis techniques with advanced tools from computer science (A.I.) as well as existing methods in molecular modelling.
In this collaborative effort, the specific role of this part of the project is the development of inference algorithms and the analysis of derived structural and sequence patterns using molecular graphics and computer simulations.
Fields of science
- natural sciences > biological sciences > biochemistry > biomolecules > proteins > protein folding
- natural sciences > chemical sciences > organic chemistry > amines
- natural sciences > computer and information sciences > databases > relational databases
- natural sciences > mathematics > applied mathematics > mathematical model
- natural sciences > biological sciences > molecular biology
Programme(s)
Topic(s)
Data not available
Call for proposal
Data not available
Funding Scheme
CSC - Cost-sharing contracts
Coordinator
1050 BRUSSELS
Belgium