Content archived on 2024-04-15

KNOWLEDGE-BASED MODELLING AND DESIGN ON NOVEL PROTEINS

Objective


There is strong evidence that proteins belong to a limited number of families, each with a common fold. The objective of this project is to define all those sequences that can assume a known protein fold.
Tables have been developed which predict the amino acid substitutions possible for a residue with a particular conformation, solvent accessibility and sidechain hydrogen bonding pattern. These have been used to develop profiles (tertiary templates, consensus sequences, key residues) for protein fragments selected on geometric grounds from the Brookhaven Databank (London).
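
As a rough illustration of how such tables might be compiled, the sketch below counts which residues are observed at aligned positions of homologous sequences, grouped by the structural environment of the reference residue. The environment labels, the toy alignment and the function name `substitution_frequencies` are assumptions made for illustration; they are not the project's actual tables or code.

```python
from collections import Counter, defaultdict


def substitution_frequencies(reference, environments, homologues):
    """Tabulate observed substitutions per (environment, reference residue).

    reference    -- sequence of the protein whose structure defines the environments
    environments -- one environment label per reference position (hypothetical labels)
    homologues   -- aligned sequences of equal length; '-' marks an alignment gap
    """
    counts = defaultdict(Counter)
    for position, (residue, env) in enumerate(zip(reference, environments)):
        for sequence in homologues:
            observed = sequence[position]
            if observed != "-":  # gaps carry no substitution information
                counts[(env, residue)][observed] += 1
    # Convert raw counts into relative frequencies for each table row.
    return {
        key: {aa: n / sum(counter.values()) for aa, n in counter.items()}
        for key, counter in counts.items()
    }


# Toy data: environment = (conformation, accessibility, sidechain H-bonding).
BURIED_STRAND = ("strand", "buried", "no_hbond")
EXPOSED_TURN = ("turn", "exposed", "no_hbond")

table = substitution_frequencies(
    reference="VG",
    environments=[BURIED_STRAND, EXPOSED_TURN],
    homologues=["IG", "VG", "LN", "V-"],
)
```

Each row of `table` then gives, for one residue type in one environment, the fraction of aligned homologues carrying each amino acid at that position.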

Loop selection procedures need to take into account not only the geometric fit of the fragment to the framework but also the amino acid sequence of the fragment. Templates were constructed from environmentally classified residue substitution patterns derived from comparative analyses of families of structurally aligned homologous proteins. Tests were then conducted to compare predicted patterns of residue substitution with observed sequence variability for key conserved residues, identified using interactive graphics, in two beta-hairpin families and in the L1 and L3 hypervariable regions of the immunoglobulins. A profile of the probability of residue acceptance is generated at each position in a loop fragment and scored against the sequence to be modelled. The mutation matrices have been applied in test searches for hairpin loop regions in human renin. Higher mean probability scores are found for selected fragments which are similar to those in renin.
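
The scoring step described above can be sketched as follows: each loop position carries a table of acceptance probabilities determined by its environment, and a candidate fragment is scored by the mean probability of the residues in the sequence to be modelled. The environment names, the probability values and the floor value `DEFAULT_PROBABILITY` are hypothetical placeholders, not figures from the project.

```python
from statistics import mean

# Hypothetical acceptance probabilities per environment class (illustrative only).
SUBSTITUTION_TABLE = {
    "buried_strand": {"V": 0.30, "I": 0.25, "L": 0.20, "F": 0.15, "G": 0.02},
    "exposed_turn": {"G": 0.35, "N": 0.20, "D": 0.20, "S": 0.15, "V": 0.02},
}
DEFAULT_PROBABILITY = 0.01  # assumed floor for residues missing from a table


def build_profile(environments):
    """Return one acceptance-probability table per loop position."""
    return [SUBSTITUTION_TABLE[env] for env in environments]


def mean_probability_score(profile, sequence):
    """Average the acceptance probability of each residue of the target sequence."""
    if len(profile) != len(sequence):
        raise ValueError("profile and sequence must have the same length")
    return mean(
        position.get(residue, DEFAULT_PROBABILITY)
        for position, residue in zip(profile, sequence)
    )


# A four-residue hairpin fragment described only by its per-position environments.
fragment_environments = ["buried_strand", "exposed_turn", "exposed_turn", "buried_strand"]
profile = build_profile(fragment_environments)

compatible_score = mean_probability_score(profile, "VGNI")
incompatible_score = mean_probability_score(profile, "GVVG")
```

Under these toy numbers the sequence whose residues suit the fragment's environments scores well above the one whose residues do not, which is the kind of ranking a fragment search would rely on.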

There is strong evidence that proteins belong to a limited number of families, each with a common fold. The possibility arises that the three-dimensional structures of proteins may be modelled if the sequence can be associated with a family of proteins, one or more of which has a known three-dimensional structure.

Protein models were constructed from protein fragments representing the conserved mainchain regions, the variable mainchain regions and the sidechains. The fragments were selected from the Brookhaven Databank from homologous proteins (for conserved regions), from all proteins (for variable regions) and using rules for sidechain substitution derived from aligned structures of families of homologous proteins. In order to assess errors, models were constructed of proteins of known three-dimensional structure; for example, each of the three known phospholipase A2 structures was modelled in turn from the other two. Furthermore, the model of human renin was constructed using pepsin and chymosin (for conserved regions) and all proteins in the databank (for the variable regions). The structures of both human and mouse renins have been determined by X-ray crystallography. The models and experimental structures have been retrospectively compared. The academic version of COMPOSER (freely available to academic laboratories) has been used to build the models and for testing new procedures.
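
The error-assessment scheme for the phospholipase A2 family amounts to a leave-one-out test: each known structure is held back as the target and modelled from the remaining ones. A minimal sketch of that bookkeeping is given below; the helper name `leave_one_out` is hypothetical, and the species labels are used only as identifiers, not as real structure files.

```python
def leave_one_out(structures):
    """Yield (target, templates) pairs, holding each structure out in turn."""
    for index, target in enumerate(structures):
        templates = structures[:index] + structures[index + 1:]
        yield target, templates


# Labels only; no coordinates are loaded in this sketch.
known_structures = ["porcine", "bovine", "rattlesnake"]
plan = list(leave_one_out(known_structures))

for target, templates in plan:
    print(f"model {target} from {', '.join(templates)}")
```

Each resulting model can then be compared with the held-out experimental structure to estimate the error of the procedure.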

Porcine phospholipase A2 was modelled on the bovine and rattlesnake enzyme structures. Compared to the crystallographic structure, the model has a root mean square deviation (RMSD) of 0.63 angstroms. Model building of the rattlesnake enzyme from the two mammalian structures yields a structure with an RMSD of 1.68 angstroms for the mainchain C-alphas. Errors in the positions of C-alpha atoms were shown to be highly correlated with the 14-angstrom reciprocal contact number for the C-alphas in the final model. Models have also been built of several plant cysteine proteinases (papaya proteinase omega, stem bromelain, chymopapain M (papaya proteinase IV) and a B-type chymopapain) from the known structures of papain and actinidin.
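
For reference, an RMSD over C-alpha atoms of the kind quoted above can be computed as in the sketch below. It assumes the model and the crystallographic structure have already been superposed and that their C-alpha atoms are listed in the same order; the coordinates shown are made-up values, and the function name `ca_rmsd` is hypothetical.

```python
import math


def ca_rmsd(model, reference):
    """Root mean square deviation between paired C-alpha coordinates (angstroms).

    Both arguments are sequences of (x, y, z) tuples for equivalent atoms,
    assumed to be in the same frame of reference (already superposed).
    """
    if not model or len(model) != len(reference):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(
        (mx - rx) ** 2 + (my - ry) ** 2 + (mz - rz) ** 2
        for (mx, my, mz), (rx, ry, rz) in zip(model, reference)
    )
    return math.sqrt(squared_sum / len(model))


# Made-up coordinates: every atom of the model is displaced by 1 angstrom along z.
model_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
crystal_ca = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

rmsd = ca_rmsd(model_ca, crystal_ca)
```

In practice the superposition step matters as much as the formula, since an RMSD computed on unaligned coordinates would overstate the modelling error.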

Human renin had been the subject of modelling on previous occasions. In all cases a single aspartic proteinase structure had been used for the conserved regions, and replacements, insertions and deletions were made using interactive graphics. A new model was constructed using the highly refined structures of pepsin and chymosin defined at Birkbeck.

Previous protein modelling procedures based on selection of protein fragments from a database have not involved a completely general procedure for recognising whether the sequence to be modelled is compatible with the structure selected. For certain specific cases, such as immunoglobulin variable regions, rules for key residues have been derived. Our procedure uses substitution tables to indicate key residues automatically.
The procedure was tested by modelling proteins that have known three-dimensional structures, in particular human and mouse renins, whose X-ray analyses were completed at Birkbeck in 1991. Analysis shows that the rule-based model of 1990 is considerably closer to the real structure than an interactive graphics model completed in 1984.

Topic(s)

Data not available

Call for proposals

Data not available

Funding scheme

CSC - Cost-sharing contracts

Coordinator

BIRKBECK COLLEGE
EU contribution
No data
Address
MALET STREET
WC1E 7HX LONDON
United Kingdom

Total cost
No data