The aim was to produce a coordinated and linked set of software modules integrating a number of existing as well as new procedures and protocols for recording, communicating and validating the models resulting from 3D structural studies on peptides, proteins, nucleic acids, and other molecules that act as ligands and co-factors. These models were to be based on the two important experimental techniques, protein crystallography and NMR, or be the result of theoretical modelling calculations. The packages were intended to provide an assessment of the intrinsic quality of the experimental data and of the agreement between such data and the derived atomic models.
MAJOR SCIENTIFIC BREAKTHROUGH:
This EC contract has brought together, in a complementary manner, several established European laboratories working in the field of macromolecular 3D coordinate provision and analysis. The insights provided into each other's practices, working approaches and problems have been invaluable. Advances have been made in key areas. Experimental data have been made available to atomic resolution for an increasing number of proteins. Techniques and protocols for refining models against these and other data have been improved. Statistical criteria for the validation of protein models and associated data have been considerably tightened. The methods conventionally developed for validation of X-ray models have been extended to NMR. The software developed has been made available to the community via central computer servers.
In addition to the release of software, the insights gained have been disseminated through workshops, conference presentations and publications. Furthermore, the Partners organised two workshops involving the international community on the new mmCIF definitions. These are vital for the storage and retrieval of data from more sophisticated macromolecular structural experiments. A third, general workshop covered advances in data quality and refinement and discussed the current limitations.
The project has culminated in a publication analysing the fit between the validation suites and the first eight atomic resolution structural models.
In summary, the results have met the initial objectives.
Partner 1 has demonstrated that it is possible to analyse crystal structures at atomic resolution (AR), from small, tightly packed proteins to a rapidly increasing set of larger, more typical ones. This has influenced a substantial number of other laboratories both within Europe and world-wide. The X-ray data and coordinates are now reaching the wider community. Automated techniques for refining structures, especially solvent regions, have been developed. With other Partners, the importance of good quality data at low as well as high resolution has been reinforced. Analysis of eight AR structures was performed with Partners 2, 3, 4 and 6. Recently, analysis of crambin with Martha Teeter led to the first observation of deformation (bonding electron) density in a protein.
Partner 2 concentrated on the general release of a new software package, REFMAC, for the refinement of models against X-ray data using maximum likelihood. REFMAC has been released to the community through CCP4 at Daresbury Laboratory and is widely used. It has already led to a substantial number of publications from members of the contract and others. It is especially powerful in combination with ARP, developed by Partner 1. The assessment of the information content of X-ray data has been addressed through the Diffraction Precision Indicator (DPI).
Partner 3 developed statistical analysis of coordinate sets in the PDB and AR sets provided by the network Partners. The PROCHECK program was enhanced, in particular to handle such features as multiple conformations. This Partner worked closely with Partner 5 to extend the PROCHECK software to NMR models.
Partner 4 first developed a new criterion for the validation of 3D structural models. This is based on the Voronoi analysis of packing volumes. Atoms in macromolecules should conform to the normal rules of close packing observed for small molecule structures. The program PROVE identifies atoms which deviate from these rules and warrant further inspection. Secondly, the program SF-CHECK was written to validate 3D models directly against experimental X-ray data. Finally, the question of how to generalise the ligand library for modelling and refinement is being addressed in collaboration with Partner 2.
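The idea behind the packing-volume criterion can be illustrated with a minimal sketch. It is not the PROVE implementation: it assumes that per-atom Voronoi volumes have already been computed elsewhere, and the names `AtomVolume` and `flag_packing_outliers`, the reference table and the threshold of 3.0 are all hypothetical choices made for illustration. Each atomic volume is compared with a reference mean and standard deviation for its atom type, and atoms whose deviation exceeds the threshold are reported for inspection.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class AtomVolume:
    """One atom with a precomputed Voronoi volume (hypothetical container)."""
    atom_id: str      # e.g. "A:12:CA"
    atom_type: str    # key into the reference table
    volume: float     # volume in cubic angstroms


def flag_packing_outliers(
    atoms: List[AtomVolume],
    reference: Dict[str, Tuple[float, float]],
    threshold: float = 3.0,  # assumed cut-off, not taken from the report
) -> List[Tuple[str, float]]:
    """Return (atom_id, z_score) for atoms whose volume deviates strongly.

    `reference` maps an atom type to (mean, standard deviation) of the
    volume expected for that type. Atoms with no reference entry, or with
    a non-positive standard deviation, are skipped rather than guessed at.
    """
    flagged = []
    for atom in atoms:
        entry = reference.get(atom.atom_type)
        if entry is None:
            continue
        mean, std = entry
        if std <= 0.0:
            continue
        z_score = (atom.volume - mean) / std
        if abs(z_score) > threshold:
            flagged.append((atom.atom_id, z_score))
    return flagged


if __name__ == "__main__":
    # Illustrative numbers only; these are not measured reference values.
    reference = {"CA": (13.0, 1.0)}
    atoms = [
        AtomVolume("A:1:CA", "CA", 13.4),
        AtomVolume("A:2:CA", "CA", 20.0),
    ]
    for atom_id, z in flag_packing_outliers(atoms, reference):
        print(f"{atom_id}: z = {z:.1f}")
```

In such a scheme the reference statistics would have to come from a trusted set of well-determined structures, and the threshold would be a matter of calibration rather than a fixed value.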
Partner 5 investigated the validation of experimental NMR data and derived models. The program AQUA directly assesses the validity of the models vs. the NMR data. In collaboration with Partner 3, PROCHECK was extended to handle ensembles of NMR models. This work benefited tremendously from the interactions between the Partners.
Partner 6 extended the development of the WHATIF validation software. A subset was extracted for distribution via the server developed by this Partner and installed at EMBL Heidelberg, the EBI and the PDB, which also supported the PROCHECK and PROVE packages from Partners 3 and 4. An extensive analysis of anomalies in the models currently in the PDB was performed. A large number of errors in conventions were identified, together with a subset of more serious problems.
Before this contract, some of the software packages described were already under development. The contract has forged strong links between the groups, leading to packages which are appreciably more powerful.
New insights have been provided into protein geometry. All Partners established either new data or novel software which is freely available to the community via the WWW or through servers at several key sites. Validation of X-ray structures and data has been extended to NMR models.
Partners 1 and 2 made the ARP and REFMAC code freely available via the CCP4 macromolecular crystallography suite based at Daresbury, UK. These two modules are highly complementary and reflect a synergy between the groups as a result of the contract. Similarly, Partners 3, 4 and 6 have established their software modules for coordinate validation via the server developed by Partner 6, at PDB, EMBL Heidelberg and the EBI. These modules now share elements of graphical output structure as a direct result of this contract, and provide complementary analyses of models. SF-CHECK from Partner 4 directly addresses the relation between data and model, and its development is continuing in collaboration with Partner 1. It is currently under beta release (test phase) and will then be established at CCP4.