Common access to immunogenetics data: the integrated immunogenetics database


The ImMunoGeneTics database, IMGT, is an integrated specialised database containing nucleotide sequence information of genes important in the function of the immune system. It collects and annotates sequences belonging to the immunoglobulin superfamily which are involved in immune recognition: the Immunoglobulins (Ig), the T cell antigen receptors (TcR) (LIGM-DB) and the Major Histocompatibility Complex (MHC). In humans, the latter is referred to as the Human Leucocyte Antigens (HLA) system (HLA-DB). Collection and analysis of such sequences is important for the understanding of disorders of the immune system such as infectious diseases, autoimmunity, allergy or tumour development, and for the development of antibody engineering, transplantation and immunotherapy.
IMGT works in close collaboration with the EMBL database and uses sequence information from the EMBL database for expert annotation. Because the EMBL database is a generalist database that collects sequences of all types from all species, it has insufficient resources to add specialised information. In IMGT, however, such information is added in a consistent and standardised way by expert annotators. This annotation procedure includes verification of the sequence data by homology studies using multiple sequence alignment tools, and also labelling of the sequence, which cannot be carried out at the generalist databanks. Updating the information held at the generalist databases from the specialist databases will improve its quality.
This project requests support for the European partners involved in the expansion phase of the IMGT database. LIGM (France): project coordinator, responsible for the IMGT database design and structure, software development for sequence annotation and multiple alignments, annotation of TcR and Ig sequences, and integration of genetic mapping data in IMGT. EMBL-EBI (UK): IMGT database integration, interfacing with the EMBL database, database development and distribution, network interfacing and protein-specific annotations for IMGT-TREMBL. ICRF (UK) and BPRC (The Netherlands): collection and annotation of HLA sequences and other MHC-encoded genes, generation of alignment tables, and database development. IFG (Germany): software development for multiple sequence alignments and fast sequence searching, development of alignment tables, annotation of Ig sequences. EUROGENTEC (Belgium): constitution of an oligonucleotide primer database, application tools for fundamental, pharmaceutical and clinical purposes.
The aims at the end of the 3-year project are: to set up efficient IMGT sequence submission tools at EMBL-EBI, so that authors are able to provide annotations themselves; to establish a permanent structure at LIGM and EMBL-EBI for quality control, data distribution and further IMGT development; to establish freely available common access to all immunogenetics data; and to provide graphical, user-friendly data access. Europe has taken a lead in the development of an integrated database which is unique in the world. The data distribution is at the forefront of technology. By managing the enormous complexity of the immunogenetics data, IMGT will be of immense value for Biology, Biotechnology and Medicine as well as for the European Pharmaceutical Industry, and will considerably help to strengthen Europe's role in these areas.

