Understanding how the human genome works: major international project publishes results
An international team of researchers has revealed that the human genome likely contains more than twice as many genes as previously thought, many of which might play a role in human disease.
Following a decade-long, in-depth study of all available data on gene activity, researchers from the GENCODE consortium, who hail from Spain, Switzerland, the United Kingdom and the United States, have published 30 research papers. The GENCODE consortium is part of the ENCODE ('Encyclopedia of DNA Elements') project, which was launched following the completion of the Human Genome Project in 2003.
The papers have been published as part of a special multi-publisher collaboration between the journals BioMed Central, Nature, and Genome Research.
Although the Human Genome Project was an incredibly successful enterprise, part of the puzzle remained unsolved. The aim of ENCODE was to identify and comprehensively describe all the functional, active regions of the human genome sequence. ENCODE shifted the research focus from generating sequences to annotating the functional elements hidden within the human genome's 3.2 billion As, Cs, Gs and Ts.
These newly published studies describe more than 10 000 novel genes; they identify genes that have 'died' and others that are being 'resurrected'.
Dr Jennifer Harrow, GENCODE principal investigator from the Wellcome Trust Sanger Institute in the United Kingdom, comments: 'We have uncovered a staggering array of genes in our genome, simply because we can examine many genomes in ... detail that was not possible a decade ago. As sequencing technology improves ... we have much more data to explore.
'But our work remains a skilled effort to annotate correctly our human genome - or, more precisely, our human genomes, for each of us differ. These vast texts of genetic information will not give up their secrets easily. GENCODE has made amazing strides to enable immediate access of its reference gene set by other researchers.'
Among the findings are non-coding genes, which do not contain the genetic code to make proteins, and a graveyard of supposedly 'dead' genes, the pseudogenes, some of which are being resurrected. The researchers mapped and described 9 277 long non-coding genes; these are a relatively new type of gene that acts not by producing a protein but directly through its RNA transcript.
Long non-coding RNAs derived from these genes can play a significant part in human biology and disease, but they remain poorly understood.
The new map of such genetic components gives scientists more avenues to explore in their quest to understand human biology and human disease. Many of the researchers believe that their job is not complete, and that there may be another 10 000 of these genes yet to be uncovered.
Professor Roderic Guigo, GENCODE principal investigator from the Centre for Genomic Regulation, Barcelona, Spain, comments: 'Our initial work from the Human Genome Project suggested there were around 20 000 protein-coding genes and that value has not changed greatly. However, GENCODE has shown that long non-coding RNAs are far more numerous and important than previously thought.
'The limited knowledge we have of the class of long non-coding RNAs suggests they might play a major role in regulating the activity of other genes. If this is generally true of this group, we have much more to explore than we imagined.'
The researchers will be updating the GENCODE human reference set every three months to ensure that models are continually refined and assessed based on new experimental data deposited in the public databases.
Data Source Provider: Wellcome Trust Sanger Institute
Document Reference: Based on information from the Wellcome Trust Sanger Institute