CORDIS - EU research results

European Network for Genetic and Genomic Epidemiology

Final Report Summary - ENGAGE (European network for genetic and genomic epidemiology)

Executive Summary:

The past decade was a remarkable era for human genetics and genomics. After the completion of the Human Genome Project, tremendous efforts have been invested to translate the new information provided by the human genome into knowledge that can benefit human health. With the rapid development of technology, human genetics researchers are able to characterise the mechanisms underlying many diseases in a depth that was not previously possible. One feature of this recent scientific explosion has been the international collaboration fostered by the realisation that the power of individual studies is limited.

The European Network for Genetic and Genomic Epidemiology (ENGAGE) aimed to translate the wealth of data emerging from large-scale research in genetic and genomic epidemiology from European (and other) population cohorts into information relevant to future clinical applications. The concept of ENGAGE was to enable European researchers to identify large numbers of novel susceptibility genes that influence metabolic, behavioural and cardiovascular traits and to study the interactions between genes and lifestyle factors.

Collectively, members of the ENGAGE consortium have access to an extensive range of well phenotyped and catalogued European population cohorts representing more than 600 000 subjects. In addition to clinical phenotypes, many of these subjects have provided biosamples that can be used for genetic and other genomic studies; over 100 000 of these subjects have had genome wide association (GWA) analyses completed.

The collaboration fostered through the award established an international framework that has allowed us to:

1. bring together expertise to develop research and analytical strategies;
2. develop new computational tools to support data sharing and the harmonisation of cohort phenotypes;
3. integrate and analyse large-scale datasets with robust statistical tools;
4. uncover the effects of the complex interactions of genes, environment and lifestyle factors on disease risk;
5. support and explore potential in translating early findings to clinical applications.

Through inter-consortium initiatives and through collaboration with international consortia, ENGAGE partners have contributed to many of the largest genetic studies yet performed. These have used GWA approaches applied to many hundreds of thousands of individuals and have identified many hundreds of genetic loci influencing dozens of medically-significant traits, ranging from type 2 diabetes (T2D) and obesity, to smoking behaviour and birth weight. These genetic discoveries have provided crucial information to accelerate definition of the biological mechanisms through which disease comes about; in turn this will further catalyse early steps towards better ways of treating, monitoring, predicting and preventing disease.

ENGAGE has made best efforts to document and disseminate such experience and provide recommendations for other consortia and funding decision bodies. ENGAGE partners have published over 250 scientific manuscripts (many of them with open access and in high impact journals), making the data and knowledge generated accessible to the scientific community. This has helped to maintain the competitiveness of European research excellence in the field of human genetic and genomic epidemiological research. The experience and knowledge obtained during the course of ENGAGE have been hugely beneficial for the field in general and have also contributed to the training and development of a cadre of junior researchers with the skills and scientific temperament to support future projects of this kind.

Project Context and Objectives:

Project Context

The past decade was a uniquely exciting time in human genetics as rapid advances in genomic technology enabled deeper characterisation of the mechanisms underlying many human diseases. One feature of this rapid scientific development has been the international collaboration fostered by realisation that the power of individual studies is limited. To identify the full range of genetic variation contributing to common disease and to uncover the effects of the complex interactions of genes, environment and lifestyle factors on disease risk and thereby support translational advances, a more inclusive approach, based on epidemiological principles, was required.

Collectively, members of the ENGAGE consortium have access to an extensive range of well phenotyped and catalogued population cohorts representing more than 600 000 subjects. GWA data are available for more than 100 000 of these subjects and an early goal of the ENGAGE project was to bring together these datasets to perform large scale integrated genetic association analyses. Adopting this approach has allowed the consortium to rapidly identify novel disease-susceptibility variants undetectable in individual studies. A key ENGAGE objective was to evaluate the clinical and public health relevance of the novel disease and trait-susceptibility genes identified and to demonstrate that these findings can be used as diagnostic indicators for common diseases, helping us to better understand risk factors, disease progression and why people differ in responses to treatment.

ENGAGE activities were organised through 10 work packages (WPs). WP1, WP2 and WP3 encompassed initial discovery studies using available GWA data and new genotypic and phenotypic data. WP5 and WP6 explored key biological questions through refinement and epidemiological approaches. WP7 attempted to translate key findings to the clinical arena. WP4, WP8, WP9 and WP10 provided a comprehensive information technologies (IT)/bioinformatics, ethical, training and management framework to facilitate these activities.

List of ENGAGE work packages and leaders:

1. WP1: Genome Wide Data Integration (Mark McCarthy)
2. WP2: Novel sources of Genome-wide Variation (Xavier Estivill)
3. WP3: Novel Phenotypes (Gert-Jan van Ommen)
4. WP4: Informatics and Bioinformatics (Alvis Brazma)
5. WP5: Genetic Refinement of Identified Loci (Kari Stefansson/Unnur Thorsteinsdottir)
6. WP6: Epidemiology and Joint Effects (Nancy Pedersen)
7. WP7: Clinical Translation (Leif Groop)
8. WP8: Societal Aspects (Jennifer Harris)
9. WP9: Training and Dissemination (Jaakko Kaprio)
10. WP10: Coordination (Leena Peltonen/Mark McCarthy)

Overall Objectives:

The overall objectives of ENGAGE are:

1. to develop an enhanced supranational framework for research into genetic and genomic epidemiology that assembles the best researchers, the best sample and data sets in areas of primary focus (cardiovascular, metabolic, behavioural), the best ethical guidance and the best analytical and translational platforms;
2. to accelerate discovery of disease-susceptibility genes through integrated analyses using multiple large-scale data sets and a range of experimental designs, thereby identifying novel aetiological pathways (with potential for pharmaceutical exploitation) and novel susceptibility variants and biomarkers (with potential as diagnostics as well as in guiding therapy development);
3. to translate these findings into the clinical arena;
4. to explore key methodological questions relevant to European research in this area (including for example, the consequences of ethnic and environmental heterogeneity for gene discovery efforts and the allelic architecture of common disease);
5. to develop novel technological and statistical approaches for the study of human disease and to disseminate research outputs to both scientific and non-specialist audiences;
6. to contribute to international efforts in large population cohorts as exemplified by our very close contacts with the P3G effort (Public Population Projects in Genomics).

ENGAGE Partners: ENGAGE has brought together 24 leading research organisations and two biotechnology and pharmaceutical companies across Europe and in Canada and Australia. The project was led by the world-leading human geneticist Prof. Leena Peltonen from FIMM, University of Helsinki for the first 26 months, until March 2010 when she sadly passed away after a long illness. The project co-coordinator, Prof. Mark McCarthy from the University of Oxford, assumed the leadership as Scientific Coordinator in March 2010.

Project Results:

Main scientific and technical (S&T) Results/Foregrounds

Through its high level of synergy and integration ENGAGE has successfully achieved its primary aim to translate the wealth of data emerging from large-scale genetic and epidemiological studies from well-characterised population cohorts into information that is relevant to future advances in clinical practice.

ENGAGE has extended its joint genetic analyses to encompass additional sources of genome variation as methods improve for the large-scale collection and analysis of these data types (copy number variation, rare variants etc.) and to additional phenotypes as such datasets become available from ENGAGE partners. ENGAGE has also explored key methodological questions relevant to European research in genetic and genomic epidemiology and developed novel statistical approaches for data analysis.

During the lifespan of ENGAGE, its partners have actively performed research and efficiently made their research outputs available to the scientific community, for example through publications in scientific journals (over 250 peer-reviewed scientific manuscripts published, many of them with open access and in high impact journals) and through presentations at conferences. This has contributed to promoting visibility and maintaining the competitiveness of European research excellence in the field of human genetic and genomic epidemiological research. The experience and knowledge obtained during the course of ENGAGE have been hugely beneficial for the field in general and have also contributed to the training and development of a cadre of junior researchers with the skills and scientific temperament to support future projects of this kind. ENGAGE has made best efforts to document and disseminate such experience and provide recommendations for other consortia and funding decision bodies.

Develop an enhanced supranational framework for research into genetic and genomic epidemiology that assembles the best researchers, the best sample and data sets in areas of primary focus (cardiovascular, metabolic, behavioural), the best ethical guidance and the best analytical and translational platforms.

ENGAGE has established a dynamic framework across country borders that assembled world-renowned expertise from the related fields and a constellation of genetic, genomic and phenotypic data and samples from an extensive range of populations. Key to the success of ENGAGE in risk marker identification and clinical translation are the ENGAGE objectives for data sharing and harmonisation. ENGAGE has developed new computational approaches supporting data sharing and the harmonisation of cohort phenotypes whilst establishing protocols for managing the ethical aspects of sample and data sharing according to informed consent, local ethical approval and the governance structures of each ENGAGE partner. A set of flagship projects was launched to enhance synergy and collaboration across related core activities. ENGAGE also recognised the importance of next generation experts for this rapidly developing field of genetic research. Through a productive ENGAGE Training Policy and Exchange and Mobility Programme, ENGAGE junior scientists were offered excellent opportunities to visit partner institutions or institutions outside ENGAGE to enhance their skills and knowledge using results and developments emanating from the ENGAGE project and beyond.

ENGAGE samples and data

ENGAGE partners have access to up to 600 000 subjects in cohorts of various designs (including healthy, disease-based, twins/family-based, population isolates, prospective/cross-sectional cohorts) and at least 100 000 of these subjects have GWA data. To fully capitalise on the resources available from these cohorts and foster epidemiological studies (e.g. GxG, GxE interactions), an extensive epidemiology survey was conducted within the scope of the 'Large-Scale Epidemiology Flagship Project'. In total, 38 ENGAGE cohorts representing over 250 000 subjects were included in this study and 150 000 samples were genotyped on a selected set of SNPs of primary research interest to ENGAGE.

Furthermore, also motivated largely through the flagship projects, new large-scale data sets have been generated in the interests of improving power and extending applicability to new research approaches. For example, metabolomic profiling covering 163 metabolic traits in more than 7 000 samples from seven ENGAGE cohorts, produced through the 'Metabolomics Flagship Project', aims to identify new associations for common variants with weaker effect sizes and for rarer variants with moderate effect sizes. Telomere length (TL), a heritable trait associated with several diseases (e.g. CAD, cancer) and disease risk factors (e.g. oxidative stress, inflammation), has been measured for over 38 000 samples. The TL data together with GWA data were meta-analysed to identify genetic loci that affect telomere length and to assess the relationship of TL with age-related diseases and other phenotypes. Genome-wide deoxyribonucleic acid (DNA) methylation data have been generated for highly informative trait-discordant monozygotic (MZ) twin pairs.

Outreach to other international consortia

ENGAGE as a project and partners therein have deeply networked with several projects relevant to the area of study, as can be seen in the project deliverables.

Database and computational infrastructure for data sharing

To support large scale integrated analyses within ENGAGE, during the initial phase of the project, the GWAS integration and database teams worked to identify the data submission and exchange requirements. The data submission system developed (SIMBioMS) enabled ENGAGE partners to share standardised data sets, in line with the data access policy and consent oversight established by WP8. SIMBioMS also facilitates data export to public data archives (e.g. EGA, ArrayExpress, PRIDE). Phenotype data for ENGAGE cohorts have been generated using a wide range of cohort-specific questionnaires, clinical protocols and technology platforms. A strategic collaboration between ENGAGE and the P3G Consortium has supported data harmonisation through mapping cohort-specific parameters onto controlled vocabularies: a web-based repository for these data (SAIL) developed by the database team has been used in ENGAGE and partner projects (e.g. Summit). These data harmonisation efforts are closely integrated with ongoing ESFRI activities (BBMRI, ELIXIR) to ensure compatibility with European and global initiatives in this area. ENGAGE has further made efforts to map and document its data sharing experiences, which can benefit other large-scale consortia as well as funding agencies when considering data sharing principles (manuscript submitted).
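The idea of mapping cohort-specific parameters onto a controlled vocabulary can be sketched as follows. This is a minimal illustration only: the cohort names, variable names and vocabulary terms are hypothetical and are not taken from SAIL or SIMBioMS.

```python
from typing import Any, Dict

# Hypothetical mapping from cohort-specific variable names to shared
# controlled-vocabulary terms (all names invented for illustration).
COHORT_MAPPINGS: Dict[str, Dict[str, str]] = {
    "cohort_a": {"bmi_kgm2": "body_mass_index", "chol_tot": "total_cholesterol"},
    "cohort_b": {"BMI": "body_mass_index", "TC_mmol": "total_cholesterol"},
}


def harmonise(cohort: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename the variables of one record to controlled-vocabulary terms.

    Variables without a mapping are dropped rather than guessed.
    """
    mapping = COHORT_MAPPINGS[cohort]
    return {mapping[name]: value for name, value in record.items() if name in mapping}


record_a = harmonise("cohort_a", {"bmi_kgm2": 24.5, "sex": "F"})
record_b = harmonise("cohort_b", {"BMI": 27.1, "TC_mmol": 5.2})
```

Once records share the same keys, they can be pooled or compared across cohorts without cohort-specific handling.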

Ethical Framework

The ENGAGE overarching aim has important social implications which have been well addressed within the context of the project. The ENGAGE ethical, legal and social implications (ELSI) team performed conceptual analyses on areas that were relevant to ENGAGE, for example, on convergence and divergence of ELSI when translating between population-based and clinical-based biobank studies; and on socio-ethical and legal approaches to policy issues in genome research.

ENGAGE has developed ELSI tools to promote inter-operability of large-scale projects in translational molecular epidemiology. More specifically, a policy paper was published to report ENGAGE experience on the mechanisms allowing the retrospective use of data and human biological samples. A confidentiality template specific to ENGAGE research as well as a consent template to bridge consent between clinical and non-clinical research environments were formulated and published (i.e. through 'ENGAGE Principles for Data Sharing, Data Release and Intellectual Property' and a research paper 'Bridging Consent: From Toll Bridges to Lift Bridges?' published in BMC Medical Genomics 2011, respectively). In collaboration with the Public Population Project in Genomics (P3G) and the Centre for Health, Law and Emerging Technologies (HeLEX), an international Data-Sharing Code of Conduct for Genomic Research was developed and published. Recent work has focused on analysing current recommendations for genotype-driven recruitment research design as genotypic information becomes more commonly available in clinical care and for individuals in direct-to-consumer testing. The strategies to address the key challenges identified have been reported by the ENGAGE ELSI team.

Training and mobility programme

It is imperative that European research groups obtain 'state of the art' training in modern genetic epidemiology and related fields relevant to population genomics and emerging 'omics'. To secure future expertise in this rapidly developing field, ENGAGE established a 'Training Policy' and launched an 'Exchange and Mobility Programme' to offer young pre- and postdoctoral researchers training opportunities in areas such as interdisciplinary genetics and statistical/epidemiological methods. The programme also consisted of inter-institute exchanges, thematic workshops and hands-on courses at the participating partner centres. Through the training programme, ENGAGE has supported 22 exchange and mobility visits and organised or co-organised at least 15 training workshops/courses/meetings. These workshops have been very well received, while the exchanges have been essential to promoting research collaborations between partners.

Accelerate discovery of disease-susceptibility genes through integrated analyses using multiple large-scale data sets

ENGAGE has played a leading role (sometimes alone, often as part of wider consortia) in GWA meta-analyses which have identified many hundreds of genetic loci influencing dozens of medically-significant traits, ranging from type 2 diabetes (T2D) and obesity, to smoking behaviour and birthweight. These discoveries have often provided vital clues to the mechanisms influencing these phenotypes, catalysing early steps towards novel therapeutic and preventative options. The maturity and low experimental cost of GWA arrays have meant that these datasets were the first to be widely available across ENGAGE cohorts. The consortium moved effectively to synthesise such data and has been using similar approaches to mine additional sources of genomic variation (rare variants and copy number variants for example) as well as novel molecular 'omic' phenotypes (such as transcriptomic, epigenomic and metabolomic data). Several of these efforts have been highlighted as 'flagship' projects and new data are being generated.
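As an illustration of how per-cohort association results for a single variant can be combined, the sketch below uses inverse-variance weighted fixed-effect meta-analysis. This is one widely used approach; the report does not state which model each ENGAGE analysis used, so the choice of model and the example numbers are assumptions.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Combine per-cohort effect estimates for one variant.

    Each cohort is weighted by the inverse of its squared standard error,
    so more precise cohorts contribute more to the pooled estimate.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Illustrative values only: two cohorts with equal precision.
pooled_beta, pooled_se = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
```

The pooled standard error is smaller than that of either cohort alone, which is why combining cohorts can reveal variants that are undetectable in individual studies.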

Translate ENGAGE findings into the clinical arena

A key long-term objective of ENGAGE has been to move ENGAGE findings towards clinical translation. These efforts take several different forms including disease stratification, identification of genetic and non-genetic biomarkers, improved prognostication (e.g. diabetes complication risk) and pharmacogenetics.

Through strategies combining discovery, validation and first attempts at translation, the success in identifying novel genetic variants that increase risk of disease has generated important molecular and biological insights, leading to better understanding of disease classification and pathogenesis.

For example, ENGAGE partners reported a comprehensive description of the mechanisms by which T2D-associated gene variants including TCF7L2, MTNR1B, FTO and KCNQ1 cause altered glucose or lipid metabolism. In collaboration with other consortia through large-scale association studies, additional T2D susceptibility genetic variants have been identified. These findings have expanded the understanding of the genetic architecture and molecular basis of T2D, leading to potential risk stratification of T2D that can be used in future clinical practice. Similar efforts involving ENGAGE partners have also been made in other disease areas including cardiovascular diseases, lipids, metabolic syndromes and the complications of diabetes.

By means of a genome-wide scan with a robust statistical design, the ENGAGE lipids team (Lipidaction) evaluated the interactions between genes and epidemiological risk factors including lifestyle (e.g. smoking, alcohol consumption) and body composition, e.g. body mass index (BMI) and waist-to-hip ratio (WHR), for circulating serum lipid levels. The study identified a new genetic locus on chromosome 4p15 modifying the effect of WHR on total cholesterol (TC) (Surakka et al 2011). Although additional studies are required to further establish the interaction mechanism, this study might imply potential use for establishing targeted intervention strategies for people characterised by specific genomic profiles.

Within the ENGAGE Large-Scale Epidemiology Flagship Project, a Mendelian randomisation approach was applied to the FTO locus variant: the flagship team confirmed the causal relationship between obesity and multiple cardio-metabolic traits using the FTO variant as an instrumental variable. In particular, the study has provided novel insights into the causal effect of obesity on heart failure and increased liver enzyme levels. These findings are relevant to public health in relation to the global prevention efforts for obesity and hence diseases associated with obesity such as T2D and heart disease (Fall et al, PLoS Med in revision).
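The logic of using a single variant as an instrumental variable can be sketched with the Wald ratio estimator, in which the causal effect of the exposure on the outcome is estimated as the variant-outcome association divided by the variant-exposure association. This is a minimal sketch: the report does not specify the estimator used in the flagship study, and the numbers below are purely illustrative, not study results.

```python
from typing import Tuple


def wald_ratio(beta_gx: float, beta_gy: float, se_gy: float) -> Tuple[float, float]:
    """Single-instrument Mendelian randomisation estimate.

    beta_gx: effect of the variant on the exposure (e.g. BMI per allele)
    beta_gy: effect of the variant on the outcome (per allele)
    se_gy:   standard error of beta_gy

    Returns the causal estimate and a first-order standard error that
    ignores uncertainty in beta_gx (a common simplification).
    """
    if beta_gx == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_gy / beta_gx
    std_error = se_gy / abs(beta_gx)
    return estimate, std_error


# Illustrative numbers only.
causal_estimate, causal_se = wald_ratio(beta_gx=0.4, beta_gy=0.1, se_gy=0.02)
```

The estimate is interpretable as a causal effect only if the variant affects the outcome solely through the exposure, which is the key assumption of the method.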

ENGAGE has made progress towards applications in clinical diagnostics. The Oxford team has successfully identified C-reactive protein (CRP) as a biomarker for MODY (maturity-onset diabetes of the young), a monogenic form of diabetes different from both type 1 diabetes (T1D) and T2D. Using samples from seven European populations, the Oxford team further validated high-sensitivity CRP (hsCRP) as a clinical biomarker for the diagnosis of the diabetes subtype HNF1A-MODY. It is expected that hsCRP can be rapidly applied in clinical practice to improve diagnosis of this diabetes subtype, considering its relatively low cost and wide availability (Owen et al, 2010; Thanabalasingham et al, 2011). In more recent work the same team has identified plasma glycan profiles as an additional biomarker of HNF1A-MODY.

Several ENGAGE studies also contributed knowledge towards the potential development of clinical diagnostics for prediction and prognostication. These efforts include identifying biomarkers to predict diabetic complications (through collaboration with IMI-Summit); testing 46 T2D-related SNPs for their ability to predict future T2D (through a joint meta-analysis); and evaluating a set of biomarkers (including troponin I, adrenomedullin and vitamin D) for their ability to predict cardiovascular outcomes (through the MORGAM biomarker project). In addition, by applying a two-stage population risk screening strategy using prospective population cohorts, the Helsinki team has evaluated genetic risk discrimination and reclassification for coronary heart disease (CHD). The findings showed that a genetic risk score based upon 28 previously reported risk variants for CHD improves risk prediction of CHD and thereby may be used to help identify individuals at high risk for the first CHD event (Tikkanen et al, ATVB accepted).
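A genetic risk score of this kind is commonly computed as a weighted sum of risk-allele counts, with each variant weighted by the natural logarithm of its reported odds ratio. The sketch below assumes that weighting scheme, which the report does not confirm for the CHD study; the allele counts and odds ratios are invented for illustration.

```python
import math
from typing import Sequence


def genetic_risk_score(allele_counts: Sequence[int], odds_ratios: Sequence[float]) -> float:
    """Weighted genetic risk score for one individual.

    allele_counts: number of risk alleles carried at each variant (0, 1 or 2)
    odds_ratios:   per-allele odds ratio reported for each variant
    """
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per variant")
    return sum(count * math.log(odds) for count, odds in zip(allele_counts, odds_ratios))


# Illustrative values only: three variants instead of 28.
score = genetic_risk_score([2, 0, 1], [1.0, 1.5, math.e])
```

Individuals can then be ranked by score, and the score can be added to conventional risk factors to test whether discrimination and reclassification improve.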

ENGAGE partners and collaborators have demonstrated the strength of combining genomics and metabolomics information in a global evaluation of genetic variance in human metabolism. The study successfully identified 37 genetic loci associated with blood metabolites, with the majority of the loci showing exceptionally large effect sizes (Suhre et al, 2011). Through association analyses of known disease-risk loci with these metabolites, the findings have provided new insights into biological pathways, offering potential possibilities for diagnosis and treatment. For example, with relevance to diabetes, the study identified a strong association between a variant in the glucokinase regulatory protein (GCKR) gene and the mannose:glucose metabolite ratio; the association of GCKR with mannose was stronger than its association with glucose. This may pave the way for further studies on the role of mannose as a potential biomarker for abnormal glucose metabolism.

This study also contributed new information to pharmacogenomics. Six out of the 37 genetic loci identified in this study were previously known to be associated with drug toxicity or adverse reactions. The associations between metabolites and these loci established in this study may provide new biochemical information that can be useful for drug design to avoid adverse reactions.

Although some ENGAGE findings have been initially translated into the clinical arena, it is expected that it will take some years for the full clinical impact of ENGAGE discoveries to reach fruition. We are interacting closely with related European Union (EU) efforts (for example Summit, Direct, CEED3, BiomarCaRE) to support programmes that will outlive ENGAGE.

Explore key methodological questions relevant to European research in this area and develop novel technological and statistical approaches for the study of human disease

The exploration of key methodological questions and further development of new technological and statistical approaches relevant to genetic and genomic epidemiology has been an intrinsic component of many ENGAGE key initiatives, including the GWAS integration analyses, genetic as well as phenotypic refinement activities and flagship projects.

Notably, many of our GWA meta-analyses have involved developing strategies for dealing with structure, quality control, cryptic relatedness, heterogeneity etc. These efforts include (but are not limited to):

1. Imputation based on reference data sets (e.g. HapMap and 1000 Genomes) to support combination of data generated on different platforms and to extend analysis to other non-typed variants, using programs such as IMPUTE and MACH;
2. Data quality control including correction for stratification using genomic control and principal component analysis (PCA) based methods;
3. Strategies for harmonisation of diverse phenotypic transformations and analytical protocols within individual genome-wide scans;
4. Strategies for efficient replication of findings and for dealing with heterogeneity.
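Of the quality-control steps listed above, genomic control is simple enough to sketch. The inflation factor is estimated as the median of the observed 1-degree-of-freedom chi-square association statistics divided by the expected median under the null (approximately 0.4549), and the statistics are then divided by that factor when it exceeds 1. This is a generic illustration of the method with invented input values, not the ENGAGE pipeline.

```python
from statistics import median
from typing import List, Sequence

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats: Sequence[float]) -> float:
    """Estimate the genomic inflation factor (lambda) from test statistics."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def apply_genomic_control(chi2_stats: Sequence[float]) -> List[float]:
    """Deflate test statistics by lambda; never inflate when lambda < 1."""
    inflation = max(genomic_inflation(chi2_stats), 1.0)
    return [stat / inflation for stat in chi2_stats]


# Illustrative statistics chosen so that lambda is close to 2.
observed = [0.2, 0.9098728, 3.0]
inflation_factor = genomic_inflation(observed)
corrected = apply_genomic_control(observed)
```

In practice the factor is estimated from the genome-wide set of variants in each study before results are combined across cohorts.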

Through development of the SIMBioMS platform, ENGAGE also developed information technology (IT) tools to facilitate efficient integration of summary GWAS data (with potential to extend to other data types) and associated meta-data.

To share the established methodological expertise and to ensure that the integration analyses as well as data organisation were conducted in a standardised manner within the ENGAGE consortium, a two-day training course for pre- and post-doctoral researchers was organised in February 2010 to provide hands-on training covering a variety of computational methods for GWA integration studies (including the key methodological topics listed above). In addition, a separate statistical workshop was held to tackle existing and emerging statistical modelling challenges of genome-wide studies with special focus on analysis of rare variants and interactions.

Other key methodological exploration work within ENGAGE is exemplified by the following list:

1. Strategies to identify low frequency and rare variants, e.g. for finding missing heritability;
2. Development of imputation and phasing tools allowing for analysis of millions of SNPs not directly genotyped, for identification of new signals and refinement of signals;
3. Strategies to evaluate biology, joint effects and pleiotropy of discovered variants in large datasets (e.g. through Lipidaction, the Large-Scale Epidemiology Flagship project as well as through collaborations with other consortia);
4. Exploration and development of the potential of next generation sequencing (NGS) technologies for transcriptome analyses in human cohort studies and genome-wide expression quantitative trait loci (eQTL) analysis (through the ENGAGE DeepSAGE Transcriptomics Flagship project);
5. Development of algorithms for copy number variation (CNV) detection on NGS data, for both whole-genome and whole-exome based datasets;
6. Strategies specific for CNV analysis combining samples across different cohorts to search for CNV and metabolic associations;
7. Strategies for analysis of MeCap sequence data as well as protocols for genome-wide methylation analysis (through the ENGAGE Epigenomics Flagship project);
8. Development of new methods for data mining and for identification of candidate genes and disease relevance information including pathways, by exploring available internal and external genetic and genomic data as well as information from the literature;
9. Strategies for the analysis of longitudinal and multivariate phenotypes;
10. Exploration and development of statistical methods to combine genomics and other 'omics' (e.g. metabolomics) datasets for association studies (e.g. through the Metabolomics Flagship project).

Contribute to international efforts in large population cohorts as exemplified by our very close contacts with the P3G effort (Public Population Projects in Genomics)

In addition to contributing data and analytical expertise to international trait-specific GWAS consortia through active collaboration, ENGAGE has been well connected with other major European and international projects relevant to large population cohorts. These include the EU-funded projects 'Promoting Harmonisation of Epidemiological Biobanks in Europe' (PHOEBE), 'Harmonisation of phenotyping and Biosampling for Human Large-Scale Research Biobanks' (BioSHaRE-EU), 'Genotype to Phenotype Databases: a Holistic Approach' (GEN2PHEN), the 'Biobanking and Biomolecular Resources Research Infrastructure' (BBMRI), the European Life Sciences Infrastructure for Biological Information (Elixir) and 'BBMRI Large Prospective Cohorts' (BBMRI-LPC), as well as the Public Population Projects in Genomics (P3G) in Canada.

In particular, ENGAGE and P3G have several joint-force activities such as:

1. Tool building to promote interoperability of large-scale international initiatives in the field, such as codes of conduct for data sharing, bridging consent between clinical and non-clinical environments, recommendations for recruitment by genotype etc.
2. Establishment of ENGAGE Study and Access Catalogue in P3G Observatory (see
3. Organisation of two Summer Institutes, the first International Biobank Summit
4. Organisation of a joint workshop to explore the potential use of Datashaper, a harmonisation tool developed by P3G, in phenotypes relevant to metabolic syndrome.

As a result of the ENGAGE - P3G joint workshop, a strategic collaboration towards a more streamlined and standardised procedure for harmonisation across cohorts was established. One of the key outputs from these harmonisation efforts was the IT platform for phenotype harmonisation 'Sample availability system' (SAIL) developed by our integration and database teams. SAIL has been used in ENGAGE and other related EU projects (e.g. Summit).

Potential Impact:

Europe has a strong track record as a pioneer in the field of human genetic and genomic epidemiology and remains the international leader in many respects. As the technologies rapidly advance and become more affordable, unified health care systems together with rich data and sample collections from populations as well as world-renowned molecular epidemiologists reflect unique opportunities for Europe to accelerate the translation of basic research findings into clinical applications. ENGAGE has used the special competitive niche of Europe to translate the wealth of data emerging from large-scale research efforts in genetic and genomic epidemiology conducted in well-characterised European (and other) samples into information of relevance to future clinical advance.

ENGAGE has played a leading role in the integration of genetic data from diverse European data sets and has catalysed wider global efforts for several major traits. This leadership is already manifested in the publications arising from the project and in the high international profile of the consortium. This research has mostly focused on GWA data because of its availability and relative ease of integration, and has been supported by considerable 'behind-the-scenes' activity with respect to informatics, data access, trait harmonisation, statistical methodologies and ethical compliance. ENGAGE has continued using this infrastructure to power further rounds of discovery beyond ENGAGE that encompass a wider range of medical phenotypes (including a suite of behavioural and psychiatric traits), genomic traits (telomere length, metabolomics) and genetic variation (rarer variants, copy number variants).

The first step in translation of these genetic discoveries is to define the molecular mechanisms through which they impact disease. ENGAGE made efforts to develop strategies for refining both the genetic and phenotypic basis of these associations. The genetic efforts have involved deployment of fine-mapping, resequencing and imputation approaches, the objective being to track the specific causal alleles, a challenging task given the strong correlations that exist between nearby variants. The phenotypic efforts have focused on exploration of the wider biological consequences of associated variants and on epidemiological studies to define the ways in which genetic variants interact with each other and with environmental exposures. These efforts have become, at least in part, focused around their respective resequencing and epidemiology 'flagship' projects. The identification and characterisation of these alleles (especially where coding) will catalyse efforts to characterise biological mechanisms conferring risk and protection that will continue beyond the lifetime of ENGAGE. On-going efforts within the large-scale epidemiology flagship projects are expected to deliver further biological insights, including the causal relationship between obesity and other phenotype groups (such as cancer and neuropsychiatric outcomes) and complex pleiotropic relationships involving variants influencing the incretin axis.

Clinical translation represents the ultimate objective of human genetic discovery, but will require many years to play out. ENGAGE has contributed to some modest successes in this area and will continue to support the various efforts towards stratified medicine described earlier. To ensure the continuation of these efforts post-ENGAGE, ENGAGE is interacting closely with related EU efforts (e.g. Summit, Direct) which have shared goals and some overlapping membership.

The high-profile research results generated by ENGAGE have helped to maintain the competitiveness and excellence of European research in human genetic and genomic epidemiology and will help drive research advances across the scientific community in related fields.

Through various efforts from discovery and biology to translation, ENGAGE has provided the research community and industry, including research-oriented small and medium-sized enterprises (SMEs), with advanced knowledge and tools for further preclinical/clinical exploitation that can ultimately lead to the development of new biomarkers and companion diagnostics as well as the identification of new targets for further drug discovery and development. The results (e.g. new biological mechanisms) may also have the potential to provide new targets for preventive or therapeutic interventions in clinical medicine.

ENGAGE results will also have a societal impact in multiple dimensions. These include updated ELSI guidelines and codes of conduct for moving genomic knowledge into the clinical research setting, as the molecular elucidation of pathways to disease moves beyond a single focus on gene identification to other sources, including diverse types of genetic variation, upstream phenotypes revealed through 'omic' profiling and gene-environment interactions that include behavioural traits. With respect to health care systems and welfare, the knowledge generated by ENGAGE, including genetic susceptibility and a better understanding of the relationship between genetic and lifestyle/environmental risk factors, has good potential to lead to advances in clinical practice that will also have an impact at the welfare level by influencing public health and policy decisions.

ENGAGE has demonstrated a high level of integration and dynamic coordination and communication within a relatively large collaborative research consortium. The experience and knowledge obtained during the course of ENGAGE have been hugely beneficial for the field in general and have also contributed to the training and development of a cadre of junior researchers with the skills and scientific temperament to support future projects of this kind. ENGAGE is making best efforts to document and disseminate such experience and provide recommendations for other consortia and funding decision bodies.

Main dissemination activities and exploitation of results

Dissemination activities

ENGAGE scientists have actively disseminated research results to both scientific and non-specialist audiences, as exemplified by the extensive list of dissemination activities. The ENGAGE project website (see serves as the central tool for disseminating ENGAGE's main research outputs and activities to all audiences.

Research community

ENGAGE scientists have made best efforts to disseminate research results to a wider scientific audience by publishing papers in scientific journals and by presenting information and project results at international conferences, meetings, workshops and courses (both external and co-organised by ENGAGE through the 'ENGAGE Training Policy').

So far, ENGAGE partners have published more than 250 scientific papers relating to its project activities; many of these are in high-profile journals (such as Nature, Nature Genetics, Lancet) and have attracted wider dissemination activities. A list of publications is maintained on the ENGAGE public web site and is regularly updated.

An access catalogue has been developed to inform scientists outside of ENGAGE about the procedures for access to ENGAGE resources for genetic and genomic epidemiology research. The catalogue is available on the ENGAGE project web site (see Additionally, in collaboration with P3G, an ENGAGE Consortium Catalogue, which serves as a repository of standard information describing ENGAGE cohorts, is available in the P3G Observatory (see

ENGAGE scientists have disseminated research data for access by the wider scientific community. For example, summary-level data from GWAS meta-analyses for major phenotypes are deposited and released through collaborative trait-specific consortia on their respective websites:

1. The Meta-Analyses of Glucose and Insulin-related traits Consortium (Magic)
2. Diabetes Genetics Replication and Meta-analysis Consortium (Diagram)
3. The Genetic Investigation of Anthropometric Traits Consortium (Giant)
4. Early Growth Genetics Consortium (EGG)
5. Global Lipids Genetics Consortium
6. Human metabolic individuality (Suhre et al, Nature 2011)

Furthermore, other types of data generated through ENGAGE activities have been (or are in the process of being) deposited in public data archives such as the European Genome-phenome Archive (EGA), ArrayExpress and PRIDE. These will be made accessible to bona fide researchers outside of the consortium through a constituted Data Access Committee (DAC).

The ENGAGE Coordination Office has published an ENGAGE Newsletter, distributed bi-monthly to all ENGAGE members, to inform them about project activities such as flagship projects, publications and training opportunities. Newsletter content of interest to a wider audience, such as recently published high-impact papers and ENGAGE's training opportunities (e.g. Summer Institute, open workshops), has been published on the News bulletin of the project website by means of an RSS feed.

Clinical and public health community

Clinicians are among the key stakeholders for ENGAGE research results. Several ENGAGE scientists are involved in public health and publish scientific articles in this area. ENGAGE also (co-)organised workshops and seminars that were open to clinicians and the public health community. For example, the two ENGAGE Summer Institutes, 'Genetics, Ethics and Clinical Translation' and 'Translational Genomics Pipeline: From Populations to Individuals', covered key clinical and public health components in their course programmes. In addition, the ENGAGE team on Ethics and Society worked on topics related to facilitating research projects that involve data sharing between clinical and non-clinical researchers.

General public

The ENGAGE project web site has been used as a channel to disseminate the basic concepts of ENGAGE that might be of interest to the general public (see

Through local dissemination efforts in partner institutions, ENGAGE partners also disseminated their relevant research results to lay audiences, for example through media interviews and public lectures.

ENGAGE also explored the possibility of using audio-visual tools to introduce the project concept and main results to a wider audience. An ENGAGE video clip, aimed at attracting the attention of the general public, was made in collaboration with a professional film producer, FastFacts. The video has been widely distributed through web-based video sharing channels, such as YouTube and Vimeo, as well as on the ENGAGE project website.

Link to ENGAGE video:

Exploitation of results

ENGAGE is an integrated research project aiming to advance knowledge in human disease genetics and to explore relevant translational opportunities in medicine. The research community is therefore identified as one of the primary users that will take up ENGAGE research results for further exploration before the outputs can be fully translated into the clinic. Such exploitable results include the analytical data generated, the methods and tools developed, the laboratory protocols and technology platforms established and the databases built. The main exploitation measures taken by ENGAGE for results targeting the scientific community have been publication of research papers, presentations at international scientific conferences, training through the organisation of open courses/workshops, consultation through research collaboration and website resources.

It is important to make clear that the majority of ENGAGE research takes place in the 'precompetitive space' and adheres to the norms of the field with respect to not claiming IP positions on genetic discoveries. Rather, we make them available as rapidly and widely as possible so that others can further develop the scientific picture and build the case for translational use.

ENGAGE research is built on the invaluable resources from large-scale cohort collections. Some of the results therefore are highly relevant to the biobanking community and can also be used by this community. Specifically, the various databases developed can be used for tracking meta-consortium activities (Emanta), for data storage, exchange and annotation (AIMS) and for inventories of available specimens as well as for harmonisation and indexing across several collections in biobanks (SAIL). Some of these database tools have been used by institutions involved in biobanking activities as well as international projects.

The ENGAGE ELSI tools built to improve interoperability can also be exploited by the biobanking community. These include a code of conduct for data sharing and recommendations for bridging consent in clinical and non-clinical environments, for retrospective access to data and for recruitment by genotype.

Technology platforms improved for ENGAGE targeted or large-scale sample analyses, such as next generation sequencing (e.g. whole-genome, exome based), genotyping with custom arrays (e.g. metabochip) and genome-wide DNA methylation analysis, are results that can be further exploited for commercial utilisation (in research or clinical diagnostics). The exploitation of such results has been primarily through our industrial partners, Illumina and deCODE, who have been actively involved in several of these ENGAGE activities.

In addition to technology platforms, biomarkers that predict disease onset or progression are potential targets for commercial exploitation. For example, the identification of hs-CRP as a non-genetic biomarker for diabetes subtypes can be one immediate target for clinical use. However, even though significant progress has been made in exploring translational opportunities, ENGAGE findings at this stage still represent early innovations, and direct exploitation for diagnostic inventions is relatively modest. ENGAGE has established strategic collaborations with several EC-funded projects (e.g. IMI-Summit, Direct) to support programmes that will carry on our efforts towards translation.

List of Websites:



Contact details:

Professor Mark McCarthy, Robert Turner Professor of Diabetes

The Oxford Centre for Diabetes, Endocrinology and Metabolism (OCDEM)

University of Oxford, Churchill Hospital

Headington, OX3 7LJ, UK

Professor Jaakko Kaprio, Professor of Genetic Epidemiology

Institute for Molecular Medicine Finland (FIMM)

Nordic EMBL Partnership for Molecular Medicine

University of Helsinki

Biomedicum Helsinki 2U

Tukholmankatu 8, PO Box 20

University of Helsinki

FI-00014, Finland