CORDIS - EU research results

Human Exposome Assessment Platform

Periodic Reporting for period 3 - HEAP (Human Exposome Assessment Platform)

Reporting period: 2023-01-01 to 2023-12-31

The exposome risk factors behind the increase in age-adjusted incidence of chronic diseases such as cancer are not fully determined. HEAP is developing an integrated research framework, comprising a technical platform and research methods, for the assessment of exposome risk factors. It will enable distributed and standardized management, analysis, and knowledge discovery across heterogeneous sources of data from clinics, research, wearable sensors, biobanks, and environmental monitoring. The HEAP technical platform is built in a generic way, with the vision of creating a system that could be deployed globally. HEAP brings together molecular biology, big data, AI, advanced statistics, IoT, exposure sensors, and ICT resources in one integrated platform. The platform enables standardized and consistent exposome assessment, supporting worldwide collaboration to improve health and society. During the project lifetime, HEAP demonstrates its capacities by generating knowledge from different cohorts. These include one clinical study that combines maternity clinical data and genomic data with exposome data collected by wearable sensors to create personal health profiles of patients. Besides other established cohorts (cervical cancer, HPV vaccination, maternity, lifestyle), HEAP includes a consumer cohort that can demonstrate how consumer data can serve as a tool for lifestyle assessment. HEAP envisions that the platform can be used in many different use cases in Europe and worldwide.
During this reporting period, HEAP focused on strengthening its results, given that one year remains to complete the project. HEAP made the following main progress:
1. Expanding and improving HEAP cohorts data and producing results and publications.
2. Final stage of the wearable sensor pilot study.
3. A new version of the HEAP software platform (Hopsworks) with relevant improvements in functionalities and UI based on HEAP research requirements is deployed at CSC.
4. Population of Information Commons with data from cohorts and test of analysis pipelines.
5. Production of the first HEAP governance document within the ethical and regulatory framework.
6. Collaboration among work packages (WPs) to share analysis pipelines and data.
7. Relevant work in the dissemination and education framework for HEAP and EHEN.
8. HEAP - EHEN collaboration in Ethics and Dissemination.
9. Work towards HEAP impact in exposome research and sustainability.
For the assessment of environmental exposures, a new Personal Exposure Monitor (PEM) was designed, tested, and produced. The new model outperforms the previous model in all capabilities, as results obtained so far have already demonstrated. The device is used in the HEAP pilot study with pregnant women and will lay the groundwork for personal health profiling based on wearable PEMs. The Hopsworks Feature Store supports streaming data at different ingestion frequencies, including real-time data, which enables the integration of data from sensors such as PEMs into machine learning analyses. Combining external exposures with internal biomarkers provides a new avenue for early detection and prediction of diseases, leading towards more personalized and cost-effective healthcare.
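As an illustration of what integrating sensor streams with different ingestion frequencies can involve, the following minimal Python sketch averages readings into fixed time windows so that each window yields one feature row. It is not the Hopsworks API and not the HEAP pipeline: the function name `window_features`, the sensor names `pm25` and `temperature`, the sample values, and the 60-second window are all hypothetical choices made for the example.

```python
from collections import defaultdict
from statistics import mean


def window_features(readings, window_s=60):
    """Average (timestamp_s, sensor, value) readings per sensor in fixed windows.

    Hypothetical helper: sensors may report at different rates; each window
    becomes one row with the mean value of every sensor seen in that window.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for ts, sensor, value in readings:
        # Align each reading to the start of the window it falls into.
        window_start = int(ts // window_s) * window_s
        buckets[window_start][sensor].append(value)

    rows = []
    for window_start in sorted(buckets):
        row = {"window_start": window_start}
        for sensor, values in buckets[window_start].items():
            row[sensor] = mean(values)
        rows.append(row)
    return rows


# Hypothetical data: "pm25" reports every 10 s, "temperature" once per minute.
readings = [
    (0, "pm25", 10.0), (10, "pm25", 14.0), (20, "pm25", 12.0),
    (0, "temperature", 21.0),
    (60, "pm25", 20.0), (70, "pm25", 22.0),
    (60, "temperature", 23.0),
]
rows = window_features(readings)
for row in rows:
    print(row)
```

Aligning all streams to a common window is only one possible design; a real deployment would also need to decide how to treat windows in which a sensor reported nothing.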
The MLC Foundation (MLCF) has developed standards for anonymization that are being implemented in HEAP. In parallel, a legal framework around the consumer data collection platform developed in WP4 was finalized, and the consumer purchase data solution has been launched. Regarding the consumer cohort, in response to the Covid-19 pandemic, consumer purchase data may be used to analyse changes in relation to Covid-19 infection as part of a Danish initiative to study late effects of Covid-19. Legal, ethical, and data governance aspects of the innovative HEAP informatics platform have been discussed, and proposals for developing this governance have been refined. The contribution of MLCF to the HEAP project and to the EHEN opens new opportunities for MLCF to explore and generate new knowledge on ethics and regulatory issues related to exposome research, which can in turn feed into collaborations with other projects or enterprises working on exposome-related topics.
Metagenomics analysis pipelines developed in HEAP are generating data for the machine learning analyses, with the aim of improving classifications and predictions based on metagenomics data. A machine learning algorithm for the classification of HPV infection is being improved to find relationships between cancer types and metagenomics profiling (WP9). The metagenomics pipelines and the machine learning model are deployed and tested in the testbed of the HEAP Hopsworks platform at CSC. A set of analysis tools and pipelines from the HEAP cohorts is defined and formalized for deployment into the platform, which now provides support for multi-tenant RStudio.
A secure and standardised IaaS for the first version of the HEAP Information Commons was developed, enabling the storage and sharing of heterogeneous data from the cohorts. The HEAP software platform provides a flexible framework for deep learning, an advantage of the system that will be demonstrated in the coming phases of the project. The HEAP software platform (Hopsworks) makes the implementation of machine learning easy, which can be very attractive to other projects in EHEN once the application of ML to the analysis of heterogeneous data sources is demonstrated.
The HEAP Hopsworks is a horizontally scalable data science and analytics platform that can store and manage massive amounts of sensitive data, including unstructured data such as sequences, IoT data, and images, and structured data such as electronic health data. Researchers can implement and deploy their own analysis tools and pipelines, install existing ones, and create machine learning models to make predictions across the heterogeneous datasets managed by the platform. The HEAP Hopsworks platform is integrated with the Information Commons (IaaS) provided by CSC, and is also deployed on a computer cluster at KI as a proof of concept for reproducibility.
The HEAP data life cycle