Human Exposome Assessment Platform

Periodic Reporting for period 1 - HEAP (Human Exposome Assessment Platform)

Reporting period: 2020-01-01 to 2021-06-30

The exposome risk factors behind the increase in age-adjusted incidence of chronic diseases such as cancer are not fully determined. HEAP is developing an integrated research framework, including a technical platform and research methods, for the assessment of exposome risk factors. It will enable distributed and standardized management, analysis and knowledge discovery from heterogeneous sources of data, including clinics, research, wearable sensors, biobanks and environmental data. The HEAP technical platform is built in a generic way with the vision of creating a system that could be deployed globally. HEAP is synthesizing molecular biology, big data, AI, advanced statistics, IoT, exposure sensors and ICT resources in one integrated platform. The platform will enable standardized and consistent exposome assessment, supporting worldwide collaboration to improve health and society. During the project lifetime HEAP will demonstrate its capabilities by generating knowledge from large cohorts. This includes one clinical study that combines maternity clinical data and genomic data with exposome data collected by wearable sensors to create personal health profiles of patients. HEAP envisions that the platform can be used in many different use cases in Europe and worldwide.
During this reporting period HEAP has achieved the following results:
Coordinated the first phase of EHEN
Formalized the different datasets from the selected cohorts
Improved the wearable sensor prototype to be manufactured and used in the pilot study
Implemented the data management software platform (Hopsworks) based on the DMP
Integrated the ICT components (computational resources and software platform)
Created the first version of the ethical and regulatory framework
Fine-tuned synergies between work packages to carry out the planned tasks
Consolidated the dissemination and education framework
Created synergies with EHEN
For the assessment of environmental exposures, a new Personal Exposure Monitor (PEM) has been designed and tested. The new model improves on all the capabilities of the previous model, as already demonstrated in testing. The device will be used in the HEAP pilot with pregnant women and will lay the foundation for personal health profiling based on wearable PEMs. The Hopsworks Feature Store has been implemented to allow streaming data with different ingestion frequencies, including real-time data, which enables the integration of data from sensors such as PEMs into machine learning analyses. Combining external exposures and internal biomarkers provides a new avenue for early detection and prediction of diseases, leading towards more personalized and cost-effective healthcare.
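To illustrate what ingesting sensor data at different frequencies can involve, the following is a minimal sketch in plain Python, not the Hopsworks API and not HEAP's actual implementation. It assumes two hypothetical sensor streams (named `pm25` and `temperature` here purely for illustration) sampled at different intervals, and averages each within fixed time windows so that every window yields one feature row suitable for a machine learning analysis.

```python
from collections import defaultdict
from statistics import mean


def window_features(streams, window_s=60):
    """Average each stream's readings within fixed time windows.

    streams: dict mapping a feature name to a list of (timestamp_s, value).
    Returns a list of dict rows sorted by window start; a feature with no
    reading in a given window is set to None.
    """
    # buckets[window_start][feature_name] -> list of values in that window
    buckets = defaultdict(lambda: defaultdict(list))
    for name, readings in streams.items():
        for ts, value in readings:
            start = int(ts // window_s) * window_s
            buckets[start][name].append(value)

    rows = []
    for start in sorted(buckets):
        row = {"window_start_s": start}
        for name in streams:
            values = buckets[start].get(name)
            row[name] = mean(values) if values else None
        rows.append(row)
    return rows


# Hypothetical readings (illustrative values only): one stream sampled
# every 10 s, the other every 60 s.
streams = {
    "pm25": [(0, 10.0), (10, 12.0), (20, 14.0), (60, 20.0), (70, 22.0)],
    "temperature": [(0, 21.5), (60, 21.0)],
}
rows = window_features(streams)
```

Windowed averaging is only one possible alignment strategy; a real deployment might instead keep raw readings and join them at query time.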

The MLC Foundation (MLCF) has developed standards for anonymization that will be implemented in HEAP. In parallel, a legal framework for the consumer data collection platform developed in WP4 has been finalized, and the consumer purchase data solution has been launched. For the consumer cohort, in response to the Covid-19 pandemic, consumer purchase data may be used to analyse changes related to Covid-19 infection as part of a Danish initiative to study the late effects of Covid-19. Legal, ethical and data governance aspects of the innovative informatics platform being developed in HEAP have been discussed, and proposals for developing this governance have been refined. The contribution of MLCF to the HEAP project and to EHEN opens new opportunities for MLCF to explore and generate new knowledge on ethical and regulatory issues in exposome research, which can feed back into collaborations with other projects or enterprises working on exposome-related projects.
Selected pipelines for metagenomics analysis have been evaluated and compared in terms of quality of results and performance (analysis time), and are being used to generate data for the machine learning analyses. The deep learning algorithms generated will improve the metagenomics pipelines. A collaboration with the University of Eastern Finland has been initiated to implement machine learning algorithms for finding relationships between cancer types and metagenomics profiles, using the TCGA database and pipelines developed in HEAP that will be deployed in a customised version of Hopsworks.

A secure and standardised IaaS for the first version of the HEAP Information Commons has been developed, which makes it possible to test the storage and sharing of heterogeneous data from the cohorts. The HEAP software platform provides a flexible framework for deep learning, an advantage of the system that will be demonstrated in the coming phases of the project. The HEAP software platform (Hopsworks) simplifies the implementation of machine learning, which may be attractive to other projects in EHEN once the application of ML to the analysis of heterogeneous data sources is demonstrated.
In summary, the progress of the HEAP software platform integrated with the Information Commons (IaaS) points to the possibility of deploying HEAP instances on various infrastructures and of integrating and managing sensitive data securely. At the same time, the generic concept applied to this platform allows new analysis pipelines to be created and analysis tools developed by other researchers, institutions and projects to be reused.
The HEAP data life cycle