
From longitudinal proteomics to dynamic individualized diagnostics

Periodic Reporting for period 4 - DynaOmics (From longitudinal proteomics to dynamic individualized diagnostics)

Reporting period: 2020-12-01 to 2022-05-31

Many diseases develop over time, and by the time of diagnosis it may already be late. For developing future therapeutic and preventive strategies, it is important to detect markers that indicate the disease as early as possible. The ERC Starting Grant project DynaOmics has provided new methods and tools for using longitudinal proteomics data in early prediction of disease progression, providing new opportunities for individualized treatment decisions and improved biomarker detection. While mass spectrometry technologies have been developing rapidly, longitudinal proteomic datasets have remained scarce and computational tools to analyse them have been lacking. The aim of DynaOmics was to address these issues by developing effective computational tools for detecting protein markers using longitudinal data and by building dynamic individualized predictive models. The biomedical focus was on type 1 diabetes, where early detection of the disease before clinical symptoms is crucial for developing future therapeutic and preventive strategies.
For optimized marker detection, we have developed a novel biomarker detector for longitudinal proteomics data, enabling robust and reproducible detection of markers. To characterize longitudinal protein features and their dynamics, we have tested state-of-the-art methods for longitudinal omics data and developed novel approaches that take into account the interplay between multiple proteins. To ensure high-quality quantitative data for modelling, we have performed a comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification, imputation, and normalization. To assess the methods in well-defined samples, surrogate longitudinal data were generated using mass spectrometry-based shotgun proteomics.
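
To make the idea of a longitudinal marker concrete, the following is a minimal sketch and not the DynaOmics detector itself, whose details are not given here. It assumes one protein measured at shared time points in every individual, summarizes each individual's trajectory by its least-squares slope, and compares the slopes of two groups with a permutation test. The function names `individual_slopes` and `slope_permutation_test` are hypothetical and chosen only for illustration.

```python
import numpy as np


def individual_slopes(times, values):
    """Least-squares slope of abundance over time, one slope per individual.

    times:  shape (n_timepoints,), shared by all individuals (an assumption)
    values: shape (n_individuals, n_timepoints), one protein's abundances
    """
    t = np.asarray(times, dtype=float)
    v = np.asarray(values, dtype=float)
    t_centered = t - t.mean()
    v_centered = v - v.mean(axis=1, keepdims=True)
    # Closed-form simple linear regression slope, computed row by row.
    return v_centered @ t_centered / (t_centered @ t_centered)


def slope_permutation_test(slopes_a, slopes_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean slope."""
    rng = np.random.default_rng(seed)
    slopes_a = np.asarray(slopes_a, dtype=float)
    slopes_b = np.asarray(slopes_b, dtype=float)
    observed = abs(slopes_a.mean() - slopes_b.mean())
    pooled = np.concatenate([slopes_a, slopes_b])
    n_a = len(slopes_a)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)  # random reassignment of group labels
        diff = abs(shuffled[:n_a].mean() - shuffled[n_a:].mean())
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

A protein whose slope differs between groups can be flagged this way even when the group means at any single time point overlap. Real longitudinal proteomics data usually have irregular sampling and missing values, which this sketch does not handle.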

To develop innovative strategies for dynamic, individualized disease risk prediction, we have introduced new statistical and machine learning techniques for longitudinal data. These include new methods for binary stratification of individuals over time as well as time-to-event prediction. Additionally, we have introduced a robust feature selection method that significantly reduces the number of proteins needed for prediction without reducing prediction accuracy. The methods have been carefully validated computationally in multiple real and simulated datasets. Further experimental validations have been performed to support selected key findings.

Finally, the developed computational methods have been applied to identify novel candidate markers and models for predicting early type 1 diabetes and its progression. Detecting the disease before clinical symptoms appear is crucial for developing future therapeutic and preventive strategies. In addition to proteome-level data, other molecular omics layers have also been considered.
DynaOmics has enabled new insights into modelling high-dimensional longitudinal datasets, with a focus on clinical proteomics. This has high potential to open new avenues for the diagnosis and treatment of complex diseases and to advance precision medicine. We have introduced novel types of dynamic markers that are undetectable in conventional cross-sectional studies, and developed methods for their robust detection and for dynamic, individualized disease risk prediction. In the context of type 1 diabetes, the findings from DynaOmics are anticipated to assist early detection of the disease beyond currently used tools, which is crucial for developing future preventive and therapeutic strategies. The computational methods developed in the project have been made publicly available and are expected to be beneficial in a wide range of applications and diseases.