Periodic Reporting for period 2 - DynaOmics (From longitudinal proteomics to dynamic individualized diagnostics)
Reporting period: 2017-12-01 to 2019-05-31
Longitudinal proteomics data provide an opportunity for individualized treatment and improved biomarker detection. While the technology is developing, longitudinal proteomic datasets are still scarce and computational tools to analyse them are lacking. This project aims to address these issues by developing effective computational tools for detecting protein markers from longitudinal data and by building dynamic individualized predictive models. The biomedical focus is on Type 1 diabetes, where detecting the disease before clinical symptoms appear is crucial for developing future therapeutic and preventive strategies.
Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far
To characterize longitudinal protein features and their dynamics, we have tested state-of-the-art methods for longitudinal omics data and developed novel approaches that take into account the interplay between multiple proteins. To assess the methods on well-defined samples, surrogate longitudinal data were generated using mass spectrometry-based shotgun proteomics. To ensure high-quality quantitative data for modelling, we have comprehensively evaluated popular proteomics software workflows for label-free proteome quantification and imputation (Välikangas et al. 2017) as well as for normalization (Välikangas et al. 2017), and implemented an R/Bioconductor package for normalization of phosphoproteomics data (Saraei et al. 2017). Towards optimized marker detection, we have implemented an R/Bioconductor package for reproducibility-optimized statistical testing, ROTS (Suomi et al. 2017), and an enhanced version of it for the increasingly popular data-independent acquisition (DIA) mass spectrometry technology (Suomi & Elo 2017). For individualized dynamic predictive modelling of longitudinal proteomics data, we have compared currently available dynamic predictive models using longitudinal clinical data. Further investigation of these models and of alternative modelling techniques is ongoing.
Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)
DynaOmics enables new insights into modelling high-dimensional longitudinal datasets, with a focus on clinical proteomics. This has high potential to open new avenues for the diagnosis and treatment of complex diseases and to advance precision medicine. We introduce novel types of dynamic markers that are undetectable in conventional cross-sectional studies, develop methods for their robust detection, and build models for dynamic, individualized disease risk prediction. To this end, we apply statistical and machine learning methods that have not previously been used in this context, such as joint models of longitudinal and time-to-event data and one-class classification techniques. In the context of Type 1 diabetes, DynaOmics is anticipated to assist early detection of the disease beyond currently used tools, which is crucial for developing future preventive and therapeutic strategies. Overall, the computational methods developed in the project will be beneficial in a wide range of applications and diseases.
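To illustrate why a dynamic marker can be undetectable in a cross-sectional comparison, the following minimal Python sketch may help. It is not the project's method and uses no project data: the abundance values are synthetic, and the names `slope` and `ranges_overlap`, the individual baselines, and the rate of 0.1 units per month are assumptions chosen only for illustration. Each individual has a different baseline level, so abundances at a single visit overlap between groups, while the within-individual trend separates them.

```python
from statistics import mean


def slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    t_bar, v_bar = mean(times), mean(values)
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den


def ranges_overlap(a, b):
    """True if the value ranges [min, max] of the two groups intersect."""
    return min(a) <= max(b) and min(b) <= max(a)


# Synthetic, hypothetical data: abundance of one protein (arbitrary log units)
# measured at four visits per individual. Not real measurements.
times = [0, 6, 12, 18]  # months since first sample

# Individuals differ widely in baseline level (assumed values).
control_baselines = [10.0, 14.0, 18.0]
case_baselines = [9.0, 13.0, 17.0]

# Controls stay flat; cases rise by an assumed 0.1 units per month.
controls = [[b for _ in times] for b in control_baselines]
cases = [[b + 0.1 * t for t in times] for b in case_baselines]

# Cross-sectional view: compare the groups at the last visit only.
last_controls = [series[-1] for series in controls]
last_cases = [series[-1] for series in cases]
cross_sectional_overlap = ranges_overlap(last_controls, last_cases)

# Longitudinal view: compare the within-individual slopes.
control_slopes = [slope(times, series) for series in controls]
case_slopes = [slope(times, series) for series in cases]
longitudinal_overlap = ranges_overlap(control_slopes, case_slopes)

print("Groups overlap at the last visit:", cross_sectional_overlap)
print("Groups overlap in individual slopes:", longitudinal_overlap)
```

In this toy setting the between-individual variation in baseline hides the group difference at any single visit, whereas the per-individual slope isolates it. Real longitudinal proteomics data are noisy and irregularly sampled, so more robust statistics than a plain least-squares slope would be needed in practice.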