CORDIS - EU research results

Scalable Learning for Reproducibility in High-Dimensional Biomedical Signal Processing: A Robust Data Science Framework

Objective

Data science has quickly expanded the boundaries of signal processing and statistical learning beyond their accustomed domains. Powerful and complex machine learning architectures have evolved to distinguish relevant information from randomness, artifacts, and irrelevant data. However, existing learning frameworks lack computationally scalable, tractable, and robust methods for high-dimensional data. Consequently, apparent discoveries, for example in genomic data, can be coincidental findings that merely happen to reach statistical significance. As long as groundbreaking advances in biotechnology are not accompanied by appropriate learning frameworks, valuable effort is spent on researching false positives. ScReeningData develops a coherent, fast, and scalable learning framework that jointly addresses three fundamental challenges: drastically reducing computational complexity, providing statistical and robustness guarantees, and quantifying reproducibility in large-scale and high-dimensional settings. The project develops a new approach that builds upon recent work by the principal investigator (PI). The underlying concept is to repeat randomized controlled experiments in which computer-generated fake variables serve as negative controls: once a learning algorithm starts selecting these fake variables, it is stopped early, which mitigates the so-called curse of dimensionality. In contrast to existing methods, the proposed methods are completely tractable and scalable to ultra-high dimensions. Advanced robust learning methods that are computed ultra-fast and come with tight guarantees on the targeted rate of false positives enable new reproducible discoveries to be made with high statistical power. Because the proposed learning methods are fundamental in nature and broadly applicable, the impact of this project extends far beyond the considered biomedical signal processing use cases, benefiting all scientific domains that analyze high-dimensional data.
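The concept of fake variables as negative controls can be illustrated with a small toy sketch. This is not the project's algorithm and carries none of its guarantees; it only shows the general idea under several assumptions of our own: a simple greedy (matching-pursuit style) selector, Gaussian noise columns as the fake variables, and a majority vote over repeated experiments. All function names, parameters (num_dummies, max_dummies, vote), and the simulated data are hypothetical.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def one_experiment(columns, y, num_dummies, max_dummies, rng):
    """One randomized experiment: greedy selection that stops early
    once `max_dummies` fake variables have been picked.
    Assumes every column in `columns` is non-zero."""
    n = len(y)
    # Fake variables: pure noise, so by construction unrelated to y.
    dummies = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(num_dummies)]
    candidates = columns + dummies
    residual = list(y)
    remaining = list(range(len(candidates)))
    chosen, dummies_hit = set(), 0
    while remaining and dummies_hit < max_dummies:
        def score(j):
            # Absolute correlation of column j with the residual, scale-normalized.
            col = candidates[j]
            return abs(dot(col, residual)) / dot(col, col) ** 0.5
        best = max(remaining, key=score)
        remaining.remove(best)
        col = candidates[best]
        # Remove the part of the residual explained by the picked column.
        coef = dot(col, residual) / dot(col, col)
        residual = [r - coef * c for r, c in zip(residual, col)]
        if best >= len(columns):
            dummies_hit += 1      # a negative control was selected
        else:
            chosen.add(best)      # a real variable was selected
    return chosen


def select_variables(columns, y, num_experiments=20, num_dummies=None,
                     max_dummies=1, vote=0.75, seed=0):
    """Repeat the experiment with fresh fake variables and keep the real
    variables selected in at least a fraction `vote` of the runs."""
    rng = random.Random(seed)
    if num_dummies is None:
        num_dummies = len(columns)
    counts = [0] * len(columns)
    for _ in range(num_experiments):
        for j in one_experiment(columns, y, num_dummies, max_dummies, rng):
            counts[j] += 1
    return [j for j, c in enumerate(counts) if c / num_experiments >= vote]


# Simulated data (hypothetical): only variables 0 and 5 influence y.
data_rng = random.Random(1)
n, p = 100, 20
X_cols = [[data_rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
y = [3 * X_cols[0][i] - 2 * X_cols[5][i] + 0.1 * data_rng.gauss(0.0, 1.0)
     for i in range(n)]
selected = select_variables(X_cols, y)
print(selected)
```

In this sketch, raising max_dummies lets each experiment run longer before stopping, and lowering vote admits more variables into the final set; both choices trade statistical power against false positives, but the toy version makes no formal claim about the resulting false positive rate.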

Host institution

TECHNISCHE UNIVERSITAT DARMSTADT
Net EU contribution
€ 1 500 000,00
Address
KAROLINENPLATZ 5
64289 Darmstadt
Germany

Region
Hessen > Darmstadt > Darmstadt, Kreisfreie Stadt
Activity type
Higher or Secondary Education Establishments
Total cost
€ 1 500 000,00

Beneficiaries (1)