Content archived on 2024-05-28

Advanced Data-Driven Black-box modelling

Final Report Summary - A-DATADRIVE-B (Advanced Data-Driven Black-box modelling)

In many systems, data are becoming abundantly available, and predictive models are increasingly used for cost savings, efficiency, health, safety and organizational purposes. In several cases white-box models, in which the model parameters have a physical meaning, are difficult to obtain. Black-box modelling approaches are therefore often considered a suitable alternative.

The aim of this ERC Advanced Grant, A-DATADRIVE-B, is to study more advanced black-box modelling techniques for estimating predictive models from measured data. Different optimization modelling frameworks have been explored, covering both parametric and kernel-based models. For this broad class of models, the research activities have centred on the themes of prior knowledge incorporation, kernels and tensors, modelling structured dynamical systems, sparsity, optimization algorithms, the choice of core models and mathematical foundations, and software.

In this project we have contributed to establishing a systematic and generic methodology for advanced black-box modelling, to achieving an integrative understanding of the modelling aspects with respect to different areas, and to bringing advanced black-box modelling techniques closer to the end-user.

More specifically, the main realizations include the following:

- a more powerful optimization modelling framework has been achieved for kernel-based models and support vector machines, with primal and dual model representations. It makes it possible to extend the use of positive definite kernels to tensor kernels. For least squares support vector machines and kernel principal component analysis, either positive definite or indefinite kernels can be employed. A new variational principle for the matrix singular value decomposition is proposed, with an extension to non-symmetric kernels. In kernel probability mass function estimation, the notion of the kernel trick can be replaced by a positive operator valued measure for quantum measurement. Finally, a new framework of deep restricted kernel machines is proposed, based on a principle of conjugate feature duality. In this way models can be represented in terms of visible and hidden units. This makes it possible to train either deep feedforward neural networks in the primal form or kernel-based models in the dual representation.

- for kernel spectral clustering, classification and regression, new methods for incorporation of prior knowledge have been proposed. Optimized fixed-size kernel methods are developed in supervised, unsupervised and semi-supervised learning for clustering, classification and regression. Multilevel hierarchical kernel spectral clustering has been proposed for large scale complex networks, together with multi-view models. Efficient methods for incremental kernel spectral clustering are studied for on-line learning of non-stationary data. Scalable methods have been established that are applicable also to big data.

- for robust and sparse learning, different regularization schemes (L0, L1, L2, elastic net, two-level L1, nuclear norm, combined sparse and low rank penalties, ...) and loss functions (re-weighted least squares, pinball loss, ramp loss, correntropy, asymmetric loss, ...) have been studied, within a wide context of parametric, matrix-, tensor- and kernel-based models.

- the methods have been developed to handle both high-dimensional and large data sets. Additional application studies have been made in black-box weather forecasting, pollution modelling, alarm prediction and predictive maintenance in industrial machines, load forecasting, structural health monitoring, process anomaly detection, and others.
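
As an illustration of the dual model representation mentioned above for least squares support vector machines, the following is a minimal sketch of LS-SVM regression solved as a single linear system in the dual variables. It is not taken from the project software. The RBF kernel, the toy data, the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`) and the values of the regularization constant `gamma` and kernel width `sigma` are all assumptions chosen for the example.

```python
import numpy as np


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM regression dual system

        [ 0   1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]

    and return the bias b and the dual variables alpha.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]


def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Dual-form prediction: f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b


# Toy data (assumed for illustration): a noisy sinc function.
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 60)[:, None]
y = np.sinc(X[:, 0]) + 0.05 * rng.standard_normal(60)

gamma, sigma = 100.0, 0.7  # hypothetical hyperparameter values
b, alpha = lssvm_fit(X, y, gamma=gamma, sigma=sigma)
y_hat = lssvm_predict(X, b, alpha, X, sigma=sigma)
rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
```

In this formulation each training point receives a dual variable proportional to its residual (`alpha_i = gamma * e_i`), so the solution is generally not sparse. In practice `gamma` and `sigma` would be tuned, for example by cross-validation.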

The workshops organized within the framework of the project were ROKS 2013 (International workshop on advances in Regularization, Optimization, Kernel Methods and Support Vector Machines: theory and applications) and TCMM 2014 (International Workshop on Technical Computing for Machine Learning and Mathematical Engineering). These were intended as multi-disciplinary forums where researchers from different communities could meet.

The results of this project have been achieved by an interdisciplinary and internationally oriented team of researchers with complementary backgrounds. More information on research publications, software, and presentations is available at the project website http://www.esat.kuleuven.be/stadius/ADB.