## Final Activity Report Summary - SPARSE (Sparse Approximations for Blind Source Separation)

It is clear that we have entered the era of information. Over recent decades, experiments in particle physics have produced very large amounts of data. In the history of modern medicine, diagnostics accumulated encyclopaedias of medical cases. Genome sequencing, the process of determining the exact order of the three billion chemical building blocks, called bases, which make up deoxyribonucleic acid (DNA), was the greatest technical challenge of the human genome project; the resulting DNA sequence maps are being used to explore human biology. The internet is a universe of information, changing rapidly and growing at an exponential rate.

A new area of science and engineering is strongly needed to extract significant information from this universe of data and to understand it. This is one of the prevailing challenges of this century: the capability to organise and understand complexity. This project contributed significantly to mathematical developments in the field of data analysis. The traditional tools used in this field, such as statistics (e.g. principal and independent component analysis) and Fourier analysis, soon revealed their limits in dealing with high dimensionality, small scales, non-stationary phenomena and nonlinearity. An interdisciplinary approach beyond tradition therefore seemed more promising. The advent of sophisticated analysis tools from numerical harmonic analysis, e.g. wavelets and time-frequency decompositions, opened new frontiers in data analysis. Moreover, new ideas flourished from the combination with other mathematical fields, such as the calculus of variations, approximation theory, optimisation and probability. In this framework we distinguished two conceptual situations, which in turn corresponded to different applications in data analysis:

1. sparse recovery, in which the available data originated from measurements of the quantity of interest that we wanted to recover

2. learning, in which the data were direct realisations of an unknown statistical distribution. The data were examples from which we learned in order to make accurate predictions.

Both of these blind source problems were approached by minimising the data misfit while imposing further regularising constraints on the solution, e.g. a minimal norm in a suitable Hilbert space.

The most remarkable recent advances in data analysis were based on the practical evidence that in several situations, even in the presence of very complex phenomena, only a few governing components are relevant to describe the whole dynamics. A dimensionality reduction could therefore be invoked by enforcing the sparsity and compressibility of the solution. Modelling the quantities of interest as functions, we assumed that they could be well approximated by a linear combination of a few elements of a prescribed basis or frame. Moreover, solutions could also be sparse in terms of higher-order derivatives or more general transformations. It was well understood that sparsity could be implemented, e.g. by requiring the solution to have minimal ℓ1-norm in terms of frame coefficients, or to have few discontinuities essentially concentrated on regular sets of lower dimension, e.g. as in total variation minimisation or free-discontinuity problems. The project addressed the study of general models promoting sparsity and of corresponding algorithms for the recovery of sparse signals. Applications in image processing and brain imaging were considered, with successful results. The project lasted 18 months and was organised into the following three strands, corresponding to its three main goals:

1. theoretical analysis of the scheme for blind source separation based on sparsity constraints in various contexts, such as inverse problems or adaptive numerical simulation

2. modelling of specific applied problems in image processing and brain imaging in the setting of blind source separation under sparsity constraints and

3. numerical implementation of the algorithms on data from concrete applications in image processing and brain imaging, for the validation of the models and the theoretical analysis of the numerical schemes.
