CORDIS - EU research results

Big Data Challenges for Mathematics

Periodic Reporting for period 2 - BIGMATH (Big Data Challenges for Mathematics)

Reporting period: 2020-10-01 to 2022-09-30

The main domains of interest of the BIGMATH project lie in optimization, statistics, and large-scale linear algebra, the topics most relevant to effective machine learning techniques and to the ability to build good data-driven products. The problems BIGMATH considered require the development of both innovative mathematical techniques and computational procedures, together with the ability of scientists to consider “the right mathematics for the right problem”, a profile only possible with the right mix of academic depth and hands-on industrial experience. BIGMATH reached its aim of training a group of mathematicians with strong theoretical and practical skills and the mind-set of curious data wizards. In their research projects, BIGMATH graduates tackled the major challenges of the Big Data era and, thanks to the specific training they received, will be able to transfer their knowledge effectively to the productive world, for both economic and social benefit. The research objectives of the project focused on the mathematical expertise underlying the different and often intertwined mathematical techniques needed for Big Data analysis. Close collaboration between the academic and industrial partners of the project was the key ingredient in reaching the objectives and contributing to scientific advances in this field.
The work performed so far addressed three main areas: training, research, and dissemination and outreach.
The seven Early Stage Researchers (ESRs) recruited at the start of the project were directly involved in ten training modules organised in the framework of the action: two workshops, four training courses and four advanced courses. The modules covered both scientific subjects and soft-skills topics. BIGMATH students also attended about twenty additional training courses at local and international level and gave presentations during both remote and face-to-face meetings with industry and academia, which were organised regularly to update the supervisory board on the progress of their research programmes.
Each ESR spent at least 18 months on industrial secondments, receiving hands-on training on the industrial problems driving their research. These secondments were interspersed with academic periods during which the ESRs focused on the mathematical tools needed to develop their projects. In a first phase they carried out a detailed literature review and became familiar with the data, either provided by the industrial partners or found in suitable open data repositories. The ESRs implemented some improvements over state-of-the-art mathematical techniques, suited to their specific industrial problems, in some cases with relevant scientific results, as witnessed by several publications and preprints submitted for publication. Relevant results were obtained particularly in the fields of shape models for human face reconstruction and distributed optimization for Industry 4.0. So far only two of the seven ESRs have defended their theses, because of delays due both to the Covid19 restrictions and to the differing durations of the PhD programmes in which the ESRs are enrolled.
Dissemination and outreach
Early Stage Researchers contributed substantially to promoting and spreading the results of their projects beyond the walls of academia, addressing both the research and industry communities. They helped plan, organise and carry out events planned by the project partners and personally gave talks at conferences, workshops and internal meetings, also preparing supporting materials and promoting them through the web and social media. Over the whole project timeframe, the ESRs participated in more than 30 events.
At the end of the project, the results can be summarised as follows:
The wide variety of training activities organised and carried out by the BIGMATH team focused on enhancing the ESRs' soft skills and their familiarity with both the academic and industrial landscapes: presentations at conferences and scientific events were encouraged, together with close cooperation with companies through team work, brand presentations and R&I activities.
Most of the ESRs had long periods of industrial secondment (in some cases carried out remotely, due to the Covid19 restrictions), during which they worked in close collaboration with both the industrial and the academic partners and experienced the working habits of a business-oriented environment. In some cases, extensions of existing theories and their mathematical study were needed, together with the development of ad hoc algorithms and related software. The ESRs were assisted by their advisors in this process. Main results obtained:
• Human face reconstruction: building on the research of ESR3, we were able to set up a full pipeline, and study its mathematical properties, for reconstructing complex shapes, such as that of an ear, in a realistic way. The pipeline is flexible enough to be adapted to other use cases, and is thus a first step towards general-purpose morphable models. The research of ESR1 and ESR2 produced new methods, with computational advantages and a higher degree of interpretability than the existing ones, which make it possible to create increasingly realistic virtual humans for use in the entertainment industry.
• Finance: ESR4 and ESR5 developed new classification methods for unbalanced samples, with the twofold aim of estimating credit risk or of predicting the commercial trade volumes or unexploited investment capital of a company's clients. These problems required discriminant analysis techniques that rely on suitable matrix decompositions from large-scale linear algebra, or supervised machine learning techniques that had to be adapted to datasets with imbalanced labels.
• Industry 4.0: ESR6 studied statistical techniques that identify patterns in multivariate time series with binary outputs and can predict the occurrence of specific events, such as failures in a production line. Multinomial urn models, models based on Latent Dirichlet Allocation and models based on survival analysis were tested on a specific industrial problem. The tests highlighted which kinds of information are crucial to collect in order to solve the problem. ESR7 developed new distributed optimization techniques that can be applied to complex problems that arise frequently when many different “entities” (such as sensors or individuals) can exchange information only with their neighbours, yet the entire system formed by these entities must be optimized to perform a task.
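To make the distributed optimization setting in the last item more concrete, the following is a minimal illustrative sketch, not the method developed by ESR7. It uses gradient tracking, a standard decentralised scheme, on a toy problem chosen purely for illustration: each node of a ring network holds a private target value and a local cost 0.5 * (x - target)^2, and talks only to its two neighbours. The function names, the ring topology, the mixing weight of 1/3 and the step size are all assumptions made for this example.

```python
def ring_neighbors(n):
    # Hypothetical topology: node i can talk only to nodes i-1 and i+1 (assumes n >= 3).
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def gradient_tracking(targets, step=0.1, iterations=300):
    """Each node i minimises the sum of 0.5 * (x - targets[j])**2 over all j,
    using only its own target and values received from its ring neighbours."""
    n = len(targets)
    neighbors = ring_neighbors(n)
    weight = 1.0 / 3.0  # equal weight on self and on each of the two neighbours

    def grad(i, x):
        # Gradient of node i's local cost 0.5 * (x - targets[i])**2.
        return x - targets[i]

    def mix(values, i):
        # Weighted average of node i's value and its neighbours' values.
        return weight * (values[i] + sum(values[j] for j in neighbors[i]))

    x = [0.0] * n                            # local estimates of the minimiser
    y = [grad(i, x[i]) for i in range(n)]    # local trackers of the average gradient

    for _ in range(iterations):
        x_new = [mix(x, i) - step * y[i] for i in range(n)]
        y = [mix(y, i) + grad(i, x_new[i]) - grad(i, x[i]) for i in range(n)]
        x = x_new
    return x


estimates = gradient_tracking([1.0, 2.0, 3.0, 4.0, 10.0])
```

For this toy cost the global minimiser is the average of the targets, so every entry of `estimates` should end up close to 4.0, even though no node ever sees more than its own target and its neighbours' messages.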
The impact of the Action can be measured in terms of the enhanced career prospects and employability of the ESRs and its contribution to their skills development. More specifically, the ESRs learned to work on challenging industrial problems; to develop skills in interdisciplinary and international teamwork; to increase their ability to apply tailor-made solutions; and to leverage long-term collaboration opportunities and contacts with leading researchers and networks of mathematicians. Additionally, the project had a positive impact on knowledge transfer to industry: all the research streams followed by the ESRs can be translated into business solutions, immediately or in the near future.