Big Data Challenges for Mathematics

Periodic Reporting for period 1 - BIGMATH (Big Data Challenges for Mathematics)

Reporting period: 2018-10-01 to 2020-09-30

The main domains of interest of the BIGMATH project lie in the areas of optimization, statistics, and large-scale linear algebra, which are the most relevant topics for effective machine learning techniques and for the ability to build good data-driven products. The problems BIGMATH considers require the development of both innovative mathematical techniques and computational procedures, joined with the ability of scientists to consider “the right mathematics for the right problem”, a profile only possible with the right mix of academic depth and hands-on industrial experience. BIGMATH is aimed at training mathematicians with strong theoretical and practical skills and the mind-set of curious data wizards. BIGMATH graduates will be able to cope with the major challenges of the Big Data era and to transfer their knowledge effectively to the productive world, for both economic and social benefit. The research objectives of the project are focussed on the mathematical expertise underlying the different and often intertwined mathematical challenges that Big Data are posing.
The work performed so far addressed three main areas: training, research, and dissemination and outreach.
The seven ESRs recruited at project start have been directly involved in eight training modules organised in the framework of the action: one workshop, three training courses and four advanced courses. The modules covered both scientific subjects and soft-skills topics. BIGMATH students also attended a dozen additional training courses at local and international level, and gave presentations at remote and face-to-face project meetings as part of the updates on the progress of their research programmes.
The seven ESRs have spent an initial six-month period on industrial secondments, during which they received hands-on training on the industrial problems that mainly drive their research. This period was followed by an academic period during which they focused on the main theoretical mathematical instruments needed to develop their projects. In a first phase they carried out a detailed literature review and became familiar with the data, either provided by the industrial partners or found in suitable open data repositories. The ESRs have made first attempts at applying and improving the studied techniques for their specific industrial problems. In some cases this has produced relevant scientific results, as witnessed by preprints submitted for publication; in other cases the work is still in progress and under exploration. Relevant results have been obtained particularly in the fields of shape models for human face reconstruction and of distributed optimization for Industry 4.0. Each ESR described the mid-term results of the R&D activities in a report prepared for the fellows’ mid-term evaluation. The mid-term evaluation was organised online and carried out by seven independent academic reviewers. All the ESRs passed the evaluation with a very positive assessment, laying the foundation for further research advances.
Dissemination and outreach
Dissemination and outreach activities have been carried out through: launch of the project portal; animation of the project’s social media accounts; organisation of two minisymposia; presentations at MeetMeTonight 2018 and Night of Researcher 2019; participation in a hackathon; a BIGMATH podcast recorded on a Portuguese web radio; upload of a seminar for non-specialists recorded on the Educast platform; preparation of an e-poster; publication of scientific articles and media releases. Due to the Covid-19 emergency, dissemination and outreach activities have suffered some delays: at least three planned minisymposia have been postponed and additional actions rescheduled.
In the next period BIGMATH will pursue the following main lines of implementation:
Activities under the responsibility of the project partners will be concluded with a focus on enhancing the ESRs’ soft skills and their familiarity with both the academic and industrial landscape: presentations at conferences and scientific events will be fostered, together with close cooperation with companies and industries through team work, brand presentations and R&I activities.
Most of the ESRs will now start a new long period of industrial secondments (in some cases carried out remotely, due to the Covid-19 restrictions), during which they will be supervised in close collaboration between the industrial and the academic partners. In some cases, extensions of existing theories and their mathematical study are needed, together with the development of ad hoc algorithms and related software. The ESRs will be assisted by their advisors in this process, and they will also be able to rely on the competences and suggestions of other leading scientists whom the advisors have invited to collaborate in the ESRs’ research. Expected results:
• Human face reconstruction: from the research of ESR3 we expect improved parametric shape and morphable models able to reconstruct complex shapes, like that of an ear, and to relate the optimal parameters of the models of such specific parts of the human body to the characteristics of the entire head. From the research of ESR1 and ESR2 we expect methods for creating increasingly realistic virtual humans, to be employed in the entertainment industry.
• Finance: ESR4 and ESR5 will develop new classification methods for unbalanced samples, with the double aim of estimating credit risk and of predicting the commercial trade volumes or unexploited investment capital of a company’s clients. Such problems require the development of discriminant analysis techniques that rely on suitable matrix decompositions from large-scale linear algebra, or on supervised machine learning techniques that must be adapted to datasets with imbalanced labels.
• Industry 4.0: ESR6 will develop statistical techniques that identify patterns in multivariate time series with binary outputs and use them to predict the occurrence of specific events, like failures in a production line. Multinomial urn models and models based on Latent Dirichlet Allocation are still under study. ESR7 is developing new distributed optimization techniques that may be applied to complex problems which occur frequently when many different “entities” (like sensors, individuals, etc.) can exchange information only with their neighbours, but the entire system formed by such entities must be optimized to perform a task.
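To make the distributed optimization setting in the last item more concrete, the following is a minimal sketch of decentralized gradient descent, a standard textbook scheme and not necessarily the technique developed in the project. It assumes a hypothetical ring network in which each agent i holds a private value a_i, minimizes the local cost 0.5*(x - a_i)^2, and exchanges its current estimate only with its two neighbours. The function names, the ring topology, the equal mixing weights and the diminishing step size are all illustrative assumptions.

```python
def ring_neighbours(n):
    """Hypothetical topology: agent i talks only to agents i-1 and i+1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def decentralized_gradient_descent(targets, iterations=2000):
    """Each agent i minimizes 0.5*(x - targets[i])**2 using only local
    information; the sum of all local costs is minimized at the mean
    of `targets`. Assumes at least three agents on the ring."""
    n = len(targets)
    neighbours = ring_neighbours(n)
    x = [0.0] * n  # each agent's local estimate of the shared variable
    for k in range(iterations):
        step = 1.0 / (k + 2)  # diminishing step size (an assumption)
        new_x = []
        for i in range(n):
            # Consensus step: average own estimate with the neighbours'.
            group = [x[i]] + [x[j] for j in neighbours[i]]
            mixed = sum(group) / len(group)
            # Local gradient step, using only agent i's private data.
            grad = x[i] - targets[i]
            new_x.append(mixed - step * grad)
        x = new_x  # all agents update synchronously
    return x


estimates = decentralized_gradient_descent([1.0, 2.0, 3.0, 4.0, 5.0])
print(estimates)
```

No agent ever sees the full list of values, yet under these assumptions every local estimate approaches the global minimizer (here the mean, 3.0). With a constant step size the same scheme would only reach a neighbourhood of the minimizer, which is why the sketch uses a diminishing one.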
Dissemination and outreach
Major effort will be devoted to strengthening the dissemination of achieved results through channels such as: organisation of minisymposia, participation in external events and presentations, publication of articles in scientific journals, promotion via the project portal and social media animation, and reaching the general public through institutional networks. In terms of impact, at this project stage, the beneficiaries confirm that the main purpose of the project activities is to contribute to enhancing the career perspectives, employability and skills development of the seven involved researchers who are at the core of the action.
ESRs recruited
Project logo
Overview of the BIGMATH research areas