Advanced Data Methods for Improved Tiltrotor Test and Design

Periodic Reporting for period 4 - ADMITTED (Advanced Data Methods for Improved Tiltrotor Test and Design)

Reporting period: 2023-02-01 to 2023-11-30

If an airline can predict when a part is going to fail and prevent the failure, it avoids extra costs and passenger inconvenience. By combining flight analytics and engine sensor data with customer data, airlines can better manage flight disruptions, including missed connections. This is achieved through deep analysis of the data that aircraft generate during flight, in order to detect possible malfunctions, perform predictive maintenance and anticipate problems before they arise.

Traditional data-mining methods are effective on uniform data sets such as flight tracking data or weather data. Integrating heterogeneous data sets introduces complexity in data standardization, normalization, and scalability. The variability of the underlying data warehouse can be leveraged by using a big data infrastructure, which provides the scalability needed to identify trends and create actionable information.
The massive availability of data requires complex, high-performance architectures to support deep, large-scale analysis. Furthermore, the data can be so large that the platform itself shall provide intelligent support, guiding the end user through the analysis.
This can only be achieved by adopting novel approaches to processing large amounts of data and extracting useful information (data mining, machine learning, AI).
The platform shall also support all analyses without requiring a huge investment: this can be achieved by adopting COTS components for the HW platform and state-of-the-art solutions for the SW platform, namely big data management (Hadoop) and analytics (Spark).
Furthermore, the platform and the related SW and algorithms shall be exploitable in different deployments, for example both on premises (as proposed in the current document) and in the cloud, according to different exploitation paths. Although they are developed for a specific context (NGTCR), the HW architecture, the SW stack and a large part of the algorithms can also be exploited in other contexts and domains (e.g. fixed-wing aircraft).
Main Objectives are:
• Define the most appropriate infrastructure to support the large amount of data collected during flight tests, covering both the HW platform and the SW toolset for data storage, retrieval and analysis
• Define the best approach to extract useful information from the data recorded according to the identified main purposes
Main Specific Objectives
• Select the appropriate platform for big data support
• Select the most promising techniques to support data analytics
• Allow analysis of flight data in combination with external data (e.g. weather)
• Support the computation of predictions or suggestions based on data
• Implement novel predictive algorithms based on machine learning techniques

During the first reporting period, WP1 activities concentrated on the first important result: the selection, installation and connection of the enabling big data infrastructure, which the project will use throughout its five years. The enabling infrastructure is based on a complex yet flexible HW/SW combination. From the HW perspective, the system is composed of: 1) a bare-metal infrastructure of 4 nodes optimised for storage and CPU-intensive computations; 2) a hyperconverged Nutanix architecture built upon the aforementioned 4 nodes and configured to expose to the user a 6-node cluster capable of holding 80 TB of data; and 3) a powerful single-node workstation equipped with two best-in-class Nvidia V100 GPUs for deep learning applications. From the SW/application perspective, the system consists of a 6-node Cloudera cluster configured with all the Hadoop-related services (e.g. Spark, Hive, Kudu, Impala, Hue) needed to run the ADMITTED ETL and data analysis jobs.
A second important result in WP1 is the creation of the Query Catalogue, which defines the data containers, classes and algorithms to be used in implementing the most popular queries. This building-block approach has been defined and structured to standardize data access and query building by data scientists.
In Task 2.1, the first WP2 result is the completed configuration of the big data cluster, properly connected to the incoming data sources, which realises full data ingestion; the ingestion includes real data, which are fed into the platform by means of ETL programs.
In Task 2.2, algorithms for automatic anomaly detection have been developed. In particular, the methodologies have been designed to recognize spikes and holes in the flight data time series. The algorithms were conceived from a list of anomalous signals hand-classified by the Topic Leader, and then implemented and deployed on the ADMITTED cluster for large-scale testing on the whole dataset.
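The project's actual detection algorithms are not described here. As a minimal sketch of what spike and hole detection on a time series can look like, the example below assumes that a "hole" is a gap between consecutive timestamps larger than the expected sampling interval and that a "spike" is a sample far from the median of its neighbours. The function names, the `noise_floor` parameter and the thresholds are illustrative assumptions, not project values.

```python
from statistics import median


def find_holes(timestamps, expected_dt, tolerance=1.5):
    """Return (start, end) timestamp pairs where the gap between two
    consecutive samples exceeds tolerance * expected_dt."""
    holes = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > tolerance * expected_dt:
            holes.append((prev, curr))
    return holes


def find_spikes(values, noise_floor, window=5, threshold=5.0):
    """Return indices of samples that deviate from the median of their
    neighbours by more than threshold * scale, where scale is the median
    absolute deviation (MAD) of the neighbours, but never less than
    noise_floor (assumed sensor noise level, same unit as the values)."""
    values = list(values)
    half = window // 2
    spikes = []
    for i, v in enumerate(values):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        # Neighbours inside the window, excluding the sample under test.
        neighbours = values[lo:i] + values[i + 1:hi]
        if len(neighbours) < 2:
            continue
        med = median(neighbours)
        mad = median(abs(x - med) for x in neighbours)
        scale = max(mad, noise_floor)
        if abs(v - med) > threshold * scale:
            spikes.append(i)
    return spikes
```

For example, `find_holes([0.0, 0.1, 0.2, 0.3, 0.9, 1.0], expected_dt=0.1)` reports the gap between 0.3 and 0.9. A lower bound on the scale is used because the MAD is zero on locally flat signals, which would otherwise flag every small fluctuation as a spike.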
In Task 2.4, several initial AI models have been built to identify the part of a mission (flight regime) that has been flown (e.g. a descent, a specific turn, etc.), based on the recorded data of a set of flight parameters. Currently the flight regimes are manually entered into the database. The models identify the flight regimes from a limited set of features derived from the flight parameters. While the initial algorithms are already very promising, the next period will focus on improving them, both by looking at the inputs (which flight parameters and which features) and by looking at the definition of the flight regimes.
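The models used in the project are not specified. Purely as an illustration of identifying a flight regime from a small feature set, the sketch below uses a nearest-centroid classifier trained on manually labelled windows. The two features (mean vertical speed and mean turn rate), the regime labels and all function names are hypothetical choices made for this example.

```python
from math import dist
from statistics import mean


def window_features(vertical_speed, turn_rate):
    """Reduce one time window to a feature vector. The two features
    (means of two hypothetical flight parameters) are assumptions."""
    return (mean(vertical_speed), mean(turn_rate))


def fit_centroids(labelled_windows):
    """labelled_windows: iterable of (features, regime_label) pairs.
    Returns {regime_label: centroid}, the per-feature mean per regime."""
    grouped = {}
    for features, label in labelled_windows:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict_regime(features, centroids):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(features, centroids[label]))


# Hand-labelled windows stand in for regimes entered manually in a database.
training = [
    ((-5.0, 0.0), "descent"), ((-4.0, 0.2), "descent"),
    ((0.1, 3.0), "turn"), ((-0.1, 2.8), "turn"),
    ((0.0, 0.0), "level"), ((0.1, 0.1), "level"),
]
centroids = fit_centroids(training)
new_window = window_features([-4.8, -5.2, -4.9], [0.1, 0.0, 0.1])
predicted = predict_regime(new_window, centroids)
```

The features are not scaled here. With parameters in different units, normalising each feature before computing distances would normally be needed.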
In Task 2.6, the task on predictive modelling, two use cases have been studied. In the first use case, dealing with accelerations, the (high) frequency content is dominant, while in the second use case, dealing with forces and moments, it plays a lesser role.
For the second use case, first versions of predictive data-driven AI models have been designed based on a set of flight parameters. The resulting correlation between data and prediction is high. For the first use case, dealing with a set of accelerations, the focus has been on reducing the time series to features that are representative of the whole data set. Once these features are available, building the predictive models will be a second step. For this use case, extracting the proper features is ongoing; for some signals the features are representative, while for others more or different features seem to be necessary.
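The features chosen in the project are not listed. As a hedged sketch of reducing an acceleration time series to a few representative features, the example below computes the RMS, the peak amplitude and the dominant frequency of a signal. The feature set and the function name are assumptions, and the naive discrete Fourier transform is used only to keep the example dependency-free; an FFT library would be the practical choice on real data.

```python
import cmath
import math


def acceleration_features(samples, sample_rate_hz):
    """Reduce a uniformly sampled signal to three summary features."""
    n = len(samples)
    mean_value = sum(samples) / n
    # Remove the constant offset so the features describe the vibration only.
    centred = [x - mean_value for x in samples]

    rms = math.sqrt(sum(x * x for x in centred) / n)
    peak = max(abs(x) for x in centred)

    # Naive O(n^2) DFT over the positive-frequency bins 1 .. n // 2.
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centred)
        )
        if abs(coefficient) > best_magnitude:
            best_bin, best_magnitude = k, abs(coefficient)

    return {
        "rms": rms,
        "peak": peak,
        "dominant_hz": best_bin * sample_rate_hz / n,
    }


# A synthetic 5 Hz sine sampled at 100 Hz for one second.
signal = [math.sin(2 * math.pi * 5 * i / 100) for i in range(100)]
features = acceleration_features(signal, sample_rate_hz=100)
```

For signals where the high-frequency content dominates, a single dominant frequency may not be representative, which is consistent with the observation that some signals need more or different features.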

The project intends to support the Topic Leader data scientists during their analyses on flight data according to the following key steps:
o Definition of a suitable computing cluster, with enough computation power and disk space to support data from several aircraft
o Definition of a ready-to-use set of basic analytic features, to be adopted for data analysis
o Adoption of intelligent support in data analytics by means of a machine-learning-based approach that specifically addresses the key issues of the aeronautic domain
o Adoption of novel AI techniques to better support the understanding of flight occurrences. In fact, one of the key features of the proposed platform is the ability to combine heterogeneous data coming from different sources: flight data, time series from sensors, environment conditions, aircraft settings, pilot interaction and other analyses
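
Combining heterogeneous sources usually starts with aligning records in time. The sketch below shows one common way to do this, an "as-of" join that attaches to each flight sample the most recent external observation (for example a weather report). The record layout and the function name are assumptions made for this example and do not describe the platform's actual implementation.

```python
from bisect import bisect_right


def asof_join(flight_rows, weather_rows):
    """Attach to each flight sample the latest weather observation taken
    at or before the sample time.

    flight_rows:  list of (timestamp, flight_value)
    weather_rows: list of (timestamp, weather_value), sorted by timestamp
    Returns a list of (timestamp, flight_value, weather_value or None).
    """
    weather_times = [t for t, _ in weather_rows]
    joined = []
    for t, flight_value in flight_rows:
        # Index of the last weather observation with timestamp <= t.
        idx = bisect_right(weather_times, t) - 1
        weather_value = weather_rows[idx][1] if idx >= 0 else None
        joined.append((t, flight_value, weather_value))
    return joined


flight = [(5, "a"), (10, "b"), (25, "c")]
weather = [(10, "w1"), (20, "w2")]
combined = asof_join(flight, weather)
```

Flight samples recorded before the first weather observation get `None`, which makes missing external data explicit instead of silently borrowing a later value.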
