Automatic collection and processing of voice data from air-traffic communications

Periodic Reporting for period 1 - ATCO2 (Automatic collection and processing of voice data from air-traffic communications)

Reporting period: 2019-11-01 to 2020-10-31

The ATCO2 project aims to develop a unique platform for collecting, organizing and pre-processing air-traffic control (voice communication) data from the airspace. The platform will be operated over the long term and will allow various kinds of bodies to access the data in order to develop different kinds of AI (machine learning) applications, specifically those related to automatic recognition of voice recordings of air-traffic controllers.

As voice is the most natural means of communication, it is currently recognized as a priority alternative for near-future human-machine interaction. Human-machine interfaces using a comprehensive speech recognizer have proven highly efficient, so they are expected to provide benefits in any currently foreseeable technological environment in air traffic management (ATM).

The overall objectives are to (i) develop a sustainable platform, (ii) collect a large volume of real-time and offline data from VHF channels, (iii) implement and integrate state-of-the-art machine learning technologies supporting active learning, (iv) expand the community of contributors and (v) fully support legal and ethical compliance.
The project has completed its first year (M1-M12). Several objectives and goals have already been achieved. More specifically:
- The project partners have finalised the VHF receiver, which was objectively evaluated using several measures.
- The project partners have started to develop a back-end platform for transferring the collected data from the receivers operated by the data feeders.
- The project partners quickly started to exploit existing VHF data (i.e. data available from different sources). In addition, the project started collecting its own VHF data through its own receivers.
- The collected data made it possible to train and evaluate first versions of machine learning models (including automatic speech recognition, callsign detection, automatic alignment of the VHF recordings with the ADS-B data available in the OpenSky Network cloud, automatic extraction of concepts from text, etc.).
- A first set of manually verified voice recordings was produced, and a first group of active contributors was contacted.
- Several studies related to legal compliance, and especially to the processing of personal data, were performed.
Significant improvements have already been achieved in automatic speech recognition of air-traffic communication. This specifically includes automatic detection of callsigns, training using automatically generated labels, and recognising concepts in the recognized text.

In a wider context, this performance makes it possible to:
- Exploit large amounts of VHF data to train the AI (machine learning) models
- Integrate the developed automatic speech recognition systems into real-world applications (i.e. ATM)
An overview of the ATCO2 project