CORDIS - EU research results

Highly Automated Air Traffic Controller Workstations with Artificial Intelligence Integration

Periodic Reporting for period 2 - HAAWAII (Highly Automated Air Traffic Controller Workstations with Artificial Intelligence Integration)

Reporting period: 2021-06-01 to 2022-11-30

The Controller Working Position (CWP) is the centre of human-machine interaction for Air Traffic Controllers (ATCos). The tools and platforms used by air traffic controllers shape their way of operating, their mental models and ultimately their performance, which in turn determines how safely and efficiently air traffic is managed. Voice communication between ATCos and pilots, however, is still not digitalised and therefore not accessible for machine analysis.
To change the current status, the SESAR Master Plan vision includes increasing digitalisation and automation of Air Traffic Control (ATC). Digitalisation of ATCo-pilot communication by Automatic Speech Recognition (ASR) is a cornerstone of this vision. Data link communication, i.e. Controller Pilot Data Link Communication (CPDLC) or other text-based transmission of data, and ASR are not competing but complementary technologies: ASR can, for example, replace ATCos' mouse-based inputs into assistant systems. Voice is the most efficient means of human-human communication, so why not benefit from ASR in human-machine communication as well? Solution PJ.16-04 of SESAR 2020 Wave 1 has recently demonstrated that current ASR systems mainly target the everyday consumer market and are not suited to safety- and time-critical application areas such as Air Traffic Management (ATM). As shown in the past, ATCos' limited trust in and acceptance of low-fidelity tools is an obstacle to deployment.
Nevertheless, several past projects have demonstrated the usefulness of ASR: (i) AcListant® quantified the benefits with respect to workload reduction and performance increase, (ii) the MALORCA project demonstrated commercial viability and enhanced models, (iii) SESAR 2020 solution PJ.16-04 explored industrial integration and requirements building.
So far, these precursor projects have focused on research and development of a set of models, further deployed in relatively simple and controlled application domains. The HAAWAII project proposes to target complex and challenging environments and, more importantly, wider applications of automatically recognized voice communications. More specifically, HAAWAII proposes two general objectives:
1. Research and develop data-driven (machine learning oriented) approaches to be deployed for novel and complex environments from two large ANSPs, demonstrating an increased validity of the tools;
2. Demonstrate the wider applicability of the tools in ATM, focussing on generating benefits for the ATCos and the ANSPs, i.e. reducing workload, increasing both efficiency and safety.
Overall, HAAWAII intends to focus on the following applications:
• Pilot readback error detection,
• Modelling and being able to anticipate controller behaviour,
• Pre-filling radar labels and CPDLC messaging using the automatic speech recognition,
• Human performance metric extraction.
NATS and Isavia ANS have each provided more than twenty hours of transcribed voice communication between air traffic controllers (ATCos) and pilots. Additionally, 380 hours of untranscribed voice recordings from NATS and 60 hours from Isavia ANS, including the corresponding surveillance data, are available.
• Roughly 4 hours of the manual transcriptions for each ANSP have also been annotated, i.e. the semantic ATC concepts have been extracted. “Lufthansa one two alfa after dexon descend ten thousand feet or below reduce two twenty” results on the semantic level in “DLH12A DESCEND 10000 ft OR_BELOW WHEN PASSING DEXON, DLH12A SPEED 220 none”. The trailing “none” indicates, for example, that the unit of the speed value was not provided.
• The ontology, i.e. the rules which map transcriptions to annotations, has been extended by the HAAWAII team and made available to other speech recognition projects.
• The HAAWAII team has published more than 30 blog posts on www.haawaii.de.
• The HAAWAII team has produced more than 20 (peer-reviewed) publications for conferences and journals.
• The main application of the HAAWAII project was to develop a Readback Error Detection Assistant (REDA).
• Two different readback error detection approaches have been developed and tested; a simplified sketch of the underlying concept comparison follows after this list. The results have been presented in a paper at the SESAR Innovation Days 2022 in Budapest. The main results are:
o A readback error detection rate of 82% was achieved, with a false alarm rate of 67%.
o Word error rates of below 3% have been achieved for controller utterances from both NATS and Isavia ANS on unseen test data from the ops room environment.
• The HAAWAII partners have organized two stakeholder workshops, one in Vienna in 2021 and one in Swanwick, UK, in 2022. Both were attended by more than 50 people.
• The HAAWAII partners have organized two demo days, one in Reykjavik in May 2022 and one in Swanwick, UK, in September 2022.
• The HAAWAII team suggested modifications of the enablers (EN) and of the operational improvement steps (OI), which will be considered in the next SESAR calls.
• Before the false alarm rate of readback error detection is improved further, evaluations are necessary to find out what really helps the ATCos, and more readback error samples taken directly from the ops room are needed. The HAAWAII REDA could already be integrated into the ops room for offline analysis of possible readback error use cases, so that only a subset of ATCo-pilot communications would need to be manually analysed by subject matter experts.
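As an illustration of the readback error detection principle mentioned above, the following is a minimal sketch in Python. It assumes that both the ATCo instruction and the pilot readback have already been converted by Text-to-Concept into structured commands such as “DLH12A DESCEND 10000 ft”; a missing or deviating readback value is then flagged as a potential readback error. The Concept structure and function names are illustrative assumptions, not the project's actual implementation.

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Concept:
    """One extracted ATC concept, e.g. 'DLH12A DESCEND 10000 ft'."""
    callsign: str            # e.g. "DLH12A"
    command: str             # e.g. "DESCEND", "SPEED", "HEADING"
    value: Optional[str]     # e.g. "10000", "220"
    unit: Optional[str]      # e.g. "ft", "kt", or None if not spoken

def detect_readback_errors(instruction: list[Concept],
                           readback: list[Concept]) -> list[str]:
    """Compare the concepts of an ATCo instruction with the pilot readback.

    Returns human-readable discrepancy descriptions; an empty list means
    no potential readback error was detected.
    """
    findings = []
    for cmd in instruction:
        # Look for a readback concept with the same callsign and command type.
        match = next((r for r in readback
                      if r.callsign == cmd.callsign and r.command == cmd.command),
                     None)
        if match is None:
            findings.append(f"{cmd.callsign} {cmd.command}: not read back")
        elif match.value != cmd.value:
            findings.append(f"{cmd.callsign} {cmd.command}: instructed "
                            f"{cmd.value}, read back {match.value}")
    return findings

# Example based on the annotation shown above (readback values are illustrative).
instruction = [Concept("DLH12A", "DESCEND", "10000", "ft"),
               Concept("DLH12A", "SPEED", "220", None)]
readback = [Concept("DLH12A", "DESCEND", "10000", "ft"),
            Concept("DLH12A", "SPEED", "210", None)]   # wrong speed read back

print(detect_readback_errors(instruction, readback))
# -> ['DLH12A SPEED: instructed 220, read back 210']

In practice, the reported false alarm rate shows that such a comparison must also tolerate legitimate variations in how pilots read back instructions, which is why further evaluations with ops-room samples are needed.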
As early as 1952, Alan Turing pointed out that speech recognition, i.e. just getting the words, is not speech understanding, i.e. also getting the semantics of the word combinations. The HAAWAII project developed a dedicated architecture for this purpose, which has already been successfully reused in SESAR industrial research and other projects. The HAAWAII architecture means:
• to use Assistant Based Speech Recognition (ABSR), which integrates information about the available callsigns from flight plan and surveillance data into Speech-to-Text (so-called callsign boosting) and Text-to-Concept,
• to make very clear that speech recognition (Speech-to-Text) does not include speech understanding (Text-to-Concept),
• to use context from the previous utterance in Text-to-Concept, e.g. “two zero zero thank you” in a pilot readback is very probably an altitude readback, and not a speed or heading readback, if the ATCo has just issued a CLIMB command to flight level 200 (see the sketch after this list),
• to integrate command validation in Text-to-Concept phase,
• to have the same acoustic and language model for ATCo and pilot utterances,
• to have a separate block for voice activity detection (VAD), which either relies on push-to-talk (PTT) availability or needs to evaluate the input wave signal in more detail,
• to repair over- or under-splitting in the Text-to-Concept phase.
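As an illustration of the context rule mentioned in the list above, the following minimal Python sketch interprets an ambiguous pilot readback such as “two zero zero thank you” as an altitude, speed or heading value depending on the command type of the preceding ATCo utterance. The function name and the simplified context representation are assumptions for illustration only, not the project's implementation.

from typing import Optional

def interpret_ambiguous_readback(words: str,
                                 previous_command: Optional[str]) -> str:
    """Map an ambiguous numeric pilot readback to a concept using context.

    previous_command is the command type of the last ATCo instruction for
    this callsign, e.g. "CLIMB", "SPEED" or "HEADING".
    """
    digits = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
              "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
    # Keep only the spoken digits, e.g. "two zero zero thank you" -> "200".
    value = "".join(digits[w] for w in words.split() if w in digits)

    if previous_command in ("CLIMB", "DESCEND"):
        return f"ALTITUDE FL{value}"       # altitude readback is most probable
    if previous_command == "SPEED":
        return f"SPEED {value} none"       # unit not spoken -> "none"
    if previous_command == "HEADING":
        return f"HEADING {value}"
    return f"UNKNOWN {value}"              # no usable context available

print(interpret_ambiguous_readback("two zero zero thank you", "CLIMB"))
# -> ALTITUDE FL200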
(Figure: HAAWAII use cases)