While speech recognition is becoming increasingly sophisticated in consumer products, some sectors that could really benefit are lagging behind. Air traffic control digital assistants access air and ground traffic sensors, but voice communication between controllers and pilots is not automatically available to them, despite being the most valuable source of information. To digitise voice communications, controllers must transcribe the information, which takes up to one third of their time.

The HAAWAII (Highly Automated Air Traffic Controller Workstations with Artificial Intelligence Integration) project, funded within the framework of the SESAR Joint Undertaking, has developed new speech recognition software based on deep neural networks. “As Alan Turing pointed out in the 1950s, speech recognition is not speech understanding, so we also worked on that,” says Hartmut Helmke, project coordinator from the German Aerospace Center, the project host. “We’ve achieved a word error rate of under 5 %, translating to a command recognition rate of over 85 % for air traffic controllers.”
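Word error rate, the metric quoted above, is the standard measure for speech recognisers: the number of word-level insertions, deletions and substitutions needed to turn the recognised text into the reference transcript, divided by the reference length. A minimal illustrative sketch (not project code, and the sample phrases are invented):

```python
# Illustrative sketch: word error rate (WER) as word-level edit
# distance between a reference transcript and a recogniser hypothesis,
# divided by the number of reference words.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted word out of six gives a WER of 1/6.
print(word_error_rate("climb flight level one two zero",
                      "climb flight level one three zero"))
```

Note that a single misrecognised word can corrupt a whole controller command, which is why the command recognition rate (over 85 %) sits below the word-level accuracy (over 95 %).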
Readback error detection for ATM
Speech recognition of air traffic controllers and pilots remains challenging due to local versions of standardised phraseology, different accents in English (the international language of aviation), different speaking speeds and noisy channels. HAAWAII partners NATS and Isavia, the United Kingdom’s and Iceland’s air navigation service providers respectively, recorded over 500 hours of controller-pilot voice communications. Forty hours of this were then manually transcribed word for word. After supplying the HAAWAII speech recogniser software with just one hour of manually transcribed data, word recognition improved twofold. After training it on all the transcribed and untranscribed data, the word recognition rate was over 95 % for controllers and over 90 % for pilots. “The real problem with incorrectly recognised words is when they relate to safety-critical information, such as call signs or waypoint names. Combining voice with radar data enabled our system to improve at the semantic level. For example, we achieved a recognition rate of 97 % for aircraft call signs used by controllers,” remarks Helmke.

Machine learning was also used to create a Readback Error Detection Assistant (REDA). Readback errors occur where, for example, a controller gives clearance for a pilot to climb to 7 000 feet but the pilot repeats this as 8 000 feet, risking a collision if undetected. The REDA generates an alert when these errors occur. The REDA was evaluated in a laboratory by five air traffic controllers from Iceland. “The detection rate for readback errors during these lab tests was over 80 % in offline evaluations of transcriptions from real-life data, with a false alarm rate below 20 %,” adds Helmke. Trials with NATS controllers are planned for this year, with Isavia also intending to demonstrate the REDA in their own operational environment.
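The core idea behind readback error detection is to extract the safety-critical values (such as a cleared altitude) from both the controller’s transcribed instruction and the pilot’s transcribed readback, then flag any mismatch. A hypothetical sketch of that comparison step, not the REDA implementation, with an assumed phrasing pattern for altitudes:

```python
import re

# Hypothetical readback check (illustrative only): pull the altitude in
# feet out of a transcript and compare clearance against readback.
def extract_altitude(transcript: str):
    # Assumes altitudes appear as a number followed by "feet",
    # e.g. "climb to 7000 feet".
    m = re.search(r"(\d{3,5})\s*feet", transcript)
    return int(m.group(1)) if m else None

def readback_mismatch(clearance: str, readback: str) -> bool:
    cleared = extract_altitude(clearance)
    read_back = extract_altitude(readback)
    # Only flag an error when both sides yielded a value and they differ.
    return (cleared is not None
            and read_back is not None
            and cleared != read_back)

# Controller clears 7000 feet, pilot reads back 8000 feet: mismatch.
print(readback_mismatch("BAW123 climb to 7000 feet",
                        "climbing to 8000 feet BAW123"))
```

In practice the hard part is upstream of this comparison: recognising the spoken words reliably and mapping local phraseology variants onto a common command representation, which is where the project’s deep neural networks and the radar-data context come in.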
Increased accuracy and reduced workload
By reducing the workload and increasing the accuracy of air traffic controllers, effective speech recognition and understanding could significantly increase air safety. Thousands of hours of transcriptions also offer air navigation service providers useful management information, such as how often certain commands are given or repeated per aircraft, either of which can indicate high workload. Speech recognition could also be used to support on-the-job simulations, making training cheaper and possible remotely. “Our prototype has worked around London, Europe’s most congested airspace, and also in Isavia’s airspace, covering over 5 000 000 square kilometres. It understood pilots’ voices, with a word error rate under 10 %, despite accents from around the world, not to mention very noisy voice channels,” concludes Helmke.
HAAWAII, speech recognition, air traffic control, words, commands, airspace, workload, air safety, transcriptions, error detection, voice, radar