Automatic Speech Recognition and Call Sign Detection on Air Traffic Control Data

Air traffic control (ATC) is an area of work with high workload for pilots and controllers. Automatic speech recognition (ASR) and call sign detection (CSD) could help reduce this workload. There are many pre-existing systems of automatic speech recognition, such as the sequence-to-sequence time-depth separable architecture by Hannun et al. (2019). Not many studies exist on call sign detection, one of the few being the Airbus challenge, where the team of Gupta et al. (2019) applied a bi-directional long short-term memory and conditional random field classifier architecture. This study aims to investigate how these pre-existing systems perform on a different corpus of ATC data, and further, how errors of either system affect the final accuracy of the systems in sequence. The corpus investigated here is the Air Traffic Control Communication (ATCC) corpus by Šmídl (2011). The ASR and CSD systems were trained on the ATCC data and tested both individually and in combination. The research showed that ASR performs poorly with a word error rate of 33.08%, while CSD achieves a good F-score of 0.8509, and the combined systems' performance is again quite poor with an F-score of 0.4931. The results indicate that the combined systems' performance is strongly influenced by transcription errors of the ASR system, more so than by errors of the CSD system.
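For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a word error rate and an F-score are commonly computed. It is not the evaluation code used in the study: the function names `word_error_rate` and `f_score`, the whitespace tokenisation, and the example transmission are all assumptions made for illustration, and the study may have counted call sign matches at a different granularity.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumes simple whitespace tokenisation (an illustrative choice).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-score (harmonic mean of precision and recall)."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical transmission: the recogniser drops the word "flight".
reference = "lufthansa four five six climb flight level three two zero"
hypothesis = "lufthansa four five six climb level three two zero"
print(word_error_rate(reference, hypothesis))
print(f_score(true_pos=8, false_pos=2, false_neg=2))
```

Because the call sign detector in a combined pipeline only sees the recogniser's output, a single misrecognised call sign word can turn a correct detection into both a false negative and a false positive, which is one way transcription errors can depress the combined F-score.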
Faculteit der Sociale Wetenschappen