Detecting mispronunciation of digit nine in ATC communication using keyword spotting
Keywords
Authors
Issue Date
2020-07-01
Language
en
Abstract
Air Traffic Control (ATC) communication is an important process to ensure aviation safety. A minor
mistake can lead to a disastrous event. Therefore, communication mistakes by pilots and controllers should
be prevented. Previous studies have investigated the communication mistakes that are made and their
contributing factors. However, research on automatically detecting communication mistakes is still limited,
even though such detection can provide better insights into the mistakes being made and help prevent them. One of the
common mistakes made is the mispronunciation of the digit nine. Therefore, this thesis aims to detect
the mispronunciation of the digit nine by pilots and controllers. A keyword spotting system based on
convolutional recurrent neural networks by Kim and Nam (2019) is used to detect mispronunciations of the
digit nine in ATC audio fragments. Furthermore, three different class imbalance techniques are explored to
improve the model performance: random oversampling, weighted random sampling, and weighted cross-entropy
loss. The results of the techniques are analyzed both individually and comparatively to determine
which technique is best suited for the model and dataset. The results of this thesis indicate that the model
with weighted cross-entropy loss can detect the pronunciations significantly above chance level. However,
further improvement of the model is still necessary to achieve at least the same results as Kim and Nam
(2019) and to help reduce ATC communication mistakes.
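
As a rough illustration of the three class imbalance techniques named in the abstract, the sketch below applies each of them to a toy label set using only the Python standard library. It is not the thesis implementation: the 8-to-2 label split, the integer labels 0 and 1, the inverse-frequency weighting scheme, the normalization of the loss by the sum of weights, and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def random_oversample(samples, labels, rng):
    """Duplicate randomly chosen minority samples until all classes have equal size."""
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(items) for items in by_class.values())
    out = []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out.extend((s, y) for s in items + extra)
    rng.shuffle(out)
    return out


def weighted_random_sample(samples, labels, class_weights, k, rng):
    """Draw k (sample, label) pairs with replacement, proportional to class weight."""
    per_sample = [class_weights[y] for y in labels]
    return rng.choices(list(zip(samples, labels)), weights=per_sample, k=k)


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log p(true class), normalized by the sum of weights."""
    total = sum(class_weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(class_weights[y] for y in labels)


# Hypothetical toy data: 8 majority-class items (0) and 2 minority-class items (1).
rng = random.Random(0)
labels = [0] * 8 + [1] * 2
samples = list(range(10))

weights = inverse_frequency_weights(labels)
balanced = random_oversample(samples, labels, rng)
drawn = weighted_random_sample(samples, labels, weights, k=10, rng=rng)

# A classifier that always favours the majority class with probability 0.9.
probs = [[0.9, 0.1]] * 10
loss_plain = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
loss_weighted = weighted_cross_entropy(probs, labels, weights)
```

In this toy setting the weighted loss penalizes the majority-biased predictions more heavily than the unweighted loss, which is the effect a weighted cross-entropy loss is meant to have on an imbalanced dataset.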
Description
Citation
Supervisor
Faculty
Faculteit der Sociale Wetenschappen