Automatic Data Creation for Automatic Speech Recognition Systems in the Air Traffic Control Domain: A Four-Step Pipeline
This research investigates how weighted Finite State Transducers (FSTs) can be used to combat one of the main problems in applying Automatic Speech Recognition (ASR) to the air traffic control (ATC) domain: data sparsity, i.e. the lack of publicly available ATC data to train on. The approach is a four-step pipeline in which FSTs generate corpora of new sentences, which are then converted into Statistical Language Models (SLMs) represented as ARPA files. The corpora can serve as a source of new data, while the SLMs capture the structure of this data and can also act as language models for ASR systems. The first three steps of the pipeline are implemented; the fourth is discussed theoretically. The contents of the SLMs generated with this method are analyzed and their efficacy is discussed. While applying this method to ATC data is useful in theory, it has shortcomings stemming from data sparsity, from decisions made in this research, and from the nature of FSTs. Finally, ways in which these shortcomings could be addressed are discussed.
Faculteit der Sociale Wetenschappen