Automatic Data Creation for Automatic Speech Recognition Systems in the Air Traffic Control Domain: A Four-Step Pipeline

Neurink, Yitsz

Files

Neurink, Y. s-4673530-BSc-Thesis-2023.pdf (852.13 KB)

Authors

Neurink, Yitsz

Issue Date

2022-11-01

Language

en

URI

https://theses.ubn.ru.nl/handle/123456789/16044

Abstract

This thesis investigates how weighted Finite State Transducers (FSTs) can be used to combat one of the main problems in applying Automatic Speech Recognition (ASR) to the air traffic control (ATC) domain: data sparsity, i.e. the lack of publicly available ATC data to train on. The approach is a four-step pipeline in which FSTs generate corpora of new sentences, which are then translated into Statistical Language Models (SLMs) represented as ARPA files. The corpora can act as a source of new data, and the SLMs capture the specifics of the structure of this data. In addition, the SLMs in the ARPA files can act as language models for ASR systems. The first three steps of the pipeline are implemented and the fourth is discussed theoretically. The SLMs generated using this method are analyzed on their contents and their efficacy is discussed. While applying this method to ATC data is useful in theory, it has shortcomings stemming from data sparsity, decisions made in this research, and the nature of FSTs. Finally, it is discussed how these shortcomings could be addressed.

Supervisor

Bosche, ten, L.F.M.

Leone, F.T.M.

Faculty

Faculteit der Sociale Wetenschappen

Programme

Artificial Intelligence

Specialisation

Bachelor Artificial Intelligence

Collections

Faculteit der Sociale Wetenschappen
