Automatic extraction of characterizing features for non-native Dutch read speech

Bosland, Rosa

Files

Bosland, R. s-1010114-BSc-Thesis-2022.pdf (610.38 KB)

Authors

Bosland, Rosa

Issue Date

2022-01-25

Language

en

URI

https://theses.ubn.ru.nl/handle/123456789/15833

Abstract

Although finding characteristic features of atypical English speech has been the topic of many studies, little research has been done on atypical Dutch speech. In 2008, the JASMIN speech corpus was completed, a spoken Dutch corpus containing speech from children, elderly speakers, and non-native speakers of Dutch [1]. In this thesis, non-native Dutch read speech from JASMIN is compared to native read speech to find out which features are characteristic of non-native Dutch speech. The pipeline automatically computes 103 word-level Praat and eGeMAPS features from speech recordings and transcriptions, ranks these features with a Recursive Feature Elimination (RFE) method, classifies them in binary comparisons using a Support Vector Machine (SVM), and finally evaluates them using statistical tests. With this approach, the research succeeded in automatically extracting characteristic features of non-native Dutch read speech. In binary comparisons with native speech, 93 out of 103 features were found to differ significantly. Two characteristic and partly overlapping sets of features were found: the first based on the RFE ranking, the second based on an individual effect size ranking. Both sets support the hypotheses that a lower speaker volume and lower order Mel-frequency cepstral coefficients are characteristic of non-native Dutch speech, and both show indications of a slower reading pace for non-native speakers. Moreover, formant-related features were prevalent in both rankings, indicating a different shape of the vocal tract owing to deviations in non-native pronunciation compared to native speakers.
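The individual effect size ranking mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which effect size measure the thesis uses; Cohen's d with a pooled standard deviation is assumed here purely for illustration. The feature names (`loudness`, `word_duration`, `f1_mean`) and all values are invented toy data, not results from JASMIN or from the thesis.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def rank_by_effect_size(native, non_native):
    """Return (feature, d) pairs sorted by |d|, largest first.

    A positive d means the non-native group has the higher mean.
    """
    scores = {name: cohens_d(non_native[name], native[name]) for name in native}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical word-level feature values per speaker group (invented numbers).
native = {
    "loudness": [0.82, 0.79, 0.85, 0.80, 0.84],
    "word_duration": [0.31, 0.29, 0.33, 0.30, 0.32],
    "f1_mean": [510.0, 495.0, 530.0, 505.0, 520.0],
}
non_native = {
    "loudness": [0.70, 0.68, 0.74, 0.71, 0.69],
    "word_duration": [0.36, 0.33, 0.38, 0.35, 0.37],
    "f1_mean": [515.0, 500.0, 525.0, 512.0, 518.0],
}

ranking = rank_by_effect_size(native, non_native)
for name, d in ranking:
    print(f"{name}: d = {d:+.2f}")
```

Ranking by the absolute value of d keeps features that are lower for non-native speakers (such as volume) on the same scale as features that are higher (such as word duration). The sign of d still shows the direction of the difference.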

Supervisor

Strik, W.A.J.

Heijden, van der, K.A.

Faculty

Faculteit der Sociale Wetenschappen

Programme

Artificial Intelligence

Specialisation

Bachelor Artificial Intelligence

Collections

Faculteit der Sociale Wetenschappen
