Data Augmentation for end-to-end Auditory Attention Decoding

Hearing-impaired people struggle to understand others in busy multi-speaker environments. While hearing aids currently employ noise-reduction techniques, they do not yet have information about which speaker the user wants to attend to. Neuroscientific research has shown that the human auditory cortex encodes the attended speaker differently from unattended speakers, which has given rise to the field of auditory attention decoding. This field seeks to decode which speaker a person is listening to from neuroimaging data. Existing techniques such as linear and non-linear stimulus reconstruction have shown mixed performance. Therefore, this thesis implements two auditory attention decoding networks in the direct classification paradigm: a feed-forward fully connected neural network and a convolutional network. In addition, basic data augmentation methods are investigated, in particular their impact on the accuracy of the networks. In line with previous findings, the results showed that both networks perform at chance level for a wide variety of hyperparameters, and that the data augmentation methods used did not improve performance. Future work should focus on testing models on multiple independent datasets to ensure that the network does not become too tailored to a specific dataset.
Faculteit der Sociale Wetenschappen