Learning to localize and classify spoken digits: a comparison of two SNN-frameworks

Issue Date
2022-01-01
Language
en
Abstract
The human brain localizes and recognizes sounds efficiently and accurately. Biological neurons convey information through the precise timing of spikes in spike trains. This efficient information processing motivates attempts to mimic the process computationally with spiking neural networks (SNNs). In this thesis, two SNN frameworks are proposed that learn to localize and classify a set of spoken digits. Both frameworks make use of Legendre Memory Units and convolution layers, but their overall structure differs. The first framework uses a single neural network to classify and localize the digits, whereas the second framework uses ten sub-networks to localize each digit separately. Results show that the first framework performs better in terms of accuracy and computational cost, while the structure of the second framework provides more flexibility. The described frameworks could potentially be useful in modeling human speech recognition and localization, but they require substantial further research before they can perform in the real world.
Faculty
Faculteit der Sociale Wetenschappen