A biologically-inspired recurrent neural network model for multi-source sound categorization and localization

Humans are continuously exposed to a variety of sounds. They localize and categorize these sounds with ease, relying on spatial cues and on an auditory system with two distinct pathways: the dorsal stream and the ventral stream. I present a recurrent neural network model, with an architecture inspired by the structure of these two pathways in the human brain, that localizes and categorizes sounds. The model is trained on sound scenes in an anechoic environment; each scene contains two different real-life sounds spatialized in the frontal hemifield. After initial training, the architecture is modified to achieve better results. The outcome is a network architecture that is slightly biased towards the localization task, which is reasonable since this is the more difficult of the two tasks. The experimental results show that the model performs well on both tasks on an unseen set of the original database. However, it performs much worse on an independent evaluation set; thus, it does not generalize. Nonetheless, provided that a strong bias that causes a large decrease in performance can be removed, this experiment highlights possibilities for a similar network structure in future experiments. For example, one could perform a similar experiment with more complex sound scenes, such as scenes in a reverberant environment with background noise.
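To make the two-pathway idea in the abstract concrete, the following is a minimal sketch of one way such a network could be laid out. It is not the architecture from the thesis. It assumes a shared recurrent layer that feeds a categorization ("what") branch and a localization ("where") branch. The class name `TwoStreamRNN`, all layer sizes, the choice of giving the localization branch more units, and the sigmoid outputs (so that two simultaneous sources can be active at once) are hypothetical. Weights are random and untrained; only the forward pass is shown.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a small random weight matrix as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TwoStreamRNN:
    """Hypothetical layout: shared recurrent layer, then two branches."""

    def __init__(self, n_in, n_shared, n_what, n_where, n_classes, n_angles, seed=0):
        rng = random.Random(seed)
        self.n_shared = n_shared
        self.w_in = make_matrix(n_shared, n_in, rng)        # input -> shared
        self.w_rec = make_matrix(n_shared, n_shared, rng)   # shared -> shared (recurrence)
        self.w_what = make_matrix(n_what, n_shared, rng)    # shared -> "what" branch
        self.w_where = make_matrix(n_where, n_shared, rng)  # shared -> "where" branch
        self.out_what = make_matrix(n_classes, n_what, rng)
        self.out_where = make_matrix(n_angles, n_where, rng)

    def forward(self, frames):
        """Run a sequence of feature frames and return (class scores, angle scores)."""
        h = [0.0] * self.n_shared
        for x in frames:
            pre = [a + b for a, b in zip(matvec(self.w_in, x), matvec(self.w_rec, h))]
            h = [math.tanh(p) for p in pre]
        what = [math.tanh(v) for v in matvec(self.w_what, h)]
        where = [math.tanh(v) for v in matvec(self.w_where, h)]
        # Independent sigmoids: several classes or directions may be active together.
        classes = [sigmoid(v) for v in matvec(self.out_what, what)]
        angles = [sigmoid(v) for v in matvec(self.out_where, where)]
        return classes, angles


# Assumed toy sizes: the "where" branch gets more units than the "what" branch.
model = TwoStreamRNN(n_in=4, n_shared=8, n_what=4, n_where=6, n_classes=3, n_angles=5)
frames = [[0.1, 0.2, 0.3, 0.4]] * 10  # ten identical dummy feature frames
classes, angles = model.forward(frames)
```

In this sketch, `classes` holds one score per assumed sound category and `angles` one score per assumed direction bin. Giving the localization branch more units is only one possible reading of an architecture "biased towards the localization task".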
Faculteit der Sociale Wetenschappen