Reconstructing Speech Input from Convolutional Neural Network Activity

Issue Date
2015-07-13
Language
en
Abstract
Convolutional Neural Networks (CNNs) applied to the auditory domain have achieved strong results. However, little research has examined the underlying mechanisms that allow auditory CNNs to reach this performance. This exploratory research attempts to help uncover these mechanisms by using the activation patterns that speech inputs generate in a CNN to reconstruct those inputs. The best results are obtained by training a decoder to map activation patterns to a preliminary reconstruction and subsequently fine-tuning that reconstruction through further backpropagation. The reconstructions show that the network preserves a good representation of the input up to and including the fully connected units. In support of this, the representation appears to be suited specifically to speech rather than, for example, to audio in general. It also becomes apparent that the network is insensitive to input intensity as well as to the input's activation scale in the time domain. Further research is required to discover more properties of CNNs applied to the auditory domain.
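The two-stage procedure described in the abstract (decode activations to a preliminary reconstruction, then refine it by backpropagating to the input) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the single `tanh` layer `W` stands in for the trained CNN, the least-squares matrix `D` stands in for the trained decoder, and the squared-error loss, step count, and learning rate are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained network layer: a = tanh(W x).
# The actual network in the thesis is a CNN; this toy layer is an assumption.
n_in, n_act = 16, 32
W = rng.normal(scale=0.3, size=(n_act, n_in))


def activations(x):
    """Forward pass of the toy layer for one input or a batch of inputs."""
    return np.tanh(x @ W.T)


# Stage 1: fit a linear decoder D (activations -> input) by least squares.
# This stands in for "training a decoder" on example inputs.
train_x = rng.normal(size=(500, n_in))
train_a = activations(train_x)
D, *_ = np.linalg.lstsq(train_a, train_x, rcond=None)


def activation_loss(x, target_a):
    """Squared mismatch between the activations of x and the target."""
    return 0.5 * np.sum((activations(x) - target_a) ** 2)


def reconstruct(target_a, steps=500, lr=0.02):
    """Decode a preliminary input, then refine it by gradient descent."""
    x = target_a @ D  # preliminary reconstruction from the decoder
    for _ in range(steps):
        a = activations(x)
        err = a - target_a
        # Gradient of the loss w.r.t. the input, backpropagated through tanh.
        grad = (err * (1.0 - a ** 2)) @ W
        x = x - lr * grad
    return x


# Usage: reconstruct an unseen input from its activation pattern alone.
x_true = rng.normal(size=n_in)
target = activations(x_true)
x_prelim = target @ D
x_final = reconstruct(target)

loss_prelim = activation_loss(x_prelim, target)
loss_final = activation_loss(x_final, target)
print(f"activation mismatch: decoder only {loss_prelim:.4g}, fine-tuned {loss_final:.4g}")
```

In this sketch the fine-tuning step only adjusts the candidate input, not the network weights, so any improvement comes from matching the recorded activation pattern more closely than the decoder alone does.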
Faculty
Faculteit der Sociale Wetenschappen