Reconstructing Speech Input from Convolutional Neural Network Activity

Issue Date

2015-07-13

Language

en

Abstract

Convolutional Neural Networks (CNNs) applied to the auditory domain have achieved strong results. However, little research has examined the underlying mechanisms that allow auditory CNNs to reach this performance. This exploratory study helps uncover these mechanisms by using the activation patterns that speech inputs generate in a CNN to reconstruct those inputs. The best results are obtained by training a decoder to map activity patterns to a preliminary reconstruction and subsequently fine-tuning that reconstruction through further backpropagation. The reconstructions show that the network preserves a good representation of the input up to and including the fully connected units. This representation also appears to be suited to speech specifically rather than, for example, to audio in general. Furthermore, the network turns out to be insensitive to input intensity as well as to the input's activation scale in the time domain. Further research is required to discover more properties of CNNs applied to the auditory domain.
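
The fine-tuning stage of the two-stage procedure described in the abstract can be illustrated with a toy sketch. This is not the thesis implementation: a single tanh layer with random weights stands in for the CNN, and the "preliminary reconstruction" is simulated by adding noise to the true input rather than produced by a trained decoder. All names (`forward`, `refine`, `preliminary`) and all parameter values are assumptions made for illustration. The sketch only shows the general idea of adjusting a candidate input by gradient descent until its activations match the recorded ones.

```python
import math
import random


def forward(W, x):
    """Toy stand-in for network activations: tanh of a linear map."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def activation_loss(W, x, target):
    """Squared error between the activations of x and the target activations."""
    return sum((a - t) ** 2 for a, t in zip(forward(W, x), target))


def refine(W, x0, target, lr=0.05, steps=500):
    """Fine-tune a reconstruction by gradient descent on the input itself."""
    x = list(x0)
    for _ in range(steps):
        act = forward(W, x)
        # Gradient w.r.t. the pre-activations, using d tanh(z)/dz = 1 - tanh(z)^2.
        dz = [2.0 * (a - t) * (1.0 - a * a) for a, t in zip(act, target)]
        # Backpropagate through the linear map to obtain the gradient w.r.t. x.
        grad = [sum(dz[j] * W[j][i] for j in range(len(W))) for i in range(len(x))]
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x


random.seed(0)
n_in, n_out = 4, 8  # hypothetical sizes, chosen only to keep the example small
W = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
signal = [random.uniform(-0.5, 0.5) for _ in range(n_in)]
target = forward(W, signal)  # the "recorded" activation pattern

# Assumption: a trained decoder would supply this; here it is the true
# input plus noise, which imitates a rough first-pass reconstruction.
preliminary = [s + random.uniform(-0.3, 0.3) for s in signal]
refined = refine(W, preliminary, target)

loss_before = activation_loss(W, preliminary, target)
loss_after = activation_loss(W, refined, target)
print("activation loss before:", loss_before)
print("activation loss after: ", loss_after)
```

In this toy setting the refinement step lowers the mismatch between the target activations and those of the candidate input. Whether the refined input also lies closer to the original signal depends on how much information the activations retain, which is the question the abstract addresses.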

Faculty

Faculteit der Sociale Wetenschappen