Understanding the Features of a Convnet Trained for Phone Recognition

dc.contributor.advisor: Gerven, M.A.J. van
dc.contributor.advisor: Güçlü, U.
dc.contributor.author: Kemper, D.
dc.date.issued: 2015-07-13
dc.description.abstract: For convolutional neural networks (convnets) trained for image recognition, it is known what the features represent. For convnets trained for phone recognition, however, this is not yet known. This study tried to answer the following question: What do the features of such a convnet represent? A convnet with three convolutional layers was trained on the TIMIT phone recognition task, and a deconvnet was applied to obtain visualizations of its features. In experiment 1, the deconvnet was applied to the activation caused by the top 4 input phones per feature. In experiment 2, it was applied to the activation caused by the top 3 average phones per feature. Phone label analysis revealed consonant-, front vowel- and back vowel-sensitive features in the third layer. In both experiments, the visualizations were hard to interpret. It could be that visualizing features that represent aspects of audio is not the best way to gain insight into them, although more experiments with different convnet architectures should be run to confirm this. Future research could look for other ways to gain insight into what the features represent, for example by further exploring the possibilities of phone label analysis.
dc.identifier.uri: http://theses.ubn.ru.nl/handle/123456789/217
dc.language.iso: en
dc.thesis.faculty: Faculteit der Sociale Wetenschappen
dc.thesis.specialisation: Bachelor Artificial Intelligence
dc.thesis.studyprogramme: Artificial Intelligence
dc.thesis.type: Bachelor
dc.title: Understanding the Features of a Convnet Trained for Phone Recognition
Files
Original bundle
Name: Kemper, D._BA_Thesis_2015.pdf
Size: 899.51 KB
Format: Adobe Portable Document Format
Description: Thesis text