Music Emotion Recognition using Multi-Output Gaussian Processes
Issue Date
2021-01-19
Language
en
Abstract
Recognition of emotion in music is an important subject within the domain
of Music Information Retrieval. When emotion is defined by the continuous
parameters arousal and valence, the problem of emotion recognition can
be solved with a regression approach. In general, arousal and valence are
assumed to be independent; however, this paper investigates whether including
a correlation between the variables improves the accuracy of emotion
recognition in music. A Gaussian Process model is proposed to solve this
regression problem. Gaussian Processes (GP) return a distribution over
functions which is described by a mean and a covariance function. This
means that the GP are flexible in describing data and can make accurate
predictions based on a small amount of training input. A Multi-Output
GP (MOGP) model is used to capture the correlation between arousal and
valence. The MOGP model adds an extra dimension to the covariance
matrix to include the covariance of the outputs. The MOGP is compared
with a normal GP and a Singular Value Regression model on the task of
predicting arousal and valence values for songs using auditory features. The
GP and MOGP obtain equal accuracies for their predictions. However, the
predictions of the MOGP have lower variances, which means that the MOGP
is more reliable to sample from.
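
The idea of extending the covariance matrix with the covariance of the outputs can be illustrated with a minimal sketch. It assumes an intrinsic coregionalization model, in which the joint covariance is the Kronecker product of a 2x2 output covariance matrix B (arousal, valence) and an input kernel. The abstract does not state which multi-output kernel the thesis uses, so this construction, the squared-exponential kernel, the hyperparameter values, and the names rbf_kernel, icm_covariance, mogp_predict and B are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential covariance between two sets of input vectors."""
    sq_dists = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def icm_covariance(X1, X2, B, lengthscale=1.0):
    """Joint covariance over (output, input) pairs: kron(B, k(X1, X2)).

    B is the (assumed) covariance between the outputs; its off-diagonal
    entry couples arousal and valence.
    """
    return np.kron(B, rbf_kernel(X1, X2, lengthscale))


def mogp_predict(X_train, Y_train, X_test, B, noise=1e-2, lengthscale=1.0):
    """Posterior mean and variance for every output at every test input.

    Y_train has shape (n, d_out); here d_out = 2 (arousal, valence).
    Returns two arrays of shape (m, d_out).
    """
    n, d_out = Y_train.shape
    m = X_test.shape[0]
    # Stack targets output by output, matching the block layout of kron(B, K).
    y = Y_train.T.reshape(-1)

    K = icm_covariance(X_train, X_train, B, lengthscale)
    K += noise * np.eye(n * d_out)
    K_s = icm_covariance(X_test, X_train, B, lengthscale)
    K_ss = icm_covariance(X_test, X_test, B, lengthscale)

    # Cholesky-based solve of K^{-1} y and of the predictive covariance.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss - v.T @ v)

    return mean.reshape(d_out, m).T, var.reshape(d_out, m).T


# Synthetic stand-in data (not the thesis data): 20 "songs", 3 features each.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
Y_demo = np.column_stack(
    [np.sin(X_demo[:, 0]), 0.8 * np.sin(X_demo[:, 0]) + 0.1 * X_demo[:, 1]]
)
X_new = rng.normal(size=(5, 3))
B_demo = np.array([[1.0, 0.6], [0.6, 1.0]])  # assumed output covariance
mean_demo, var_demo = mogp_predict(X_demo, Y_demo, X_new, B_demo)
```

Setting B to the identity matrix removes the coupling, and the sketch then reduces to two independent single-output GPs that share one input kernel. In practice B and the kernel hyperparameters would be learned from data rather than fixed by hand.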
Faculty
Faculteit der Sociale Wetenschappen
