Music Emotion Recognition using Multi-Output Gaussian Processes

Issue Date

2021-01-19

Language

en

Abstract

Recognition of emotion in music is an important subject within the domain of Music Information Retrieval. When emotion is defined by the continuous parameters arousal and valence, the problem of emotion recognition can be solved with a regression approach. In general, arousal and valence are assumed to be independent; however, this paper investigates whether including a correlation between the two variables improves the accuracy of emotion recognition in music. A Gaussian Process model is proposed to solve this regression problem. A Gaussian Process (GP) defines a distribution over functions that is described by a mean function and a covariance function. This makes GPs flexible in describing data and able to make accurate predictions from a small amount of training data. A Multi-Output GP (MOGP) model is used to capture the correlation between arousal and valence. The MOGP model adds an extra dimension to the covariance matrix to include the covariance between the outputs. The MOGP is compared with a standard single-output GP and a Singular Value Regression model on the task of predicting arousal and valence values for songs from auditory features. The GP and MOGP obtain equal accuracies for their predictions. However, the predictions of the MOGP have lower variances, which means that the MOGP is more reliable to sample from.
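
As an illustration of how a multi-output covariance can include the covariance between outputs, the sketch below builds a joint covariance matrix for two outputs (arousal and valence) as the Kronecker product of an output covariance matrix B and an input kernel matrix. This is the intrinsic coregionalization construction, chosen here as an assumption; the abstract does not state which MOGP formulation the paper uses. The RBF kernel, the function names, the feature dimensions, and the value 0.3 in B are all hypothetical.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of X1 and X2 (assumed kernel)."""
    sq_dist = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def icm_covariance(X, B, lengthscale=1.0):
    """Joint covariance over all outputs: kron(B, K(X, X)).

    B[i, j] is the covariance between output i and output j, so an
    off-diagonal entry couples the two outputs.
    """
    return np.kron(B, rbf_kernel(X, X, lengthscale=lengthscale))

# Hypothetical data: 5 songs, each described by 3 auditory features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# Hypothetical output covariance: arousal and valence correlated with 0.3.
B_correlated = np.array([[1.0, 0.3],
                         [0.3, 1.0]])
B_independent = np.eye(2)

K_input = rbf_kernel(X, X)
K_joint = icm_covariance(X, B_correlated)    # shape (10, 10), outputs coupled
K_indep = icm_covariance(X, B_independent)   # block-diagonal, outputs independent
```

With an identity B, the off-diagonal blocks vanish and the model reduces to two independent GPs that share a kernel; a non-zero off-diagonal entry in B lets an observation of one output inform the prediction of the other.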

Faculty

Faculteit der Sociale Wetenschappen