Man vs Machine: Comparing cross-lingual automatic and human emotion recognition in background noise
Automatic emotion recognition (AER) from speech has seen major advancements over the past decade, but more research is necessary. In particular, cross-lingual emotion recognition in noise has not been considered in AER studies. In this study, I investigate the impact of noise and/or an unknown language on AER of four emotions and compare this to human emotion recognition (HER) in the same adverse conditions. I also investigate which acoustic features play a role in cross-lingual AER in noise compared to HER. Results showed that cross-lingual AER performance was overall lower than cross-lingual HER performance. AER performance also differed substantially from the cross-lingual HER results. Background noise did not influence cross-lingual AER. The acoustic analysis showed differences between the parameters linked to AER and those linked to HER for two emotions, while the acoustic parameters associated with the other two emotions were almost identical. The findings of this study emphasize the importance of comparisons between AER and HER.
Faculteit der Letteren