Text-based video genre classification using multiple feature categories and categorization methods

Issue Date

2017-07-13

Language

en

Abstract

The aim of this work is to categorize movies into genres using text-based features. Textual, syntactical, and content-specific features are extracted from subtitles in the SUBTIEL corpus. The effectiveness of these three feature types is compared using five algorithms (AdaBoost, C4.5, Naive Bayes, Random Forest, and SVM), and four methods of combining the features are tested (supervector, add-rule meta-classifier, product-rule meta-classifier, and algorithm-based meta-classifier). The experimental results show that, of the three feature types, the content-specific features yield the most accurate classifier. Furthermore, Random Forest and SVM are the two most accurate algorithms, and combining the textual, syntactical, and content-specific features yields a more accurate classifier than any single feature type. However, the benefit of combining the three classifiers depends largely on the combination method: the algorithm-based meta-classifier yields the largest improvement over the individual feature-type classifiers.
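
As a rough illustration of the add-rule and product-rule combinations named in the abstract, the sketch below merges the class probabilities of three feature-type classifiers for a single movie. It is a minimal sketch under stated assumptions, not the thesis implementation: the function name `combine`, the genre list, and all probability values are hypothetical and chosen only to show how the two rules can disagree.

```python
from math import prod


def combine(prob_rows, rule="add"):
    """Combine per-classifier genre probabilities for one movie.

    prob_rows: one row per feature-type classifier; each row holds one
    probability per genre, in the same genre order (hypothetical input).
    Returns (index of the winning genre, normalized combined scores).
    """
    n_classes = len(prob_rows[0])
    if rule == "add":
        # Add rule: sum the probabilities each classifier gives a genre.
        scores = [sum(row[c] for row in prob_rows) for c in range(n_classes)]
    elif rule == "product":
        # Product rule: multiply them, so one low value can veto a genre.
        scores = [prod(row[c] for row in prob_rows) for c in range(n_classes)]
    else:
        raise ValueError(f"unknown rule: {rule!r}")
    total = sum(scores)
    normalized = [s / total for s in scores] if total > 0 else scores
    best = max(range(n_classes), key=scores.__getitem__)
    return best, normalized


# Illustrative values only; none of these numbers come from the thesis.
genres = ["action", "comedy", "drama"]
rows = [
    [0.80, 0.19, 0.01],  # classifier on textual features
    [0.80, 0.19, 0.01],  # classifier on syntactical features
    [0.01, 0.98, 0.01],  # classifier on content-specific features
]

for rule in ("add", "product"):
    best, _ = combine(rows, rule)
    print(rule, "->", genres[best])
```

With these made-up inputs the add rule follows the two agreeing classifiers, while the product rule lets the near-zero probability from the third classifier override them. A supervector, by contrast, would concatenate the three feature vectors before training a single classifier, and an algorithm-based meta-classifier would train a further model on the rows above.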

Faculty

Faculteit der Letteren