Text-based video genre classification using multiple feature categories and categorization methods

Issue Date
2017-07-13
Language
en
Abstract
The aim of this work is to categorize movies into genres using text-based features. Textual, syntactical and content-specific features are extracted from subtitles in the SUBTIEL corpus. The effectiveness of these three feature types is compared using five algorithms (AdaBoost, C4.5, Naive Bayes, Random Forest, and SVM), and four methods of combining the features are tested (supervector, add-rule meta-classifier, product-rule meta-classifier, and algorithm-based meta-classifier). The experimental results show that, of the three feature types, the content-specific features yield the most accurate classifier. Random Forest and SVM are the two most accurate algorithms, and combining the textual, syntactical and content-specific features yields a more accurate classifier. However, the benefit of combining the three classifiers depends largely on the combination method: the algorithm-based meta-classifier yields the largest improvement over the individual feature-type classifiers.
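To make the difference between the add-rule and product-rule meta-classifiers concrete, the following is a minimal sketch of the two fusion rules as they are commonly defined: each feature-type classifier outputs a probability per genre, and the meta-classifier either sums or multiplies those probabilities before picking the highest-scoring genre. The function names (`combine`, `predict`), the genre labels, and the probability values are hypothetical and chosen only for illustration; the abstract does not state how the thesis implements these rules, so this should not be read as the author's implementation.

```python
from math import prod


def combine(probabilities, rule="add"):
    """Fuse per-classifier genre probabilities into one normalised distribution.

    `probabilities` is a list of dicts mapping genre -> probability,
    one dict per feature-type classifier (hypothetical input format).
    """
    genres = list(probabilities[0])
    if rule == "add":
        # Add rule: sum the probabilities each classifier assigns to a genre.
        scores = {g: sum(p[g] for p in probabilities) for g in genres}
    elif rule == "product":
        # Product rule: multiply them, so one very low estimate acts as a veto.
        scores = {g: prod(p[g] for p in probabilities) for g in genres}
    else:
        raise ValueError(f"unknown rule: {rule!r}")
    total = sum(scores.values())
    if total == 0:
        # Every genre was vetoed; fall back to a uniform distribution.
        return {g: 1.0 / len(genres) for g in genres}
    return {g: s / total for g, s in scores.items()}


def predict(probabilities, rule="add"):
    """Return the genre with the highest fused score."""
    fused = combine(probabilities, rule)
    return max(fused, key=fused.get)


# Made-up outputs of three feature-type classifiers for one movie.
textual = {"action": 0.60, "comedy": 0.30, "drama": 0.10}
syntactical = {"action": 0.60, "comedy": 0.30, "drama": 0.10}
content_specific = {"action": 0.02, "comedy": 0.50, "drama": 0.48}

outputs = [textual, syntactical, content_specific]
add_label = predict(outputs, rule="add")
product_label = predict(outputs, rule="product")
print(add_label, product_label)
```

With these made-up numbers the two rules disagree: the add rule follows the two classifiers that favour "action", while the product rule lets the content-specific classifier's very low "action" estimate override them and selects "comedy". This sensitivity to the fusion rule is one way the choice of combination method can change the final classifier.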
Faculty
Faculteit der Letteren