Empirically Evaluating Co-Training

Issue Date
2009-06-02
Language
en
Abstract
Co-training is a classification scheme that needs only a small set of labeled training instances to classify accurately. The main question addressed in this thesis is how co-training performance varies with the representativeness of the training data. A total of 1280 co-training runs were carried out to test the generalization accuracy of co-training classification under different selections of the training data. The results indicate that co-training generalizes well under either of two conditions: the training data are typical of their class, or the distribution of the training data matches the a priori distribution of the corpus as a whole.
Faculty
Faculteit der Sociale Wetenschappen