Empirically Evaluating Co-Training

Issue Date

2009-06-02

Language

en

Abstract

Co-training is a classification scheme that needs only a small set of training instances to classify correctly. The main question addressed in this thesis is how co-training performance varies with the representativeness of the training data. A total of 1280 co-training runs were carried out to test the generalization accuracy of co-training classification under different selections of the training data. The results indicate that the generalization accuracy of co-training benefits when the training data are typical of their class, or when the class distribution in the training data matches the a priori distribution of the corpus as a whole.
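
To make the scheme concrete, the following is a minimal sketch of the general co-training idea: two classifiers, each trained on its own feature view, take turns labeling the unlabeled examples they are most confident about and adding them to a shared training set. It is not the setup used in the thesis. The nearest-centroid classifier, the synthetic two-view data, the two "typical" seed instances, and all function names and parameters (`co_train`, `rounds`, `per_round`, and so on) are assumptions chosen for illustration.

```python
import random

def centroid_fit(values, labels):
    """Return {label: mean value} for 1-D feature values."""
    sums, counts = {}, {}
    for x, y in zip(values, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def centroid_predict(model, x):
    """Return (label, margin); margin is how much closer the nearest
    centroid is than the second nearest (0.0 if only one class is known)."""
    ranked = sorted(model, key=lambda y: abs(x - model[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 0.0
    return best, abs(x - model[ranked[1]]) - abs(x - model[best])

def co_train(labeled, unlabeled, rounds=10, per_round=2):
    """Hypothetical co-training loop over two views (feature 0 and feature 1)."""
    labeled = list(labeled)   # [((a, b), label), ...]
    pool = list(unlabeled)    # [(a, b), ...]
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            model = centroid_fit([x[view] for x, _ in labeled],
                                 [y for _, y in labeled])
            # Rank the pool by this view's confidence, most confident first.
            ranked = sorted(pool,
                            key=lambda x: centroid_predict(model, x[view])[1],
                            reverse=True)
            for x in ranked[:per_round]:
                labeled.append((x, centroid_predict(model, x[view])[0]))
                pool.remove(x)
    # Final classifiers, one per view, trained on the enlarged labeled set.
    return [centroid_fit([x[v] for x, _ in labeled], [y for _, y in labeled])
            for v in (0, 1)]

def predict(models, x):
    """Combine both views by summing the margins behind each predicted label."""
    votes = {}
    for view, model in enumerate(models):
        label, margin = centroid_predict(model, x[view])
        votes[label] = votes.get(label, 0.0) + margin
    return max(votes, key=votes.get)

def make_data(n, rng):
    """Synthetic data (an assumption): class k is centred at 4*k in both views."""
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        centre = 4.0 * label
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return data

rng = random.Random(0)
seed_set = [((0.0, 0.0), 0), ((4.0, 4.0), 1)]   # one typical instance per class
pool = [x for x, _ in make_data(200, rng)]      # labels are discarded
test_set = make_data(200, rng)

models = co_train(seed_set, pool)
accuracy = sum(predict(models, x) == y for x, y in test_set) / len(test_set)
print(f"generalization accuracy: {accuracy:.2f}")
```

Replacing `seed_set` with atypical instances (for example, points lying between the two class centres) is one simple way to explore, on toy data, the kind of question the abstract raises about the representativeness of the training data.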

Faculty

Faculteit der Sociale Wetenschappen