More Is Better: Large Scale Partially-supervised Sentiment Classification
Proceedings of the Asian Conference on Machine Learning, PMLR 25:175-190, 2012.
Abstract
We describe a bootstrapping algorithm for learning from partially labeled data, and report the results of an empirical study that uses it to improve sentiment classification with up to 15 million unlabeled Amazon product reviews. Our experiments cover semi-supervised learning, domain adaptation, and weakly supervised learning. In some cases, our methods reduced test error by more than half by using such a large amount of data.