Collaborative Filtering Ensemble for Ranking
Proceedings of KDD Cup 2011, PMLR 18:153-167, 2012.
Abstract
This paper presents the solution of the team “commendo” on the Track 2 dataset of the KDD Cup 2011 (Dror et al.). Yahoo Labs provided a snapshot of their music-rating database as the dataset for the competition, consisting of approximately 62 million ratings from 250k users on 300k items. The dataset includes hierarchical information about the items. The goal of the competition is to distinguish between “High rated” and “Not rated” items of a user. The rating scale is discrete and ranges from 0 to 100, and a “High” rating is a rating $\geq 80$. The error measure is the percentage of incorrectly classified tracks over all users, known as the fraction of misclassifications. The task is to minimize this error rate, hence the ranking of items should be optimized. Our final submission is a blend of different collaborative filtering algorithms enhanced with basic statistics. The algorithms are trained consecutively and blended together with a neural network. Each of the algorithms optimizes a rank error measure.
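The error measure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the data layout (one score list and one 0/1 label list per user), and the assumption that the number of “High rated” items per user is known at prediction time are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def labels_from_scores(scores: List[float], n_high: int) -> List[int]:
    """Label the n_high best-scored items 1 ("High rated") and the rest 0.

    Hypothetical helper: assumes the number of "High rated" items
    in a user's candidate list is known in advance.
    """
    # Indices sorted by descending score; ties keep their original order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(order[:n_high])
    return [1 if i in top else 0 for i in range(len(scores))]


def misclassification_rate(
    scores_by_user: Dict[str, List[float]],
    truth_by_user: Dict[str, List[int]],
) -> float:
    """Fraction of wrongly labelled items, pooled over all users."""
    errors = 0
    total = 0
    for user, scores in scores_by_user.items():
        truth = truth_by_user[user]
        # Assumption: as many items are predicted "High" as are truly "High".
        predicted = labels_from_scores(scores, n_high=sum(truth))
        errors += sum(p != t for p, t in zip(predicted, truth))
        total += len(truth)
    return errors / total


# Toy data, invented for this example only.
scores_by_user = {"u1": [0.9, 0.2, 0.7, 0.1], "u2": [0.3, 0.8]}
truth_by_user = {"u1": [1, 0, 0, 1], "u2": [0, 1]}
rate = misclassification_rate(scores_by_user, truth_by_user)
```

Because labels are derived from the order of the scores alone, any monotone rescaling of a model's output leaves the error unchanged, which is why the measure rewards good ranking rather than accurate rating prediction.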