SubSift: a novel application of the vector space model to support the academic research process


Simon Price, Peter A. Flach, Sebastian Spiegler
Proceedings of the First Workshop on Applications of Pattern Analysis, PMLR 11:20-27, 2010.


SubSift matches submitted conference or journal papers to potential peer reviewers based on the similarity between the paper's abstract and the reviewer's publications as found in online bibliographic databases such as Google Scholar. Using concepts from information retrieval, including a bag-of-words representation and cosine similarity, the SubSift tools were originally created to streamline the peer review process for the ACM SIGKDD'09 data mining conference. This paper describes how these tools were subsequently developed and deployed in the form of web services designed to support not only peer review but also personalised data discovery and mashups. SubSift has already been used by several major data mining conferences, and interesting applications in other fields are now emerging.
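
The matching idea described above can be illustrated with a minimal sketch. It is not SubSift's actual implementation or API: the function names (`bag_of_words`, `cosine_similarity`, `rank_reviewers`), the simple regular-expression tokenizer, and the use of raw term counts as vector weights are all assumptions made for illustration. Each text becomes a bag-of-words vector, and reviewers are ranked by the cosine similarity between their publication text and the paper's abstract.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count word occurrences (raw term frequencies).

    Assumption: a real system would likely also remove stop words and
    apply a weighting scheme; this sketch uses plain counts.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_reviewers(abstract, reviewer_texts):
    """Return (reviewer, score) pairs sorted by descending similarity.

    reviewer_texts maps a reviewer name to the concatenated text of
    that reviewer's publications (hypothetical input format).
    """
    paper = bag_of_words(abstract)
    scores = [
        (name, cosine_similarity(paper, bag_of_words(text)))
        for name, text in reviewer_texts.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

For example, calling `rank_reviewers("clustering of text documents", {"alice": "text clustering and document clustering", "bob": "protein folding simulation"})` places the hypothetical reviewer `alice` first, because her text shares terms with the abstract, while `bob` shares none and scores zero.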
