Learning Rules from Incomplete Examples via Implicit Mention Models
Proceedings of the Asian Conference on Machine Learning, PMLR 20:197-212, 2011.
We study the problem of learning general rules from concrete facts extracted from natural data sources such as newspaper stories and medical histories. Natural data sources present two challenges to automated learning: radical incompleteness and systematic bias. In this paper, we propose an approach that combines simultaneous learning of multiple predictive rules with differential scoring of evidence that adapts to a presumed model of data generation. Learning multiple predicates simultaneously mitigates the problem of radical incompleteness, while differential scoring helps reduce the effects of systematic bias. We evaluate our approach empirically on both textual and non-textual sources. We further present a theoretical analysis that elucidates our approach and explains the empirical results.