Subset Infinite Relational Models
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:547-555, 2012.
We propose a new probabilistic generative model for analyzing sparse and noisy pairwise relational data, such as friend links on social networking services (SNSs) and customer records in online shops. Real-world relational data often include a large proportion of non-informative pairwise entries. Many existing stochastic blockmodels are degraded by these irrelevant entries because their priors take relatively simple forms. The proposed model introduces a latent variable that explicitly indicates whether each data entry is relevant, which reduces the adverse effects of irrelevant data. Through experiments on synthetic and real data sets, we show that the proposed model extracts clusters whose members are more strongly related to one another than those in clusters obtained by the conventional model.
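To make the idea of a relevance indicator concrete, the following is a minimal illustrative sketch, not the model defined in the paper. The abstract does not give the priors or the inference procedure, so everything here is an assumption: a fixed number of clusters `k` (rather than an infinite mixture), a Beta prior on block link probabilities, a Bernoulli relevance indicator `r` per entry, and a single background rate `noise_rate` for irrelevant entries. The function name `sample_relational_data` and all of its parameters are hypothetical.

```python
import numpy as np


def sample_relational_data(n=30, k=3, p_relevant=0.6, noise_rate=0.05, seed=0):
    """Sample a binary relational matrix from a toy blockmodel with relevance flags.

    All distributions and parameters are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)

    # Cluster assignment for each of the n objects (finite k, an assumption).
    z = rng.integers(0, k, size=n)

    # Link probability for each pair of clusters (Beta prior, an assumption).
    theta = rng.beta(0.5, 0.5, size=(k, k))

    # Latent relevance indicator for each entry: True means the entry is
    # governed by the cluster structure, False means it is background noise.
    r = rng.random((n, n)) < p_relevant

    # Relevant entries use the block probability theta[z_i, z_j];
    # irrelevant entries use one shared background rate.
    block_prob = theta[z[:, None], z[None, :]]
    link_prob = np.where(r, block_prob, noise_rate)

    # Observed binary relations.
    x = (rng.random((n, n)) < link_prob).astype(int)
    return x, z, r, theta


if __name__ == "__main__":
    x, z, r, theta = sample_relational_data()
    print("observed links:", int(x.sum()))
    print("fraction of entries flagged relevant:", float(r.mean()))
```

In this sketch, only entries with `r` set to `True` carry information about the cluster assignments `z`, which is the intuition the abstract describes: separating informative entries from irrelevant ones so that the latter do not distort the clusters.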