HawkesTopic: A Joint Model for Network Inference and Topic Modeling from Text-Based Cascades
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:871-880, 2015.
Understanding the diffusion of information in social networks and social media requires modeling the text diffusion process. In this work, we develop the HawkesTopic model (HTM) for analyzing text-based cascades, such as "retweeting a post" or "publishing a follow-up blog post". HTM combines Hawkes processes and topic modeling to simultaneously reason about the information diffusion pathways and the topics characterizing the observed textual information. We show how to jointly infer them with a mean-field variational inference algorithm and validate our approach on both synthetic and real-world datasets, including a news media dataset for modeling information diffusion and an ArXiv publication dataset for modeling scientific influence. The results show that HTM is significantly more accurate than several baselines for both tasks.
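For readers unfamiliar with Hawkes processes, the following minimal sketch illustrates the self-exciting idea that HTM builds on: each past event temporarily raises the rate at which connected nodes produce new events. It is a generic multivariate Hawkes intensity, not the HTM model itself. The exponential kernel, the function name `hawkes_intensity`, and the parameters `base_rate`, `influence`, and `decay` are assumptions chosen for illustration; the abstract does not specify them, and the topic-modeling component is omitted.

```python
import math

def hawkes_intensity(t, node, events, base_rate, influence, decay=1.0):
    """Conditional intensity of `node` at time `t`.

    events    : list of (time, source_node) pairs observed so far
    base_rate : base_rate[v] is the spontaneous event rate of node v
    influence : influence[u][v] is the strength with which an event at
                node u excites node v (hypothetical network weights)
    decay     : rate of the exponential kernel (an assumed kernel choice)
    """
    rate = base_rate[node]
    for t_i, src in events:
        if t_i < t:  # only events strictly in the past contribute
            rate += influence[src][node] * decay * math.exp(-decay * (t - t_i))
    return rate

# Node 0 posts at time 0; node 1 is influenced by node 0.
events = [(0.0, 0)]
base_rate = [0.1, 0.2]
influence = [[0.0, 0.5],
             [0.0, 0.0]]
rate_node1 = hawkes_intensity(1.0, 1, events, base_rate, influence)
```

In this toy setting, inferring the diffusion network amounts to estimating the `influence` matrix from observed event times; HTM additionally uses the text of each event, which this sketch does not model.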