Large Scale CVR Prediction through Dynamic Transfer Learning of Global and Local Features
Proceedings of the 5th International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications at KDD 2016, PMLR 53:103-119, 2016.
This paper presents a combination of strategies for conversion rate (CVR) prediction deployed at the Yahoo! demand side platform (DSP) Brightroll, aimed at modeling extremely high-dimensional, sparse data with limited human intervention. We propose a novel probabilistic generative model, named Dynamic Transfer Learning with Reinforced Word Modeling (Trans-RWM), that tightly integrates natural language processing, dynamic transfer learning, and scalable prediction to predict user conversion rates. Our model rests on two assumptions: at a higher level, information is transferable between related campaigns; at a lower level, users who searched for similar content or browsed similar pages are more likely to share similar latent purchase interests. The novelties of this framework include (i) a natural language model specifically tailored to the semantic inputs of CVR prediction; (ii) a Bayesian transfer learning model that dynamically transfers knowledge from the source to the future target; and (iii) an automatic updating rule with adaptive regularization, based on Stochastic Gradient Monte Carlo, that supports efficient updating of Trans-RWM on high-dimensional, sparse data. We demonstrate that on Brightroll our framework can effectively discriminate extremely rare events in terms of their conversion propensity.