Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events

Lisa Friedland, David Jensen, Michael Lavine
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1175-1183, 2013.

Abstract

In this paper, we analyze the task of inferring rare links between pairs of entities that seem too similar to have occurred by chance. Variations of this task appear in such diverse areas as social network analysis, security, fraud detection, and entity resolution. To address the task in a general form, we propose a simple, flexible mixture model in which most entities are generated independently from a distribution but a small number of pairs are constrained to be similar. We predict the true pairs using a likelihood ratio that trades off the entities’ similarity with their rarity. This method always outperforms using only similarity; however, with certain parameter settings, similarity turns out to be surprisingly competitive. Using real data, we apply the model to detect twins given their birth weights and to re-identify cell phone users based on distinctive usage patterns.
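The trade-off between similarity and rarity can be illustrated with a toy likelihood ratio. The following is only a sketch under stated assumptions, not the model from the paper: entities are assumed to be one-dimensional standard normal values, a linked pair is assumed to be bivariate normal with a hypothetical correlation `rho`, and the function names are made up for illustration.

```python
import math


def log_normal_pdf(x: float, mean: float, std: float) -> float:
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * std**2) - (x - mean) ** 2 / (2 * std**2)


def log_likelihood_ratio(x1: float, x2: float, rho: float = 0.9) -> float:
    """Log of p(x1, x2 | linked) / p(x1, x2 | independent).

    Assumption (not from the paper): both hypotheses use standard normal
    marginals; the linked hypothesis adds correlation `rho` between the pair.
    """
    # Independent hypothesis: the two entities are drawn separately.
    log_independent = log_normal_pdf(x1, 0.0, 1.0) + log_normal_pdf(x2, 0.0, 1.0)

    # Linked hypothesis: factor the bivariate normal as p(x1) * p(x2 | x1).
    conditional_std = math.sqrt(1.0 - rho**2)
    log_linked = log_normal_pdf(x1, 0.0, 1.0) + log_normal_pdf(
        x2, rho * x1, conditional_std
    )
    return log_linked - log_independent


# Two pairs with identical similarity (difference of zero) but different rarity,
# plus one dissimilar pair for contrast.
common_pair = log_likelihood_ratio(0.0, 0.0)
rare_pair = log_likelihood_ratio(3.0, 3.0)
dissimilar_pair = log_likelihood_ratio(3.0, -3.0)
```

In this toy setting, a similarity-only score would rank `common_pair` and `rare_pair` equally, while the likelihood ratio ranks the pair of matching rare values higher because such a match is less likely to arise by coincidence.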

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-friedland13,
  title = {Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events},
  author = {Friedland, Lisa and Jensen, David and Lavine, Michael},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages = {1175--1183},
  year = {2013},
  editor = {Dasgupta, Sanjoy and McAllester, David},
  volume = {28},
  number = {3},
  series = {Proceedings of Machine Learning Research},
  address = {Atlanta, Georgia, USA},
  month = {17--19 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v28/friedland13.pdf},
  url = {https://proceedings.mlr.press/v28/friedland13.html},
  abstract = {In this paper, we analyze the task of inferring rare links between pairs of entities that seem too similar to have occurred by chance. Variations of this task appear in such diverse areas as social network analysis, security, fraud detection, and entity resolution. To address the task in a general form, we propose a simple, flexible mixture model in which most entities are generated independently from a distribution but a small number of pairs are constrained to be similar. We predict the true pairs using a likelihood ratio that trades off the entities’ similarity with their rarity. This method always outperforms using only similarity; however, with certain parameter settings, similarity turns out to be surprisingly competitive. Using real data, we apply the model to detect twins given their birth weights and to re-identify cell phone users based on distinctive usage patterns.}
}
Endnote
%0 Conference Paper
%T Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events
%A Lisa Friedland
%A David Jensen
%A Michael Lavine
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-friedland13
%I PMLR
%P 1175--1183
%U https://proceedings.mlr.press/v28/friedland13.html
%V 28
%N 3
%X In this paper, we analyze the task of inferring rare links between pairs of entities that seem too similar to have occurred by chance. Variations of this task appear in such diverse areas as social network analysis, security, fraud detection, and entity resolution. To address the task in a general form, we propose a simple, flexible mixture model in which most entities are generated independently from a distribution but a small number of pairs are constrained to be similar. We predict the true pairs using a likelihood ratio that trades off the entities’ similarity with their rarity. This method always outperforms using only similarity; however, with certain parameter settings, similarity turns out to be surprisingly competitive. Using real data, we apply the model to detect twins given their birth weights and to re-identify cell phone users based on distinctive usage patterns.
RIS
TY  - CPAPER
TI  - Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events
AU  - Lisa Friedland
AU  - David Jensen
AU  - Michael Lavine
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-friedland13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 1175
EP  - 1183
L1  - http://proceedings.mlr.press/v28/friedland13.pdf
UR  - https://proceedings.mlr.press/v28/friedland13.html
AB  - In this paper, we analyze the task of inferring rare links between pairs of entities that seem too similar to have occurred by chance. Variations of this task appear in such diverse areas as social network analysis, security, fraud detection, and entity resolution. To address the task in a general form, we propose a simple, flexible mixture model in which most entities are generated independently from a distribution but a small number of pairs are constrained to be similar. We predict the true pairs using a likelihood ratio that trades off the entities’ similarity with their rarity. This method always outperforms using only similarity; however, with certain parameter settings, similarity turns out to be surprisingly competitive. Using real data, we apply the model to detect twins given their birth weights and to re-identify cell phone users based on distinctive usage patterns.
ER  -
APA
Friedland, L., Jensen, D. & Lavine, M. (2013). Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):1175-1183. Available from https://proceedings.mlr.press/v28/friedland13.html.
