Diffusion Source Identification on Networks with Statistical Confidence

Quinlan E Dawkins, Tianxi Li, Haifeng Xu
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2500-2509, 2021.

Abstract

Diffusion source identification on networks is a problem of fundamental importance in a broad class of applications, including controlling the spreading of rumors on social media, identifying a computer virus over cyber networks, or identifying the disease center during epidemiology. Though this problem has received significant recent attention, most known approaches are well-studied in only very restrictive settings and lack theoretical guarantees for more realistic networks. We introduce a statistical framework for the study of this problem and develop a confidence set inference approach inspired by hypothesis testing. Our method efficiently produces a small subset of nodes, which provably covers the source node with any pre-specified confidence level without restrictive assumptions on network structures. To our knowledge, this is the first diffusion source identification method with a practically useful theoretical guarantee on general networks. We demonstrate our approach via extensive synthetic experiments on well-known random network models, a large data set of real-world networks as well as a mobility network between cities concerning the COVID-19 spreading in January 2020.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-dawkins21a,
  title = {Diffusion Source Identification on Networks with Statistical Confidence},
  author = {Dawkins, Quinlan E and Li, Tianxi and Xu, Haifeng},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {2500--2509},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/dawkins21a/dawkins21a.pdf},
  url = {https://proceedings.mlr.press/v139/dawkins21a.html},
  abstract = {Diffusion source identification on networks is a problem of fundamental importance in a broad class of applications, including controlling the spreading of rumors on social media, identifying a computer virus over cyber networks, or identifying the disease center during epidemiology. Though this problem has received significant recent attention, most known approaches are well-studied in only very restrictive settings and lack theoretical guarantees for more realistic networks. We introduce a statistical framework for the study of this problem and develop a confidence set inference approach inspired by hypothesis testing. Our method efficiently produces a small subset of nodes, which provably covers the source node with any pre-specified confidence level without restrictive assumptions on network structures. To our knowledge, this is the first diffusion source identification method with a practically useful theoretical guarantee on general networks. We demonstrate our approach via extensive synthetic experiments on well-known random network models, a large data set of real-world networks as well as a mobility network between cities concerning the COVID-19 spreading in January 2020.}
}
Endnote
%0 Conference Paper
%T Diffusion Source Identification on Networks with Statistical Confidence
%A Quinlan E Dawkins
%A Tianxi Li
%A Haifeng Xu
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-dawkins21a
%I PMLR
%P 2500--2509
%U https://proceedings.mlr.press/v139/dawkins21a.html
%V 139
%X Diffusion source identification on networks is a problem of fundamental importance in a broad class of applications, including controlling the spreading of rumors on social media, identifying a computer virus over cyber networks, or identifying the disease center during epidemiology. Though this problem has received significant recent attention, most known approaches are well-studied in only very restrictive settings and lack theoretical guarantees for more realistic networks. We introduce a statistical framework for the study of this problem and develop a confidence set inference approach inspired by hypothesis testing. Our method efficiently produces a small subset of nodes, which provably covers the source node with any pre-specified confidence level without restrictive assumptions on network structures. To our knowledge, this is the first diffusion source identification method with a practically useful theoretical guarantee on general networks. We demonstrate our approach via extensive synthetic experiments on well-known random network models, a large data set of real-world networks as well as a mobility network between cities concerning the COVID-19 spreading in January 2020.
APA
Dawkins, Q.E., Li, T. & Xu, H. (2021). Diffusion Source Identification on Networks with Statistical Confidence. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2500-2509. Available from https://proceedings.mlr.press/v139/dawkins21a.html.
