Frequent Subgraph Discovery in Large Attributed Streaming Graphs

Abhik Ray; Larry Holder; Sutanay Choudhury

Frequent Subgraph Discovery in Large Attributed Streaming Graphs

Abhik Ray, Larry Holder, Sutanay Choudhury

Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, PMLR 36:166-181, 2014.

Abstract

The problem of finding frequent subgraphs in large dynamic graphs has so far only considered a dynamic graph as being represented by a series of static snapshots taken at various points in time. This representation of a dynamic graph does not lend itself well to real time processing of real world graphs like social networks or internet traffic which consist of a stream of nodes and edges. In this paper we propose an algorithm that discovers the frequent subgraphs present in a graph represented by a stream of labeled nodes and edges. Our algorithm is efficient and is easily tuned by the user to produce interesting patterns from various kinds of graph data. In our model, updates to the graph arrive in the form of batches which contain new nodes and edges. Our algorithm continuously reports the frequent subgraphs that are estimated to be found in the entire graph as each batch arrives. We evaluate our system using five large dynamic graph datasets: the Hetrec 2011 challenge data, Twitter, DBLP and two synthetic. We evaluate our approach against two popular large graph miners, i.e., SUBDUE and GERM. Our experimental results show that we can find the same frequent subgraphs as a non-incremental approach applied to snapshot graphs, and in less time.

Cite this Paper

BibTeX


@InProceedings{pmlr-v36-ray14,
  title = 	 {Frequent Subgraph Discovery in Large Attributed Streaming Graphs},
  author = 	 {Ray, Abhik and Holder, Larry and Choudhury, Sutanay},
  booktitle = 	 {Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications},
  pages = 	 {166--181},
  year = 	 {2014},
  editor = 	 {Fan, Wei and Bifet, Albert and Yang, Qiang and Yu, Philip S.},
  volume = 	 {36},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {24 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v36/ray14.pdf},
  url = 	 {https://proceedings.mlr.press/v36/ray14.html},
  abstract = 	 {The problem of finding frequent subgraphs in large dynamic graphs has so far only considered a dynamic graph as being represented by a series of static snapshots taken at various points in time. This representation of a dynamic graph does not lend itself well to real time processing of real world graphs like social networks or internet traffic which consist of a stream of nodes and edges. In this paper we propose an algorithm that discovers the frequent subgraphs present in a graph represented by a stream of labeled nodes and edges. Our algorithm is efficient and is easily tuned by the user to produce interesting patterns from various kinds of graph data. In our model, updates to the graph arrive in the form of batches which contain new nodes and edges. Our algorithm continuously reports the frequent subgraphs that are estimated to be found in the entire graph as each batch arrives. We evaluate our system using five large dynamic graph datasets: the Hetrec 2011 challenge data, Twitter, DBLP and two synthetic. We evaluate our approach against two popular large graph miners, i.e., SUBDUE and GERM. Our experimental results show that we can find the same frequent subgraphs as a non-incremental approach applied to snapshot graphs, and in less time.}
}

Endnote

%0 Conference Paper
%T Frequent Subgraph Discovery in Large Attributed Streaming Graphs
%A Abhik Ray
%A Larry Holder
%A Sutanay Choudhury
%B Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
%C Proceedings of Machine Learning Research
%D 2014
%E Wei Fan
%E Albert Bifet
%E Qiang Yang
%E Philip S. Yu	
%F pmlr-v36-ray14
%I PMLR
%P 166--181
%U https://proceedings.mlr.press/v36/ray14.html
%V 36
%X The problem of finding frequent subgraphs in large dynamic graphs has so far only considered a dynamic graph as being represented by a series of static snapshots taken at various points in time. This representation of a dynamic graph does not lend itself well to real time processing of real world graphs like social networks or internet traffic which consist of a stream of nodes and edges. In this paper we propose an algorithm that discovers the frequent subgraphs present in a graph represented by a stream of labeled nodes and edges. Our algorithm is efficient and is easily tuned by the user to produce interesting patterns from various kinds of graph data. In our model, updates to the graph arrive in the form of batches which contain new nodes and edges. Our algorithm continuously reports the frequent subgraphs that are estimated to be found in the entire graph as each batch arrives. We evaluate our system using five large dynamic graph datasets: the Hetrec 2011 challenge data, Twitter, DBLP and two synthetic. We evaluate our approach against two popular large graph miners, i.e., SUBDUE and GERM. Our experimental results show that we can find the same frequent subgraphs as a non-incremental approach applied to snapshot graphs, and in less time.

RIS


TY  - CPAPER
TI  - Frequent Subgraph Discovery in Large Attributed Streaming Graphs
AU  - Abhik Ray
AU  - Larry Holder
AU  - Sutanay Choudhury
BT  - Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
DA  - 2014/08/13
ED  - Wei Fan
ED  - Albert Bifet
ED  - Qiang Yang
ED  - Philip S. Yu	
ID  - pmlr-v36-ray14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 36
SP  - 166
EP  - 181
L1  - http://proceedings.mlr.press/v36/ray14.pdf
UR  - https://proceedings.mlr.press/v36/ray14.html
AB  - The problem of finding frequent subgraphs in large dynamic graphs has so far only considered a dynamic graph as being represented by a series of static snapshots taken at various points in time. This representation of a dynamic graph does not lend itself well to real time processing of real world graphs like social networks or internet traffic which consist of a stream of nodes and edges. In this paper we propose an algorithm that discovers the frequent subgraphs present in a graph represented by a stream of labeled nodes and edges. Our algorithm is efficient and is easily tuned by the user to produce interesting patterns from various kinds of graph data. In our model, updates to the graph arrive in the form of batches which contain new nodes and edges. Our algorithm continuously reports the frequent subgraphs that are estimated to be found in the entire graph as each batch arrives. We evaluate our system using five large dynamic graph datasets: the Hetrec 2011 challenge data, Twitter, DBLP and two synthetic. We evaluate our approach against two popular large graph miners, i.e., SUBDUE and GERM. Our experimental results show that we can find the same frequent subgraphs as a non-incremental approach applied to snapshot graphs, and in less time.
ER  -

APA


Ray, A., Holder, L. & Choudhury, S.. (2014). Frequent Subgraph Discovery in Large Attributed Streaming Graphs. Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, in Proceedings of Machine Learning Research 36:166-181 Available from https://proceedings.mlr.press/v36/ray14.html.

Related Material

Download PDF