Options
A scalable and generic framework to mine top-k representative subgraph patterns
Date Issued
31-01-2017
Author(s)
Natarajan, Dheepikaa
Ranu, Sayan
Abstract
Mining subgraph patterns is an active area of research. Existing research has primarily focused on mining all subgraph patterns in the database. However, due to the exponential subgraph search space, the number of patterns mined, typically, is too large for any human mediated analysis. Consequently, deriving insights from the mined patterns is hard for domain scientists. In addition, subgraph pattern mining is posed in multiple forms: The function that models if a subgraph is a pattern varies based on the application and the database could be over multiple graphs or a single, large graph. In this paper, we ask the following question: Given a subgraph importance function and a budget k, which are the k subgraph patterns that best represent all other patterns of interest? Weshow that the problem is NP-hard, and propose a generic framework called RESLING that adapts to arbitrary subgraphimportance functions and generalizable to both transactional graph databases as well as single, large graphs. Experimentsshow that RESLING is up to 20 times more representative of the pattern space and 2 orders of magnitude faster than the state-of-The-Art techniques. The executables for the tool are available at http://www.cse.iitm.ac.in/~ayan/software.HTML.