Open Access System for Information Sharing

Login Library

 

Article
Cited 37 time in webofscience Cited 51 time in scopus
Metadata Downloads

DualSim: Parallel Subgraph Enumeration in a Massive Graph on a Single Machine SCIE SCOPUS

Title
DualSim: Parallel Subgraph Enumeration in a Massive Graph on a Single Machine
Authors
Hyeonji KimJuneYoung LeeSourav S BhowmickHan, W.-SLee, JSeongyun KoMoath Jarrah
Date Issued
2016-06
Publisher
ACM
Abstract
Subgraph enumeration is important for many applications such as subgraph frequencies, network motif discovery, graphlet kernel computation, and studying the evolution of social networks. Most earlier work on subgraph enumeration assumes that graphs are resident in memory, which results in serious scalability problems. Recently, efforts to enumerate all subgraphs in a large-scale graph have seemed to enjoy some success by partitioning the data graph and exploiting the distributed frameworks such as MapReduce and distributed graph engines. However, we notice that all existing distributed approaches have serious performance problems for sub graph enumeration due to the explosive number of partial results. In this paper, we design and implement a disk-based, single machine parallel subgraph enumeration solution called DuALSim that can handle massive graphs without maintaining exponential numbers of partial results. Specifically, we propose a novel concept of the dual approach for subgraph enumeration. The dual approach swaps the roles of the data graph and the query graph. Specifically, instead of fixing the matching order in the query and then matching data vertices, it fixes the data vertices by fixing a set of disk pages and then finds all subgraph matchings in these pages. This enables us to significantly reduce the number of disk reads. We conduct extensive experiments with various real-world graphs to systematically demonstrate the superiority of DuALSim over state-of-the-art distributed subgraph enumeration methods. DuALSim outperforms the state-of-the-art methods by up to orders of magnitude, while they fail for many queries due to explosive intermediate results.
URI
https://oasis.postech.ac.kr/handle/2014.oak/37296
DOI
10.1145/2882903.2915209
ISSN
0730-8078
Article Type
Article
Citation
In Proc. 42nd Int'l conf. on Management of Data (ACM SIGMOD 2016), page. 1231 - 1245, 2016-06
Files in This Item:
There are no files associated with this item.

qr_code

  • mendeley

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Views & Downloads

Browse