GraphIVE: Heterogeneity-Aware Adaptive Graph Partitioning in GraphLab
Date Issued
07-05-2015
Author(s)
Kumar, Dinesh
Raj, Arun
Patra, Deepankar
Indian Institute of Technology, Madras
Abstract
GraphLab, a distributed graph-processing framework, has found multiple applications in data mining. Its scalability makes it a strong choice for running graph algorithms on large data. The current scheduler in GraphLab splits the graph according to one of several partitioning strategies. These strategies split the graph into approximately equal parts, which suits homogeneous clusters but is liable to perform poorly in the presence of heterogeneity. A number of challenges arise when the nodes differ in memory and processing power. We show that memory in particular can be a severe bottleneck, even leading to the termination of certain jobs. We determine the extent to which the current scheduler can handle heterogeneity. We further propose GraphIVE (Graph Processing In Varied Environments), a capability-aware graph partitioning policy for GraphLab applications. Moreover, GraphIVE continuously tries to reach optimum performance via hill climbing. We describe how GraphIVE reduces communication overhead by reducing the replication factor of vertices. We implemented a prototype of GraphIVE and present preliminary results. GraphIVE significantly improves the execution time of jobs. The results also show how it allows for seamless graph processing on a heterogeneous cluster.
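The two ideas named in the abstract, capability-aware partitioning and hill climbing, can be illustrated with a minimal sketch. It is not the GraphIVE implementation and does not use any GraphLab API. The function names, the integer capability scores, the `measure_times` callback, and the 10% step size are all assumptions made for illustration.

```python
def proportional_shares(capabilities, total_edges):
    """Split total_edges across nodes in proportion to capability scores.

    Assumption: capabilities are positive integers (for example, GB of memory).
    """
    total = sum(capabilities)
    shares = [total_edges * c // total for c in capabilities]
    # Hand out the edges lost to integer division, most capable nodes first.
    leftover = total_edges - sum(shares)
    order = sorted(range(len(capabilities)), key=lambda i: -capabilities[i])
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def hill_climb(shares, measure_times, step_percent=10, max_rounds=20):
    """Greedy hill climbing on the slowest node's running time.

    measure_times(shares) is a hypothetical callback returning one running
    time per node. Each round moves step_percent of the slowest node's edges
    to the fastest node and keeps the move only if the slowest time improves.
    """
    best = list(shares)
    best_time = max(measure_times(best))
    for _ in range(max_rounds):
        times = measure_times(best)
        slow = max(range(len(best)), key=lambda i: times[i])
        fast = min(range(len(best)), key=lambda i: times[i])
        moved = best[slow] * step_percent // 100
        if slow == fast or moved == 0:
            break
        candidate = list(best)
        candidate[slow] -= moved
        candidate[fast] += moved
        candidate_time = max(measure_times(candidate))
        if candidate_time >= best_time:
            break  # no improvement: stop at this local optimum
        best, best_time = candidate, candidate_time
    return best
```

In a real deployment `measure_times` would come from per-node timings of an actual iteration. Here it is a stand-in, and the sketch ignores vertex replication and communication cost entirely.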
Volume
2015-May