Efficient Shortest Paths on Massive Social Graphs

Xiaohan Zhao

Alessandra Sala

Haitao Zheng

Ben Y. Zhao

*
Proceedings of the 7th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom)
*

Paper Abstract

Analysis of large networks is a critical component of many of today's
application environments, including online
social networks, protein interactions in biological networks, and Internet
traffic analysis. The arrival of massive network graphs with hundreds of millions
of nodes, *e.g.* social graphs, presents a unique challenge to graph
analysis applications. Most of these applications rely on computing
distances between node pairs, which for large graphs can take minutes to
compute using traditional algorithms such as breadth-first-search (BFS).

In this paper, we study ways to enable scalable graph processing for today's
massive networks. We explore the design space of *graph coordinate
systems*, a new approach that accurately approximates node distances in
constant time by embedding graphs into coordinate spaces. We show that a
hyperbolic embedding produces relatively low distortion error, and propose
*Rigel*, a hyperbolic graph coordinate system that lends itself to
efficient parallelization across a compute cluster. Rigel produces
significantly more accurate results than prior systems, and is naturally
parallelizable across compute clusters, allowing it to provide accurate
results for graphs up to 43 million nodes. Finally, we show that Rigel's
functionality can be easily extended to locate (near-) shortest paths
between node pairs. After a one-time preprocessing cost, Rigel answers
node-distance queries in 10's of microseconds, and also produces shortest
path results up to 18 times faster than prior shortest-path systems with
similar levels of accuracy.