Gbase: An efficient analysis platform for large graphs

U. Kang, Hanghang Tong, Jimeng Sun, Ching Yung Lin, Christos Faloutsos

Research output: Contribution to journal › Article › peer-review

45 Scopus citations

Abstract

Graphs appear in numerous applications including cyber security, the Internet, social networks, protein networks, recommendation systems, citation networks, and many more. Graphs with millions or even billions of nodes and edges are commonplace. How can such large graphs be stored efficiently? What are the core operations and queries on those graphs? How can graph queries be answered quickly? We propose Gbase, an efficient analysis platform for large graphs. The key novelties lie in (1) our storage and compression scheme for parallel, distributed settings and (2) the carefully chosen graph operations and their efficient implementations. We designed and implemented an instance of Gbase using MapReduce/Hadoop. Gbase provides a parallel indexing mechanism for graph operations that both saves storage space and accelerates query responses. We run numerous experiments on real and synthetic graphs, spanning billions of nodes and edges, and we show that our proposed Gbase is indeed fast, scalable, and nimble, with significant savings in space and time.

Original language: English (US)
Pages (from-to): 637-650
Number of pages: 14
Journal: VLDB Journal
Volume: 21
Issue number: 5
State: Published - Oct 1 2012

Keywords

  • Compression
  • Distributed computing
  • Graph
  • Indexing

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture
