Time- and Space-Efficiently Sketching Billion-Scale Attributed Networks

  • Wei Wu
  • , Shiqi Li
  • , Mi Jiang
  • , Chuan Luo
  • , Fangfang Li*
  • *Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Attributed network embedding seeks to depict each network node via a compact, low-dimensional vector while effectively preserving the similarity between node pairs, which lays a strong foundation for a great many high-level network mining tasks. With the advent of the era of Big Data, the number of nodes and edges has reached billions in many real-world networks, which poses great computational and storage challenges to the existing methods. Although some algorithms have been developed to handle billion-scale networks, they often undergo accuracy degradation or tempo-spatial inefficiency owing to attribute information loss or substantial parameter learning. To this end, we propose a simple, time- and space-efficient billion-scale attributed network embedding algorithm called SketchBANE in this paper, which strikes an excellent balance between accuracy and efficiency by adopting sparse random projection with 1-bit quantization to sketch the iterative closed neighborhood and maintain the similarity among high-order nodes in a non-learning manner. The extensive experimental results indicate that our proposed SketchBANE algorithm competes favorably with the state-of-the-art approaches, while remarkably reducing runtime and space consumption. Also, the proposed SketchBANE algorithm exhibits good scalability and parallelization.

Original languageEnglish
Pages (from-to)966-978
Number of pages13
JournalIEEE Transactions on Knowledge and Data Engineering
Volume37
Issue number2
DOIs
StatePublished - 2025

Keywords

  • Billion-Scale attributed networks
  • network embedding
  • quantization
  • random projection

Fingerprint

Dive into the research topics of 'Time- and Space-Efficiently Sketching Billion-Scale Attributed Networks'. Together they form a unique fingerprint.

Cite this