CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Graph neural networks (GNNs) suffer from low GPU utilization due to frequent memory accesses. Existing concurrent training mechanisms cannot be directly adapted to GNNs because they fail to consider the impact of input irregularity. Because of this irregularity, the memory footprint of concurrent tasks must be pre-profiled based on input dimensions to ensure successful co-location on a GPU. Moreover, the massive number of training tasks generated in scenarios such as hyper-parameter tuning requires flexible scheduling strategies. To address these problems, we propose CoGNN, which enables efficient management of GNN training tasks on GPUs. Specifically, CoGNN organizes the tasks in a queue and estimates the memory consumption of each task using cost functions defined on a per-operator basis. In addition, CoGNN implements scheduling policies that generate task groups, which are iteratively submitted for execution. The experimental results show that CoGNN achieves shorter completion and queuing times for training tasks from diverse GNN models.
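The queue, per-operator memory estimate, and task grouping described in the abstract can be illustrated with a minimal Python sketch. It is not CoGNN's implementation: the cost formulas, the operator names ("aggregate", "linear"), the Task fields, and the first-in-first-out grouping policy are all assumptions chosen only to make the idea concrete.

```python
from collections import deque
from dataclasses import dataclass


# Hypothetical per-operator cost functions: estimated bytes of GPU memory
# as a function of input dimensions. The formulas are illustrative only.
def aggregate_cost(num_nodes, num_edges, feat_dim):
    # Assumes one float32 message per edge plus one output row per node.
    return 4 * feat_dim * (num_edges + num_nodes)


def linear_cost(num_nodes, num_edges, feat_dim):
    # Assumes one float32 output row per node plus a square weight matrix.
    return 4 * feat_dim * (num_nodes + feat_dim)


COST_FUNCTIONS = {"aggregate": aggregate_cost, "linear": linear_cost}


@dataclass
class Task:
    name: str
    operators: list  # operator names, in execution order
    num_nodes: int
    num_edges: int
    feat_dim: int


def estimate_memory(task):
    """Sum the per-operator estimates for one training task."""
    return sum(
        COST_FUNCTIONS[op](task.num_nodes, task.num_edges, task.feat_dim)
        for op in task.operators
    )


def next_group(queue, capacity):
    """Pop tasks in FIFO order while their summed estimate fits in capacity.

    A task whose estimate alone exceeds capacity is still returned as a
    group of one, so the queue always makes progress.
    """
    group, used = [], 0
    while queue:
        need = estimate_memory(queue[0])
        if group and used + need > capacity:
            break
        group.append(queue.popleft())
        used += need
    return group


def schedule(tasks, capacity):
    """Split the task queue into groups that are submitted one after another."""
    queue = deque(tasks)
    groups = []
    while queue:
        groups.append(next_group(queue, capacity))
    return groups
```

Calling schedule(tasks, capacity) with a list of Task objects and a memory budget in bytes returns the groups in submission order. A real scheduler would also have to account for framework overhead and fragmentation, which this sketch ignores.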

Original language: English
Title of host publication: Proceedings of SC 2022
Subtitle of host publication: International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Computer Society
ISBN (Electronic): 9781665454445
State: Published - 2022
Event: 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022 - Dallas, United States
Duration: 13 Nov 2022 to 18 Nov 2022

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
Volume: 2022-November
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Conference

Conference: 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022
Country/Territory: United States
City: Dallas
Period: 13/11/22 to 18/11/22

Keywords

  • Concurrent Training
  • Estimation Model
  • GPU
  • Graph Neural Networks
  • Task Scheduling
