
PrecisionProbe: Non-intrusive Performance Analysis Tool for Deep Learning Recommendation Models

  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Deep learning recommendation models (DLRMs) exploit user behaviors such as clicks, browsing footprints, and preferences to improve personalized experiences. However, as user data grows exponentially, such models require ever more GPU resources, which are both unaffordable and insufficient in a computing cluster. To improve GPU utilization and facilitate advances in GPU scheduling algorithms, we present PrecisionProbe, a non-intrusive monitoring and analysis tool that runs on Kubernetes and conducts sophisticated analytics of GPU resource utilization without altering the existing training code. PrecisionProbe captures fine-grained GPU metrics at the level of individual model layers, and exploring these detailed metrics gives a precise understanding of resource consumption patterns. This mechanism is crucial for devising effective GPU scheduling algorithms, particularly algorithms tailored to DLRM training jobs and informed by their consumption patterns. Experimental results show that recommendation models, as opposed to CV and NLP models, utilize less FP32 processing but have higher memory interaction frequencies. These findings indicate the unique resource needs of recommendation systems and motivate performance analysis with PrecisionProbe.

Original language: English
Title of host publication: Proceedings - 2024 IEEE International Conference on Joint Cloud Computing, JCC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 17-20
Number of pages: 4
ISBN (Electronic): 9798350387339
DOIs
State: Published - 2024
Event: 15th IEEE International Conference on Joint Cloud Computing, JCC 2024 - Shanghai, China
Duration: 17 Jul 2024 to 18 Jul 2024

Publication series

Name: Proceedings - 2024 IEEE International Conference on Joint Cloud Computing, JCC 2024

Conference

Conference: 15th IEEE International Conference on Joint Cloud Computing, JCC 2024
Country/Territory: China
City: Shanghai
Period: 17/07/24 to 18/07/24

Keywords

  • Cloud Computing
  • Deep Recommendation Training
  • Kubernetes
  • Performance Analysis
