
SimTrace: Exploiting Spatial and Temporal Sampling for Large-Scale Performance Analysis

  • Beihang University

Research output: Contribution to journal › Article › peer-review

Abstract

MPI tracing tools are essential for collecting the communication events and performance metrics of large-scale programs for subsequent performance analysis and optimization. However, as systems approach the exascale era, the performance and storage overhead of tracing becomes so prohibitive that it significantly disturbs the original execution of MPI programs, producing distorted tracing data and thus misleading analysis results. Although process sampling can effectively reduce the tracing overhead, it can easily miss important execution information that is necessary for subsequent performance analysis. In this article, we propose SimTrace, a scalable MPI tracing tool with novel spatial and temporal sampling strategies that exploit the similarity among MPI processes to achieve low tracing overhead while still obtaining sufficient tracing information. The experimental results demonstrate that SimTrace significantly reduces MPI tracing overhead compared to state-of-the-art tracing tools, while still enabling effective analysis to guide the performance optimization of large-scale programs.

Original language: English
Article number: 55
Journal: ACM Transactions on Architecture and Code Optimization
Volume: 22
Issue number: 2
State: Published - 30 Jun 2025

Keywords

  • Large-scale
  • MPI tracing
  • performance analysis
  • sampling
