TY - JOUR
T1 - SyncNOVA
T2 - an end-to-end fine-grained profiling tool oN lOck behaVior detection and critical section diAgnosis
AU - Feng, Wentao
AU - Shang, Shizhe
AU - Li, Pengfei
AU - Yang, Hailong
AU - Luan, Zhongzhi
AU - Qian, Depei
N1 - Publisher Copyright:
© The Author(s) 2025.
PY - 2025/4
Y1 - 2025/4
N2 - Lock-related synchronization performance issues, such as overly large critical sections and improper lock usage, are inevitable in scientific computing. Even skilled programmers struggle with the complicated reports of existing lock behavior profilers, let alone the scientists who make up most scientific computing programmers. In addition, ARM-based supercomputers are appearing on the TOP500 list, yet ARM-supported lock behavior profiling tools have not received the attention they deserve. Based on a “one step for all” workflow comprising problem identification, problem analysis, and solution generation, this paper presents an end-to-end, fine-grained lock behavior profiling tool that supports both ARM and x86 architectures. Specifically, this paper introduces a priority function that quantifies the priority of distinct solutions, with metric weights that users can adjust. Compared with existing work that relies on library interception and replacement or on x86-based analysis frameworks, its fine-grained analysis, highly usable reports, high portability, and strong compatibility make it an efficient tool for scientific computing programmers to find and optimize lock-related performance bugs.
AB - Lock-related synchronization performance issues, such as overly large critical sections and improper lock usage, are inevitable in scientific computing. Even skilled programmers struggle with the complicated reports of existing lock behavior profilers, let alone the scientists who make up most scientific computing programmers. In addition, ARM-based supercomputers are appearing on the TOP500 list, yet ARM-supported lock behavior profiling tools have not received the attention they deserve. Based on a “one step for all” workflow comprising problem identification, problem analysis, and solution generation, this paper presents an end-to-end, fine-grained lock behavior profiling tool that supports both ARM and x86 architectures. Specifically, this paper introduces a priority function that quantifies the priority of distinct solutions, with metric weights that users can adjust. Compared with existing work that relies on library interception and replacement or on x86-based analysis frameworks, its fine-grained analysis, highly usable reports, high portability, and strong compatibility make it an efficient tool for scientific computing programmers to find and optimize lock-related performance bugs.
KW - ARM
KW - Fine-grained analysis
KW - Lock behavior detection
KW - Synchronization performance bottlenecks
UR - https://www.scopus.com/pages/publications/105003461268
U2 - 10.1007/s42514-024-00210-1
DO - 10.1007/s42514-024-00210-1
M3 - Article
AN - SCOPUS:105003461268
SN - 2524-4922
VL - 7
SP - 100
EP - 113
JO - CCF Transactions on High Performance Computing
JF - CCF Transactions on High Performance Computing
IS - 2
ER -