TY - GEN
T1 - MemGaze
T2 - 2022 IEEE International Conference on Cluster Computing, CLUSTER 2022
AU - Kilic, Ozgur O.
AU - Tallent, Nathan R.
AU - Suriyakumar, Yasodha
AU - Xie, Chenhao
AU - Marquez, Andres
AU - Eranian, Stephane
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - A challenge of memory trace analysis is combining detailed analysis and low overhead measurement. Currently, hardware/software-based analysis of load-level sequences easily incurs time slowdowns of 100x. We present MemGaze, a tool for low-overhead, high-resolution memory trace analysis. MemGaze uses Intel's Processor Tracing (PT) instruction ptwrite to collect sampled and compressed memory address traces for load-level, sequence-aware analysis of data reuse. We describe multi-resolution analysis for locations vs. operations, accesses vs. spatio-temporal reuse, and reuse (distance, rate, volume) vs. access patterns. Both trace size and resolution are controllable. We use MemGaze to elucidate the memory effects of different data structures and algorithms. For sampled traces that are ~1% of a full one, analysis metrics have 1-25% MAPE for histograms of varying dynamic sequence lengths. With current suboptimal kernel support (PT runs continuously), MemGaze's time overhead is typically 10-95%; 7x at worst. However, when PT runs only during samples, overhead is 10-35% on memory-intensive regions and correlates with executed ptwrites.
AB - A challenge of memory trace analysis is combining detailed analysis and low overhead measurement. Currently, hardware/software-based analysis of load-level sequences easily incurs time slowdowns of 100x. We present MemGaze, a tool for low-overhead, high-resolution memory trace analysis. MemGaze uses Intel's Processor Tracing (PT) instruction ptwrite to collect sampled and compressed memory address traces for load-level, sequence-aware analysis of data reuse. We describe multi-resolution analysis for locations vs. operations, accesses vs. spatio-temporal reuse, and reuse (distance, rate, volume) vs. access patterns. Both trace size and resolution are controllable. We use MemGaze to elucidate the memory effects of different data structures and algorithms. For sampled traces that are ~1% of a full one, analysis metrics have 1-25% MAPE for histograms of varying dynamic sequence lengths. With current suboptimal kernel support (PT runs continuously), MemGaze's time overhead is typically 10-95%; 7x at worst. However, when PT runs only during samples, overhead is 10-35% on memory-intensive regions and correlates with executed ptwrites.
KW - MemGaze
KW - footprint
KW - memory access patterns
KW - memory access tracing
KW - processor tracing
KW - spatio-temporal reuse
UR - https://www.scopus.com/pages/publications/85140880209
U2 - 10.1109/CLUSTER51413.2022.00058
DO - 10.1109/CLUSTER51413.2022.00058
M3 - Conference contribution
AN - SCOPUS:85140880209
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 484
EP - 495
BT - Proceedings - 2022 IEEE International Conference on Cluster Computing, CLUSTER 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 6 September 2022 through 9 September 2022
ER -