TY - GEN
T1 - Paddy
T2 - 2020 IEEE/IFIP Network Operations and Management Symposium, NOMS 2020
AU - Huang, Shaohan
AU - Liu, Yi
AU - Fung, Carol
AU - He, Rong
AU - Zhao, Yining
AU - Yang, Hailong
AU - Luan, Zhongzhi
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/4
Y1 - 2020/4
N2 - Large enterprise systems often produce a large volume of event logs, and event log parsing is an important log management task. The goal of log parsing is to construct log templates from log messages and convert raw log messages into structured log messages. A log parser can help engineers monitor their systems and detect anomalous behaviors and errors. Most existing log parsing methods are offline and require all log data to be available before parsing. In addition, the massive volume of log messages makes the process complex and time-consuming. In this paper, we propose Paddy, an online event log parsing method. Paddy uses a dynamic dictionary structure to build an inverted index, which can search the template candidates efficiently with a high rate of recall. Ranking candidates by Jaccard similarity and a length feature can improve parsing precision. We evaluated our proposed method on 16 real log datasets from various sources including distributed systems, supercomputers, operating systems, mobile systems, and standalone software. Our experimental results demonstrate that Paddy achieves the highest accuracy on eight of the sixteen datasets compared with the baseline methods. We also evaluated the robustness and runtime efficiency of the methods, and the experimental results show that Paddy achieves superior stability and scales to large volumes of log messages.
AB - Large enterprise systems often produce a large volume of event logs, and event log parsing is an important log management task. The goal of log parsing is to construct log templates from log messages and convert raw log messages into structured log messages. A log parser can help engineers monitor their systems and detect anomalous behaviors and errors. Most existing log parsing methods are offline and require all log data to be available before parsing. In addition, the massive volume of log messages makes the process complex and time-consuming. In this paper, we propose Paddy, an online event log parsing method. Paddy uses a dynamic dictionary structure to build an inverted index, which can search the template candidates efficiently with a high rate of recall. Ranking candidates by Jaccard similarity and a length feature can improve parsing precision. We evaluated our proposed method on 16 real log datasets from various sources including distributed systems, supercomputers, operating systems, mobile systems, and standalone software. Our experimental results demonstrate that Paddy achieves the highest accuracy on eight of the sixteen datasets compared with the baseline methods. We also evaluated the robustness and runtime efficiency of the methods, and the experimental results show that Paddy achieves superior stability and scales to large volumes of log messages.
KW - Dynamic Dictionary
KW - Log Parsing
KW - Log analysis
UR - https://www.scopus.com/pages/publications/85086758624
U2 - 10.1109/NOMS47738.2020.9110435
DO - 10.1109/NOMS47738.2020.9110435
M3 - Conference contribution
AN - SCOPUS:85086758624
T3 - Proceedings of IEEE/IFIP Network Operations and Management Symposium 2020: Management in the Age of Softwarization and Artificial Intelligence, NOMS 2020
BT - Proceedings of IEEE/IFIP Network Operations and Management Symposium 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 April 2020 through 24 April 2020
ER -