TY - GEN
T1 - ECLIPSE
T2 - 34th ACM International Conference on Information and Knowledge Management, CIKM 2025
AU - Zhang, Wei
AU - Cheng, Xianfu
AU - Li, Xiang
AU - Yang, Jian
AU - Zhang, Liying
AU - Guan, Xiangyuan
AU - Li, Zhoujun
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/11/10
Y1 - 2025/11/10
N2 - Log parsing is essential in software engineering but is challenged by the immense complexity of log templates and diverse cross-platform and cross-lingual log semantics and structures in industrial logs. We propose ECLIPSE, an Efficient Cross-platform and Cross-lingual Log Intelligent Parsing framework with Semantic Entropy-Enhanced Longest Common Subsequence algorithm in industrial Environments. ECLIPSE leverages large language models to extract log keywords and maintains a dynamic dictionary mapping these keywords to log templates. When parsing, it retrieves candidate templates based on the keywords and log length. We design an algorithm named Semantic Entropy-Enhanced Longest Common Subsequence (Entropy-ELCS) for identifying the best template, improving token-level accuracy by incorporating information entropy and semantic elements into the longest common subsequence algorithm. The dictionary is updated with new keywords and templates for continuous improvement. Experiments on public benchmarks and our industrial log parsing benchmark ECLIPSE-BENCH demonstrate that ECLIPSE achieves strong performance and superior efficiency, especially when handling large template sets.
KW - ai for it operations
KW - cross-platform and cross-lingual
KW - industrial log parsing system
KW - information entropy
KW - large language model
UR - https://www.scopus.com/pages/publications/105023190482
U2 - 10.1145/3746252.3761231
DO - 10.1145/3746252.3761231
M3 - Conference contribution
AN - SCOPUS:105023190482
T3 - CIKM 2025 - Proceedings of the 34th ACM International Conference on Information and Knowledge Management
SP - 4191
EP - 4201
BT - CIKM 2025 - Proceedings of the 34th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery, Inc
Y2 - 10 November 2025 through 14 November 2025
ER -