跳到主要导航 跳到搜索 跳到主要内容

Sequential pattern mining algorithm based on text data: Taking the fault text records as an example

科研成果: 期刊稿件文章同行评审

摘要

Sequential pattern mining (SPM) is an effective and important method for analyzing time series. This paper proposed a SPM algorithm to mine fault sequential patterns in text data. Because the structure of text data is poor and there are many different forms of text expression for the same concept, the traditional SPM algorithm cannot be directly applied to text data. The proposed algorithm is designed to solve this problem. First, this study measured the similarity of fault text data and classified similar faults into one class. Next, this paper proposed a new text similarity measurement model based on the word embedding distance. Compared with the classic text similarity measurement method, this model can achieve good results in short text classification. Then, on the basis of fault classification, this paper proposed the SPM algorithm with an event window, which is a time soft constraint for obtaining a certain number of sequential patterns according to needs. Finally, this study used the fault text records of a certain aircraft as experimental data for mining fault sequential patterns. Experiment showed that this algorithm can effectively mine sequential patterns in text data. The proposed algorithm can be widely applied to text time series data in many fields such as industry, business, finance and so on.

源语言英语
文章编号4330
期刊Sustainability (Switzerland)
10
11
DOI
出版状态已出版 - 2018

联合国可持续发展目标

此成果有助于实现下列可持续发展目标:

  1. 可持续发展目标 7 - 经济适用的清洁能源
    可持续发展目标 7 经济适用的清洁能源

指纹

探究 'Sequential pattern mining algorithm based on text data: Taking the fault text records as an example' 的科研主题。它们共同构成独一无二的指纹。

引用此