TY - GEN
T1 - Network traffic classification using K-means clustering
AU - Yingqiu, Liu
AU - Wei, Li
AU - Yunchun, Li
PY - 2007
Y1 - 2007
N2 - Network traffic classification and application identification provide important benefits for IP network engineering, management, control, and other key domains. Current popular methods, such as port-based and payload-based classification, have shown some disadvantages, and the machine learning-based method is a promising alternative, in which traffic is classified according to payload-independent statistical characteristics. This paper introduces the different levels of network traffic analysis and the relevant background in the machine learning domain, and analyzes the problems of port-based and payload-based methods in traffic classification. Considering the advantages of the machine learning-based method, we experiment with unsupervised K-means to evaluate its efficiency and performance. We adopt feature selection to find an optimal feature set and a log transformation to improve accuracy. The experimental results on different datasets show that the method can obtain up to 80% overall accuracy and that, after a log transformation, the accuracy improves to 90% or more.
AB - Network traffic classification and application identification provide important benefits for IP network engineering, management, control, and other key domains. Current popular methods, such as port-based and payload-based classification, have shown some disadvantages, and the machine learning-based method is a promising alternative, in which traffic is classified according to payload-independent statistical characteristics. This paper introduces the different levels of network traffic analysis and the relevant background in the machine learning domain, and analyzes the problems of port-based and payload-based methods in traffic classification. Considering the advantages of the machine learning-based method, we experiment with unsupervised K-means to evaluate its efficiency and performance. We adopt feature selection to find an optimal feature set and a log transformation to improve accuracy. The experimental results on different datasets show that the method can obtain up to 80% overall accuracy and that, after a log transformation, the accuracy improves to 90% or more.
UR - https://www.scopus.com/pages/publications/46449125940
U2 - 10.1109/IMSCCS.2007.4392626
DO - 10.1109/IMSCCS.2007.4392626
M3 - Conference contribution
AN - SCOPUS:46449125940
SN - 0769530397
SN - 9780769530390
T3 - Proceedings - 2nd International Multi-Symposiums on Computer and Computational Sciences, IMSCCS'07
SP - 360
EP - 365
BT - Proceedings - 2nd International Multi-Symposiums on Computer and Computational Sciences, IMSCCS'07
T2 - 2nd International Multi-Symposiums on Computer and Computational Sciences 2007, IMSCCS'07
Y2 - 13 August 2007 through 15 August 2007
ER -