
Research on sampling method of CFSFDP clustering algorithm and its criteria for determining the best sample size

  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Clustering by fast search and find of density peaks (CFSFDP) is a novel density-based fast clustering method that has been widely studied and applied in many fields. However, when the sample size is very large, the algorithm becomes inefficient because it consumes a great deal of time and storage space. To address this problem, a simple random sampling (SRS) method is proposed to speed up the optimized CFSFDP algorithm on real data with large sample sizes. The rate of correct classification of the sample, which we call the sampling accuracy, is defined to measure clustering performance. We first use the SRS method to generate small samples for cluster analysis. Then, we explore the relationship between the sampling rate and the sampling accuracy. Finally, to determine the best sample size, meaning one that achieves high sampling accuracy with high efficiency, the mean and standard deviation of the sampling accuracy are adopted as two criteria, and the best sample size is determined based on them. A real case study is given to show the implementation and effectiveness of the proposed method.
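
The workflow described in the abstract (draw simple random samples, cluster each sample, then summarize the sampling accuracy by its mean and standard deviation at each sampling rate) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not specify the "optimized" CFSFDP variant or the exact accuracy formula, so the code uses a basic density-peaks routine with a Gaussian kernel and scores accuracy against known labels under the best cluster-to-class mapping. All names (`density_peaks`, `accuracy`, `sampling_study`, `make_blobs`, `dc`) and the synthetic data are hypothetical.

```python
import itertools
import math
import random
import statistics


def density_peaks(points, n_clusters, dc):
    """Basic CFSFDP-style clustering (assumed variant); returns a label per point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho with a Gaussian kernel of width dc (cutoff distance).
    rho = [sum(math.exp(-(d / dc) ** 2) for d in row) for row in dist]
    order = sorted(range(n), key=lambda i: -rho[i])  # densest first
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # usual convention for the global peak
            continue
        # delta = distance to the nearest point of higher density.
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j
    # Cluster centers: points with the largest rho * delta.
    centers = sorted(range(n), key=lambda i: -rho[i] * delta[i])[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:  # a parent is always labeled before its children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


def accuracy(pred, truth, n_clusters):
    """Assumed definition: share of points correct under the best label mapping."""
    best = 0
    for perm in itertools.permutations(range(n_clusters)):
        best = max(best, sum(perm[p] == t for p, t in zip(pred, truth)))
    return best / len(truth)


def sampling_study(points, truth, rates, n_clusters, dc, repeats=10, seed=0):
    """Mean and population std of sampling accuracy for each sampling rate."""
    rng = random.Random(seed)
    results = {}
    for rate in rates:
        size = min(len(points), max(n_clusters, round(rate * len(points))))
        scores = []
        for _ in range(repeats):
            idx = rng.sample(range(len(points)), size)  # SRS without replacement
            labels = density_peaks([points[i] for i in idx], n_clusters, dc)
            scores.append(accuracy(labels, [truth[i] for i in idx], n_clusters))
        results[rate] = (statistics.mean(scores), statistics.pstdev(scores))
    return results


def make_blobs(n_per_cluster, centers, spread, seed=0):
    """Synthetic 2-D Gaussian blobs standing in for real labeled data."""
    rng = random.Random(seed)
    points, truth = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            truth.append(label)
    return points, truth


points, truth = make_blobs(100, [(0, 0), (6, 0), (3, 6)], spread=0.6)
results = sampling_study(points, truth, rates=[0.1, 0.2, 0.5],
                         n_clusters=3, dc=1.0, repeats=5)
for rate, (mean_acc, std_acc) in results.items():
    print(f"rate={rate:.1f} mean={mean_acc:.3f} std={std_acc:.3f}")
```

In a study of this shape, a candidate "best" sample size would be the smallest sampling rate whose mean accuracy is high and whose standard deviation is low; the thresholds for both are a judgment call that the abstract does not state.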

Original language: English
Title of host publication: ICAAI 2018 - 2018 the 2nd International Conference on Advances in Artificial Intelligence
Publisher: Association for Computing Machinery
Pages: 24-28
Number of pages: 5
ISBN (Electronic): 9781450365833
DOI
Publication status: Published - 6 Oct 2018
Event: 2nd International Conference on Advances in Artificial Intelligence, ICAAI 2018 - Barcelona, Spain
Duration: 6 Oct 2018 to 8 Oct 2018

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2nd International Conference on Advances in Artificial Intelligence, ICAAI 2018
Country/Territory: Spain
City: Barcelona
Period: 6/10/18 to 8/10/18
