Adaptive density peak clustering algorithm

ZHANG Qiang; ZHOU Shuisheng; ZHANG Ying

doi:10.19665/j.issn1001-2400.20230604

Computer Science and Technology & Cyberspace Security | Updated: 2024-06-03

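- Adaptive density peak clustering algorithm
- Journal of Xidian University, 2024, Vol. 51, No. 2, pp. 170-181
- Affiliation:

  School of Mathematics and Statistics, Xidian University, Xi'an 710071, Shaanxi, China
- About the authors:

  ZHANG Qiang (born 1996), male, master's student at Xidian University, E-mail: [email protected]
  ZHOU Shuisheng (born 1972), male, professor, E-mail: [email protected]
  ZHANG Ying (born 1996), male, doctoral student at Xidian University, E-mail: [email protected]
- Funding:

  National Natural Science Foundation of China (61772020)
- DOI: 10.19665/j.issn1001-2400.20230604
  CLC number: TP391
- Print publication date: 2024-4-20

  Online publication date: 2023-9-20

  Received: 2023-4-8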
Qiang ZHANG, Shuisheng ZHOU, Ying ZHANG. Adaptive density peak clustering algorithm[J]. Journal of Xidian University, 2024, 51(2): 170-181. DOI: 10.19665/j.issn1001-2400.20230604.

Abstract

Density peak clustering (DPC) is widely used because it is simple and efficient. However, it has two shortcomings: ① for data sets whose clusters have uneven density or are imbalanced, the true cluster centers are hard to identify in the decision graph that DPC provides; ② a "chain effect" occurs, in which misassigning the highest-density point of a region causes every point in that region to be assigned to the same wrong cluster. To address these two shortcomings, a new concept of natural neighbor (NaN) is introduced and a density peak clustering algorithm based on natural neighbors (DPC-NaN) is proposed. The algorithm uses a new natural-neighborhood density to identify noise points, selects the initial pre-clustering centers, and assigns the non-noise points by the density peak method to obtain pre-clusters. It then determines the boundary points and merging radius of the pre-clusters and adaptively merges the pre-clusters into the final clusters. The proposed algorithm requires no manually preset parameters and alleviates the chain effect. Experimental results show that, compared with related clustering algorithms, the proposed algorithm obtains better clustering results on typical data sets and performs well in image segmentation.
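For readers unfamiliar with the baseline, the following is a minimal sketch of the original DPC procedure that the abstract criticizes, not of the proposed DPC-NaN. It assumes a Gaussian-kernel density with a hand-chosen cutoff distance `dc`, and picks centers as the points with the largest product of density and distance (the quantities plotted in the decision graph). The function names `dpc_quantities` and `dpc_assign`, the toy data, and the value of `dc` are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def dpc_quantities(X, dc):
    """Compute the two decision-graph quantities of baseline DPC.

    rho[i]    : local density of point i (Gaussian kernel, cutoff dc is an assumption)
    delta[i]  : distance from i to its nearest point of higher density
    parent[i] : index of that nearest denser point (-1 for the global peak)
    order     : point indices sorted from densest to sparsest
    """
    n = len(X)
    # Pairwise Euclidean distances.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Subtract the self term exp(0) = 1 from each row sum.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # Convention: the global density peak gets its largest distance.
            delta[i] = dist[i].max()
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        parent[i] = j
    return rho, delta, parent, order

def dpc_assign(parent, order, centers):
    """Give every non-center point the label of its nearest denser point."""
    labels = np.full(len(parent), -1)
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    if labels[order[0]] == -1:
        raise ValueError("the densest point must be one of the centers")
    # Visiting points from densest to sparsest guarantees that the parent
    # is already labeled when its child is reached.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

# Toy data (hypothetical): two well-separated blobs of 30 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(10.0, 0.5, (30, 2))])
rho, delta, parent, order = dpc_quantities(X, dc=1.0)
centers = np.argsort(-(rho * delta))[:2]  # two largest rho * delta values
labels = dpc_assign(parent, order, centers)
```

Because each point simply inherits the label of its `parent`, a single wrong assignment near the top of a dense region is copied to every point that descends from it, which is the chain effect described above. The sketch also shows why `dc` and the number of centers normally have to be chosen by hand in the baseline method.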

Keywords

clustering; density peak clustering; natural neighbor; image segmentation
