author:

Cheng, Fang (Cheng, Fang.) [1]

Indexed by:

EI Scopus

Abstract:

With distributed data now in widespread use, traditional clustering algorithms can no longer meet the demands of processing massive data in either accuracy or computational efficiency, so clustering algorithms built on the Spark distributed platform have become a research hotspot. The K-means algorithm is widely used in both academic research and business because it is easy to implement and highly scalable. However, traditional K-means relies on Euclidean distance, which is not applicable in some scenarios, and it is inefficient on large-scale data. This study implements and tests the K-means, K-means++, and Canopy + K-means algorithms with both Euclidean distance and Manhattan distance on the Spark distributed platform and analyses how their performance changes. The experimental results show that introducing Manhattan distance lengthens clustering time and that the optimisation effect of each algorithm changes in a different way. © 2024 IEEE.
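
The contrast between the two distance measures named in the abstract can be illustrated with a minimal single-machine sketch. This is not the paper's Spark implementation: the function names (`euclidean`, `manhattan`, `assign`, `kmeans`), the random seeding, and the choice to keep the mean as the centroid update under Manhattan distance are all assumptions made for illustration. The record does not say how the author handled the centroid update, and neither K-means++ seeding nor Canopy pre-clustering is shown.

```python
import math
import random


def euclidean(p, q):
    """Straight-line (L2) distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """City-block (L1) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def assign(points, centroids, distance):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: distance(p, centroids[i]))
        for p in points
    ]


def kmeans(points, k, distance, iterations=20, seed=0):
    """Plain K-means with a pluggable distance function.

    Assumption: centroids are updated as the coordinate-wise mean
    regardless of the distance used, so only the assignment step
    differs between the Euclidean and Manhattan variants.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random seeding, not K-means++
    for _ in range(iterations):
        labels = assign(points, centroids, distance)
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, assign(points, centroids, distance)
```

Calling `kmeans(points, 2, manhattan)` instead of `kmeans(points, 2, euclidean)` swaps the metric without touching the rest of the loop, which is the kind of substitution the study evaluates at scale.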

Keyword:

K-means clustering

Community:

  • [1] [Cheng, Fang] Fuzhou University, Fujian 350000, China

Reprint Author's Address:

  • To be verified

Year: 2024

Page: 15-20

Language: English
