Abstract:
Many practical applications pose the challenge of utilizing instance-level background knowledge (e.g., subsets of similar or dissimilar data points) within their input data to improve clustering results. In this work, we build on the widely adopted k-center clustering, modeling its input instance-level background knowledge as must-link (ML) and cannot-link (CL) constraint sets, and formulate the constrained k-center problem. Given the long-standing challenge of developing efficient algorithms for constrained clustering problems, we first derive an efficient approximation algorithm for constrained k-center that achieves the best possible approximation ratio of 2 via linear programming (LP) rounding. Recognizing the limitations of LP-rounding algorithms, including high runtime complexity and difficulty of parallelization, we then develop a greedy algorithm that does not rely on the LP and can be efficiently parallelized. This algorithm achieves the same approximation ratio of 2 with lower runtime complexity. Finally, we empirically evaluate our approximation algorithm against baselines on various real datasets, validating our theoretical findings and demonstrating significant advantages of our algorithm in clustering cost, quality, and runtime.
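For background, the sketch below shows the classic farthest-point (Gonzalez) greedy heuristic for the unconstrained k-center problem, which is the standard baseline attaining the same best-possible ratio of 2. It is not the paper's constrained algorithm and it ignores ML/CL constraints entirely; the function name, the Euclidean metric, and the toy data are illustrative assumptions.

```python
import numpy as np

def greedy_k_center(points: np.ndarray, k: int, seed: int = 0) -> list[int]:
    """Farthest-point (Gonzalez) greedy heuristic for unconstrained k-center.

    Returns the indices of k chosen centers. The maximum point-to-center
    distance is at most twice the optimum under any metric; Euclidean
    distance is used here for simplicity. This sketch does NOT handle
    must-link / cannot-link constraints.
    """
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    centers = [int(rng.integers(n))]              # start from an arbitrary point
    # dist[i] = distance from point i to its nearest chosen center so far
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(1, k):
        nxt = int(np.argmax(dist))                # farthest point becomes the next center
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers

if __name__ == "__main__":
    X = np.random.default_rng(1).random((200, 2))  # toy 2-D data
    print(greedy_k_center(X, k=4))
```

A common way to incorporate must-link constraints is to contract each must-link group into a single unit before clustering, with cannot-link pairs restricting assignments; how the paper's LP-rounding and parallelizable greedy algorithms handle both constraint types while preserving the ratio of 2 is described in the full text.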
Source: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
ISSN: 2162-237X
Year: 2025
Impact Factor: 10.200 (JCR@2023)