EXPONENTIALLY CONSISTENT NONPARAMETRIC CLUSTERING OF DATA STREAMS WITH COMPOSITE DISTRIBUTIONS
Journal
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN
1520-6149
Date Issued
2024-01-01
Author(s)
Abstract
This paper focuses on nonparametric clustering of data streams generated from unknown distributions. Existing results on exponentially consistent nonparametric clustering assume that the maximum intra-cluster distance (dL) is smaller than the minimum inter-cluster distance (dH). We show that exponential consistency can be achieved for single-linkage-based (SLINK) clustering under a less strict assumption, dI < dH, where dI is the maximum intra-cluster nearest neighbour distance. Note that dI < dL in general. Then, we propose a sequential clustering algorithm based on SLINK. Simulation results show that the sequential SLINK algorithm requires a smaller expected number of samples than the fixed-sample-size SLINK algorithm for the same probability of error. We also identify examples where k-medoids clustering is unable to find the true clusters, but SLINK is exponentially consistent.
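The condition dI < dH can be illustrated with a minimal sketch of fixed-sample-size single-linkage clustering of data streams. This is not the paper's implementation: the Kolmogorov-Smirnov distance between empirical distributions, the Gaussian streams, the helper names `ks_distance` and `slink`, and all parameter values are assumptions chosen for illustration. Each cluster is a "chain" of distributions, so nearest neighbours inside a cluster are close (small dI) while the two ends of a cluster are farther apart than the gap between clusters (dL > dH).

```python
import bisect
import random


def ks_distance(x, y):
    """Two-sample Kolmogorov-Smirnov distance between empirical CDFs.

    Assumed distance for illustration; any distance between empirical
    distributions could be substituted.
    """
    xs, ys = sorted(x), sorted(y)
    gap = 0.0
    # The supremum of |Fx - Fy| is attained at one of the sample points.
    for t in xs + ys:
        fx = bisect.bisect_right(xs, t) / len(xs)
        fy = bisect.bisect_right(ys, t) / len(ys)
        gap = max(gap, abs(fx - fy))
    return gap


def slink(streams, k, dist=ks_distance):
    """Single-linkage clustering of streams into k clusters.

    Merges the closest pair of clusters repeatedly (Kruskal-style, with a
    union-find structure) until k clusters remain. Returns one label per
    stream, numbered in order of first appearance.
    """
    m = len(streams)
    parent = list(range(m))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (dist(streams[i], streams[j]), i, j)
        for i in range(m)
        for j in range(i + 1, m)
    )
    clusters = m
    for _, i, j in edges:
        if clusters == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            clusters -= 1

    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(m)]


# Hypothetical composite clusters: unit-variance Gaussians whose means form
# two chains. Adjacent means differ by 1 (small dI); the clusters are
# separated by 3 (dH); the ends of each chain differ by 4 (dL > dH).
rng = random.Random(0)
means = [0, 1, 2, 3, 4, 7, 8, 9, 10, 11]
streams = [[rng.gauss(mu, 1.0) for _ in range(500)] for mu in means]
labels = slink(streams, k=2)
print(labels)
```

In this sketch single linkage only needs each stream to be closer to some member of its own cluster than to any stream in the other cluster, which is why it can tolerate dL > dH. A sequential variant, as proposed in the abstract, would instead keep drawing samples until a stopping rule is met; that rule is not specified here and is not sketched.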
Subjects