Abstract
Hartigan's Dip-test of unimodality has gained increasing interest in unsupervised learning over the past few years. It is free from complex parameterization and does not require an a priori assumption about the underlying distribution. A useful property is that the resulting Dip-values can be differentiated to find a projection axis that identifies multimodal structures in the data set. In this paper, we show how to apply the gradient not only with respect to the projection axis but also with respect to the data to improve the cluster structure. By tightly coupling the Dip-test with an autoencoder, we obtain an embedding that clearly separates all clusters in the data set. This method, called DipEncoder, is the basis of a novel deep clustering algorithm. Extensive experiments show that the DipEncoder is highly competitive with state-of-the-art methods.
Original language | English |
---|---|
Title of host publication | KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14 - 18, 2022 |
Editors | Aidong Zhang, Huzefa Rangwala |
Publisher | ACM |
Pages | 846-856 |
Number of pages | 11 |
ISBN (Electronic) | 9781450393850 |
DOIs | |
Publication status | Published - 2022 |
Austrian Fields of Science 2012
- 102033 Data mining
Keywords
- deep clustering
- dimensionality reduction
- Hartigan's dip-test