ITGC: Information-theoretic Grid-based Clustering

Sahar Behzadi Soheil, Hermann Hinterhauser (Contributor), Claudia Plant

Publications: Contribution to book, Contribution to proceedings, Peer Reviewed

Abstract

Grid-based clustering algorithms are well known for their efficiency, in particular their fast processing time. Density-based methods, on the other hand, are usually the best option for arbitrarily shaped data sets. A combination of grid-based and density-based methods that retains the advantages of both is therefore attractive. However, most algorithms in these categories require a set of parameters to be specified, and setting them appropriately is usually not trivial. We therefore propose an Information-Theoretic Grid-based Clustering (ITGC) algorithm that regards clustering as a data compression problem: neighbouring grid cells (clusters) are merged when doing so pays off in terms of compression cost. Our extensive experiments on synthetic and real-world data show the advantages of ITGC over well-known clustering algorithms.
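
The merge rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cost function below (a fixed per-cluster model cost plus the bits needed to encode each point's cell under a uniform density over the cluster's cells) is an assumed stand-in for the paper's compression cost, and the names grid_counts, cluster_cost, merge_cells, cell_size and model_bits are hypothetical.

```python
import math
from collections import Counter


def grid_counts(points, cell_size):
    """Assign each 2-D point to a grid cell and count the points per cell."""
    return Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points
    )


def cluster_cost(n_points, n_cells, total_points, model_bits):
    """Assumed coding cost of one cluster, in bits.

    Each point's cell is encoded with probability
    (n_points / total_points) / n_cells, i.e. a uniform density over the
    cluster's cells, plus a fixed model cost per cluster.
    """
    p = n_points / (total_points * n_cells)
    return model_bits - n_points * math.log2(p)


def merge_cells(points, cell_size=1.0, model_bits=8.0):
    """Greedily merge adjacent clusters while the total coding cost drops."""
    counts = grid_counts(points, cell_size)
    total = sum(counts.values())
    # Start with every non-empty cell as its own cluster.
    label = {cell: i for i, cell in enumerate(sorted(counts))}
    members = {i: [cell] for cell, i in label.items()}
    size = {i: counts[cell] for cell, i in label.items()}

    merged = True
    while merged:
        merged = False
        for cell in sorted(counts):
            # Checking right and upper neighbours covers every adjacent pair once.
            for dx, dy in ((1, 0), (0, 1)):
                nb = (cell[0] + dx, cell[1] + dy)
                if nb not in counts:
                    continue
                a, b = label[cell], label[nb]
                if a == b:
                    continue
                separate = cluster_cost(
                    size[a], len(members[a]), total, model_bits
                ) + cluster_cost(size[b], len(members[b]), total, model_bits)
                together = cluster_cost(
                    size[a] + size[b],
                    len(members[a]) + len(members[b]),
                    total,
                    model_bits,
                )
                if together < separate:  # merging pays off in bits
                    for c in members[b]:
                        label[c] = a
                    members[a].extend(members.pop(b))
                    size[a] += size.pop(b)
                    merged = True
    return label
```

Under this assumed cost, merging two neighbouring clusters of similar density saves one model cost at little or no extra data cost, while merging a dense cluster with a sparse one makes the data more expensive to encode, so the merge is rejected. The only remaining inputs in the sketch are the grid resolution and the per-cluster model cost, both of which are illustrative choices here.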

Original language: English
Title of host publication: EDBT 2019
Subtitle of host publication: 22nd International Conference on Extending Database Technology, Proceedings
Editors: Berthold Reinwald, Melanie Herschel, Irini Fundulaki, Zoi Kaoudi, Helena Galhardas, Carsten Binnig
Publisher: open proceedings
Pages: 618-621
Number of pages: 4
ISBN (Electronic): 978-3-89318-081-3
Publication status: Published - 2019

Publication series

Series: Advances in Database Technology

Austrian Fields of Science 2012

  • 102033 Data mining
