TY - GEN
T1 - ITGC: Information-theoretic Grid-based Clustering
AU - Behzadi Soheil, Sahar
AU - Plant, Claudia
A2 - Hinterhauser, Hermann
A2 - Reinwald, Berthold
A2 - Herschel, Melanie
A2 - Fundulaki, Irini
A2 - Kaoudi, Zoi
A2 - Galhardas, Helena
A2 - Binnig, Carsten
N1 - Publisher Copyright: © 2019 Copyright held by the owner/author(s).
PY - 2019
Y1 - 2019
N2 - Grid-based clustering algorithms are well known for their fast processing time. Density-based methods, on the other hand, are usually the best option for arbitrarily shaped data sets. A combination of grid-based and density-based methods that retains the advantages of both is therefore appealing. However, most algorithms in these categories require a set of parameters that are usually not trivial to set appropriately. We therefore propose an Information-Theoretic Grid-based Clustering (ITGC) algorithm that treats clustering as a data compression problem: neighbouring grid cells (clusters) are merged when doing so pays off in terms of compression cost. Our extensive experiments on synthetic and real-world data show the advantages of ITGC over well-known clustering algorithms.
AB - Grid-based clustering algorithms are well known for their fast processing time. Density-based methods, on the other hand, are usually the best option for arbitrarily shaped data sets. A combination of grid-based and density-based methods that retains the advantages of both is therefore appealing. However, most algorithms in these categories require a set of parameters that are usually not trivial to set appropriately. We therefore propose an Information-Theoretic Grid-based Clustering (ITGC) algorithm that treats clustering as a data compression problem: neighbouring grid cells (clusters) are merged when doing so pays off in terms of compression cost. Our extensive experiments on synthetic and real-world data show the advantages of ITGC over well-known clustering algorithms.
UR - https://openproceedings.org/html/pages/2019_edbt.html
UR - http://www.scopus.com/inward/record.url?scp=85064930596&partnerID=8YFLogxK
U2 - 10.5441/002/edbt.2019.70
DO - 10.5441/002/edbt.2019.70
M3 - Contribution to proceedings
T3 - Advances in Database Technology
SP - 618
EP - 621
BT - EDBT 2019
PB - OpenProceedings.org
ER -