TY - GEN
T1 - Parameter Selection for DBSCAN: Insights from Persistent Homology
AU - Beer, Anna
AU - Kuznetsova, Ekaterina
AU - Plant, Claudia
N1 - Publisher Copyright: © 2025 The Authors.
PY - 2025/10/1
Y1 - 2025/10/1
N2 - Density-based clustering algorithms like DBSCAN are highly effective but sensitive to parameter selection, particularly the neighborhood radius (ϵ) and the minimum number of neighboring points required to form a cluster (minPts). We analyze the influence of these parameter settings on the clustering outcome through the lens of persistent homology, a technique from topological data analysis. Persistent homology analyzes topological features, such as connected components and loops, across multiple spatial scales, improving clustering accuracy and robustness. We use the density-connectivity distance, a recent finding in the field, to fully automate our approach. In extensive experiments, we demonstrate how insights from persistent homology can help to identify optimal parameter values, and we introduce an approach to automate parameter selection for density-based clustering. The proposed technique allows DBSCAN and related algorithms to perform effectively on a large variety of datasets without any user input. It combines topological insights with clustering techniques to provide a foundation for robust, automated approaches to complex data analysis.
UR - https://www.scopus.com/pages/publications/105024495324
U2 - 10.3233/FAIA251192
DO - 10.3233/FAIA251192
M3 - Contribution to proceedings
VL - 413
T3 - Frontiers in Artificial Intelligence and Applications
SP - 3250
EP - 3257
BT - ECAI 2025
A2 - Lynce, Inês
A2 - Murano, Nello
A2 - Vallati, Mauro
A2 - Villata, Serena
A2 - Chesani, Federico
A2 - Milano, Michela
A2 - Omicini, Andrea
A2 - Dastani, Mehdi
PB - IOS Press
ER -