Open Access System for Information Sharing

Thesis

Constrained DBSCAN via Hyperparameter Optimization

Title
Constrained DBSCAN via Hyperparameter Optimization
Authors
김종원
Date Issued
2022
Publisher
Pohang University of Science and Technology (POSTECH)
Abstract
We study how to find hyperparameters for Density-Based Spatial Clustering of Applications with Noise (DBSCAN) that give good clustering performance while satisfying both cluster-level and instance-level constraints. DBSCAN is widely used because it can handle non-convex data with arbitrarily shaped clusters. It selects the number of clusters automatically, but this means it cannot incorporate prior knowledge such as an appropriate number of clusters. To bring prior knowledge into clustering, many works have transformed it into constraints. However, it is not straightforward to add a constraint on the number of clusters, which is a cluster-level constraint, to DBSCAN. Most previous work has focused on instance-level constraints (must-link, cannot-link, semi-supervised). More recently, optimization techniques have increasingly been used to add cluster-level constraints (for example, the maximum diameter of a cluster) through decision variables that affect overall clustering performance. Because DBSCAN is very sensitive to its hyperparameters, we propose Hyperparameter optimization with these two levels of Constraints for DBSCAN (HC-DBSCAN). We resolve several challenges concerning evaluation metrics for DBSCAN hyperparameter optimization. We introduce a new objective function for local optimization, the Penalized Davies-Bouldin score, which is efficient to compute, accounts for noise clusters, and serves as an appropriate measure of clustering performance. It builds on the idea of DBCV (Density-Based Clustering Validation) and on the idea that a local optimization method performs well enough within a region that satisfies the constraints. The objective function and the constraint functions are black-box, so we solve the problem with Alternating Direction Method of Multipliers Bayesian Optimization (ADMMBO). Numerical experiments on simulated and real datasets show the effectiveness of our approach: HC-DBSCAN is efficient, satisfies the constraints better than the benchmark, and shows good clustering performance.
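The abstract does not define the Penalized Davies-Bouldin score or spell out the ADMMBO procedure, so neither is reproduced here. As a rough illustration of the problem setting only, the sketch below searches DBSCAN's two hyperparameters for a setting that satisfies one cluster-level constraint (a target number of clusters) and a few instance-level constraints (must-link and cannot-link pairs). Everything in it is an assumption made for illustration: an exhaustive grid search stands in for ADMMBO, the noise fraction stands in for the thesis's objective, and the function names `dbscan`, `violations`, and `search` as well as the toy data are hypothetical.

```python
import math
from itertools import product


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    # Each neighborhood includes the point itself.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n  # None = not yet visited
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                frontier.extend(neighbors[j])  # only core points expand the cluster
    return labels


def violations(labels, k_target, must_link, cannot_link):
    """Count violated constraints: one cluster-level, the rest instance-level."""
    n_clusters = len({c for c in labels if c != -1})
    count = int(n_clusters != k_target)
    # A must-link pair is violated if either point is noise or the labels differ.
    count += sum(labels[a] == -1 or labels[a] != labels[b] for a, b in must_link)
    # A cannot-link pair is violated only if both points share a real cluster.
    count += sum(labels[a] != -1 and labels[a] == labels[b] for a, b in cannot_link)
    return count


def search(points, eps_grid, min_pts_grid, k_target, must_link, cannot_link):
    """Grid search (a stand-in for ADMMBO) over feasible hyperparameters."""
    best = None
    for eps, min_pts in product(eps_grid, min_pts_grid):
        labels = dbscan(points, eps, min_pts)
        if violations(labels, k_target, must_link, cannot_link) > 0:
            continue  # infeasible setting
        noise = labels.count(-1) / len(labels)  # stand-in objective, lower is better
        if best is None or noise < best[0]:
            best = (noise, eps, min_pts, labels)
    return best


# Toy data: two tight groups of four points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (2.5, 2.5)]
best = search(points, eps_grid=[0.05, 0.2, 10.0], min_pts_grid=[3],
              k_target=2, must_link=[(0, 1)], cannot_link=[(0, 4)])
print(best)
```

On this toy data the smallest radius labels every point as noise and the largest merges everything into a single cluster, so only the middle setting is feasible. That is the kind of hyperparameter sensitivity the abstract refers to, though a grid search like this does not scale the way a Bayesian optimization method is meant to.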
URI
http://postech.dcollection.net/common/orgView/200000597576
https://oasis.postech.ac.kr/handle/2014.oak/112161
Article Type
Thesis