A Rate-Distortion Theoretic View of Dirichlet Process Means Clustering
Masahiro KOBAYASHI Kazuho WATANABE
A - Abstracts of IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences (Japanese Edition)
Publication Date: 2017/12/01
Online ISSN: 1881-0195
Type of Manuscript: PAPER
clustering, Dirichlet process, rate-distortion curve, lossy compression
DP-means clustering was devised as an extension of K-means clustering. Given a penalty parameter, it automatically estimates the number of clusters from the data. However, it is not known how the estimated number of clusters varies with the penalty parameter, or how to choose a proper value for it. This study examines the relationship between DP-means and the rate-distortion curve and shows that the profile of the number of clusters approaches the rate-distortion curve in the high-dimensional limit. Numerical experiments verify that the penalty parameter behaves like the maximum distortion in the training data.
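
To make the role of the penalty parameter concrete, the following is a minimal sketch of the commonly described batch form of DP-means, written in plain Python. It is an illustration under stated assumptions, not necessarily the exact variant analyzed in the paper: the names `dp_means`, `sq_dist`, `mean`, and `lam` are hypothetical, squared Euclidean distance is assumed as the distortion measure, and the toy data are made up for the demonstration. A point whose squared distance to every current center exceeds `lam` opens a new cluster; otherwise it joins the nearest one, as in K-means.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dp_means(data, lam, max_iter=100):
    """Sketch of DP-means with penalty `lam` (assumed squared-distance units).

    Returns (centers, labels). The number of clusters is not fixed in
    advance; it is whatever len(centers) turns out to be.
    """
    centers = [mean(data)]      # start with one cluster at the global mean
    labels = [0] * len(data)
    for _ in range(max_iter):
        changed = False
        for i, x in enumerate(data):
            dists = [sq_dist(x, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > lam:
                # Too far from every center: open a new cluster at x.
                centers.append(list(x))
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Drop empty clusters, relabel contiguously, and recompute means.
        used = sorted(set(labels))
        remap = {old: new for new, old in enumerate(used)}
        labels = [remap[l] for l in labels]
        centers = [
            mean([x for x, l in zip(data, labels) if l == k])
            for k in range(len(used))
        ]
        if not changed:
            break
    return centers, labels


if __name__ == "__main__":
    # Toy data (made up): two well-separated groups of three points.
    toy = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    for lam in (0.1, 4.0, 1000.0):
        centers, _ = dp_means(toy, lam)
        print(lam, len(centers))
```

Sweeping `lam` and recording `len(centers)` gives the kind of cluster-count profile the abstract refers to: a small penalty yields many clusters, and a large penalty collapses the data into few.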