Energy Minimization over m-Branched Enumeration for Generalized Linear Subspace Clustering

Chao ZHANG  

IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.12   pp.2485-2492
Publication Date: 2019/12/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2019EDP7138
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
general subspace clustering,  energy minimization,  multi-branch enumeration,  

Full Text: FreePDF(832KB)

In this paper, we consider the clustering problem of independent general subspaces. That is, with given data points lay near or on the union of independent low-dimensional linear subspaces, we aim to recover the subspaces and assign the corresponding label to each data point. To settle this problem, we take advantages of both greedy strategy and energy minimization strategy to propose a simple yet effective algorithm based on the assumption that an m-branched (i.e., perfect m-ary) tree which is constructed by collecting m-nearest neighbor points in each node has a high probability of containing the near-exact subspace. Specifically, at first, subspace candidates are enumerated by multiple m-branched trees. Each tree starts with a data point and grows by collecting nearest neighbors in the breadth-first search order. Then, subspace proposals are further selected from the enumeration to initialize the energy minimization algorithm. Eventually, both the proposals and the labeling result are finalized by iterative re-estimation and labeling. Experiments with both synthetic and real-world data show that the proposed method can outperform state-of-the-art methods and is practical in real application.