[edit]
Deep Self-expressive Learning
Conference on Parsimony and Learning, PMLR 234:228-247, 2024.
Abstract
Self-expressive model is a method for clustering data drawn from a union of low-dimensional linear subspaces. It gains a lot of popularity due to its: 1) simplicity, based on the observation that each data point can be expressed as a linear combination of the other data points, 2) provable correctness under broad geometric and statistical conditions, and 3) many extensions for handling corrupted, imbalanced, and large-scale real data. This paper extends the self-expressive model to a Deep sELf-expressiVE model (DELVE) for handling the more challenging case that the data lie in a union of nonlinear manifolds. DELVE is constructed from stacking self-expressive layers, each of which maps each data point to a linear combination of the other data points, and can be trained via minimizing self-expressive losses. With such a design, the operator, architecture, and training of DELVE have the explicit interpretation of producing progressively linearized representations from the input data in nonlinear manifolds. Moreover, by leveraging existing understanding and techniques for self-expressive models, DELVE has a collection of benefits such as design choice by principles, robustness via specialized layers, and efficiency via specialized optimizers. We demonstrate on image datasets that DELVE can effectively perform data clustering, remove data corruptions, and handle large scale data.