New Algorithms for Learning Incoherent and Overcomplete Dictionaries

Sanjeev Arora; Rong Ge; Ankur Moitra

New Algorithms for Learning Incoherent and Overcomplete Dictionaries

Sanjeev Arora, Rong Ge, Ankur Moitra

Proceedings of The 27th Conference on Learning Theory, PMLR 35:779-806, 2014.

Abstract

In \em sparse recovery we are given a matrix A ∈\mathbbR^n\times m (“the dictionary”) and a vector of the form A X where X is \em sparse, and the goal is to recover X. This is a central notion in signal processing, statistics and machine learning. But in applications such as \em sparse coding, edge detection, compression and super resolution, the dictionary A is unknown and has to be learned from random examples of the form Y = AX where X is drawn from an appropriate distribution — this is the \em dictionary learning problem. In most settings, A is \em overcomplete: it has more columns than rows. This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman et al. (2012) who who gave an algorithm for the undercomplete case, which is rarely the case in applications. Our algorithm applies to \em incoherent dictionaries which have been a central object of study since they were introduced in seminal work of Donoho and Huo (1999). In particular, a dictionary is μ-incoherent if each pair of columns has inner product at most μ/ \sqrtn. The algorithm makes natural stochastic assumptions about the unknown sparse vector X, which can contain k ≤c \min(\sqrtn/μ\log n, m^1/2 - η) non-zero entries (for any η> 0). This is close to the best k allowable by the best sparse recovery algorithms \em even if one knows the dictionary A exactly. Moreover, both the running time and sample complexity depend on \log 1/ε, where εis the target accuracy, and so our algorithms converge very quickly to the true dictionary. Our algorithm can also tolerate substantial amounts of noise provided it is incoherent with respect to the dictionary (e.g., Gaussian). In the noisy setting, our running time and sample complexity depend polynomially on 1/ε, and this is necessary.

Cite this Paper

BibTeX


@InProceedings{pmlr-v35-arora14,
  title = 	 {New Algorithms for Learning Incoherent and Overcomplete Dictionaries },
  author = 	 {Arora, Sanjeev and Ge, Rong and Moitra, Ankur},
  booktitle = 	 {Proceedings of The 27th Conference on Learning Theory},
  pages = 	 {779--806},
  year = 	 {2014},
  editor = 	 {Balcan, Maria Florina and Feldman, Vitaly and Szepesvári, Csaba},
  volume = 	 {35},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Barcelona, Spain},
  month = 	 {13--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v35/arora14.pdf},
  url = 	 {https://proceedings.mlr.press/v35/arora14.html},
  abstract = 	 {In \em sparse recovery we are given a matrix A ∈\mathbbR^n\times m (“the dictionary”) and a vector of the form A X where X is \em sparse, and the goal is to recover X. This is a central notion in signal processing, statistics and machine learning. But in applications such as \em sparse coding, edge detection, compression and super resolution, the dictionary A is unknown and has to be learned from random examples of the form Y = AX where X is drawn from an appropriate distribution — this is the \em dictionary learning problem. In most settings, A is \em overcomplete: it has more columns than rows. This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman et al. (2012) who who gave an algorithm for the undercomplete case, which is rarely the case in applications. Our algorithm applies to \em incoherent dictionaries which have been a central object of study since they were introduced in seminal work of Donoho and Huo (1999). In particular, a dictionary is μ-incoherent if each pair of columns has inner product at most μ/ \sqrtn. The algorithm makes natural stochastic assumptions about the unknown sparse vector X, which can contain k ≤c \min(\sqrtn/μ\log n, m^1/2 - η) non-zero entries (for any η> 0). This is close to the best k allowable by the best sparse recovery algorithms  \em even if one knows the dictionary A exactly. Moreover, both the running time and sample complexity depend on \log 1/ε, where εis the target accuracy, and so our algorithms converge very quickly to the true dictionary. Our algorithm can also tolerate substantial amounts of noise provided it is incoherent with respect to the dictionary (e.g., Gaussian). In the noisy setting, our running time and sample complexity depend polynomially on 1/ε, and this is necessary. }
}

Endnote

%0 Conference Paper
%T New Algorithms for Learning Incoherent and Overcomplete Dictionaries 
%A Sanjeev Arora
%A Rong Ge
%A Ankur Moitra
%B Proceedings of The 27th Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2014
%E Maria Florina Balcan
%E Vitaly Feldman
%E Csaba Szepesvári	
%F pmlr-v35-arora14
%I PMLR
%P 779--806
%U https://proceedings.mlr.press/v35/arora14.html
%V 35
%X In \em sparse recovery we are given a matrix A ∈\mathbbR^n\times m (“the dictionary”) and a vector of the form A X where X is \em sparse, and the goal is to recover X. This is a central notion in signal processing, statistics and machine learning. But in applications such as \em sparse coding, edge detection, compression and super resolution, the dictionary A is unknown and has to be learned from random examples of the form Y = AX where X is drawn from an appropriate distribution — this is the \em dictionary learning problem. In most settings, A is \em overcomplete: it has more columns than rows. This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman et al. (2012) who who gave an algorithm for the undercomplete case, which is rarely the case in applications. Our algorithm applies to \em incoherent dictionaries which have been a central object of study since they were introduced in seminal work of Donoho and Huo (1999). In particular, a dictionary is μ-incoherent if each pair of columns has inner product at most μ/ \sqrtn. The algorithm makes natural stochastic assumptions about the unknown sparse vector X, which can contain k ≤c \min(\sqrtn/μ\log n, m^1/2 - η) non-zero entries (for any η> 0). This is close to the best k allowable by the best sparse recovery algorithms  \em even if one knows the dictionary A exactly. Moreover, both the running time and sample complexity depend on \log 1/ε, where εis the target accuracy, and so our algorithms converge very quickly to the true dictionary. Our algorithm can also tolerate substantial amounts of noise provided it is incoherent with respect to the dictionary (e.g., Gaussian). In the noisy setting, our running time and sample complexity depend polynomially on 1/ε, and this is necessary.

RIS


TY  - CPAPER
TI  - New Algorithms for Learning Incoherent and Overcomplete Dictionaries 
AU  - Sanjeev Arora
AU  - Rong Ge
AU  - Ankur Moitra
BT  - Proceedings of The 27th Conference on Learning Theory
DA  - 2014/05/29
ED  - Maria Florina Balcan
ED  - Vitaly Feldman
ED  - Csaba Szepesvári	
ID  - pmlr-v35-arora14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 35
SP  - 779
EP  - 806
L1  - http://proceedings.mlr.press/v35/arora14.pdf
UR  - https://proceedings.mlr.press/v35/arora14.html
AB  - In \em sparse recovery we are given a matrix A ∈\mathbbR^n\times m (“the dictionary”) and a vector of the form A X where X is \em sparse, and the goal is to recover X. This is a central notion in signal processing, statistics and machine learning. But in applications such as \em sparse coding, edge detection, compression and super resolution, the dictionary A is unknown and has to be learned from random examples of the form Y = AX where X is drawn from an appropriate distribution — this is the \em dictionary learning problem. In most settings, A is \em overcomplete: it has more columns than rows. This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman et al. (2012) who who gave an algorithm for the undercomplete case, which is rarely the case in applications. Our algorithm applies to \em incoherent dictionaries which have been a central object of study since they were introduced in seminal work of Donoho and Huo (1999). In particular, a dictionary is μ-incoherent if each pair of columns has inner product at most μ/ \sqrtn. The algorithm makes natural stochastic assumptions about the unknown sparse vector X, which can contain k ≤c \min(\sqrtn/μ\log n, m^1/2 - η) non-zero entries (for any η> 0). This is close to the best k allowable by the best sparse recovery algorithms  \em even if one knows the dictionary A exactly. Moreover, both the running time and sample complexity depend on \log 1/ε, where εis the target accuracy, and so our algorithms converge very quickly to the true dictionary. Our algorithm can also tolerate substantial amounts of noise provided it is incoherent with respect to the dictionary (e.g., Gaussian). In the noisy setting, our running time and sample complexity depend polynomially on 1/ε, and this is necessary. 
ER  -

APA


Arora, S., Ge, R. & Moitra, A.. (2014). New Algorithms for Learning Incoherent and Overcomplete Dictionaries . Proceedings of The 27th Conference on Learning Theory, in Proceedings of Machine Learning Research 35:779-806 Available from https://proceedings.mlr.press/v35/arora14.html.

Related Material

Download PDF