Fast DPP Sampling for Nystrom with Application to Kernel Methods

Chengtao Li; Stefanie Jegelka; Suvrit Sra

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2061-2070, 2016.

Abstract

The Nystrom method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nystrom using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarantee bounds on approximation errors; subsequently, we analyze implications for kernel ridge regression. Contrary to prior reservations due to cubic complexity of DPP sampling, we show that (under certain conditions) Markov chain DPP sampling requires only linear time in the size of the data. We present several empirical results that support our theoretical analysis, and demonstrate the superior performance of DPP-based landmark selection compared with existing approaches.
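As a rough illustration of the Nystrom approximation the abstract refers to, the sketch below builds a low-rank approximation of a kernel matrix from a set of landmark indices. It is only a minimal sketch under stated assumptions: the RBF kernel, the synthetic data, the helper names `rbf_kernel` and `nystrom`, and the uniform landmark selection are all placeholders chosen for illustration. In particular, it does not implement the DPP-based landmark sampling studied in the paper.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Dense RBF kernel matrix; the kernel choice is an assumption for this sketch."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def nystrom(K, landmarks):
    """Nystrom approximation K ~ C W^+ C^T built from the given landmark indices."""
    C = K[:, landmarks]                      # columns of K at the landmarks
    W = K[np.ix_(landmarks, landmarks)]      # landmark-by-landmark block
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                # synthetic data, illustration only
K = rbf_kernel(X, gamma=0.5)

# Placeholder: uniform sampling without replacement, NOT a DPP sample.
landmarks = rng.choice(K.shape[0], size=20, replace=False)

K_hat = nystrom(K, landmarks)
rel_err = np.linalg.norm(K - K_hat, "fro") / np.linalg.norm(K, "fro")
print(f"relative Frobenius error: {rel_err:.3f}")
```

Replacing the uniform `landmarks` line with indices drawn by another selection scheme is the only change needed to compare landmark strategies on the same kernel matrix.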

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-lih16,
  title = 	 {Fast DPP Sampling for Nystrom with Application to Kernel Methods},
  author = 	 {Li, Chengtao and Jegelka, Stefanie and Sra, Suvrit},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {2061--2070},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/lih16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/lih16.html},
  abstract = 	 {The Nystrom method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nystrom using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarantee bounds on approximation errors; subsequently, we analyze implications for kernel ridge regression. Contrary to prior reservations due to cubic complexity of DPP sampling, we show that (under certain conditions) Markov chain DPP sampling requires only linear time in the size of the data. We present several empirical results that support our theoretical analysis, and demonstrate the superior performance of DPP-based landmark selection compared with existing approaches.}
}

Endnote

%0 Conference Paper
%T Fast DPP Sampling for Nystrom with Application to Kernel Methods
%A Chengtao Li
%A Stefanie Jegelka
%A Suvrit Sra
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-lih16
%I PMLR
%P 2061--2070
%U https://proceedings.mlr.press/v48/lih16.html
%V 48
%X The Nystrom method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nystrom using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarantee bounds on approximation errors; subsequently, we analyze implications for kernel ridge regression. Contrary to prior reservations due to cubic complexity of DPP sampling, we show that (under certain conditions) Markov chain DPP sampling requires only linear time in the size of the data. We present several empirical results that support our theoretical analysis, and demonstrate the superior performance of DPP-based landmark selection compared with existing approaches.

RIS


TY  - CPAPER
TI  - Fast DPP Sampling for Nystrom with Application to Kernel Methods
AU  - Chengtao Li
AU  - Stefanie Jegelka
AU  - Suvrit Sra
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-lih16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 2061
EP  - 2070
L1  - http://proceedings.mlr.press/v48/lih16.pdf
UR  - https://proceedings.mlr.press/v48/lih16.html
AB  - The Nystrom method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nystrom using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarantee bounds on approximation errors; subsequently, we analyze implications for kernel ridge regression. Contrary to prior reservations due to cubic complexity of DPP sampling, we show that (under certain conditions) Markov chain DPP sampling requires only linear time in the size of the data. We present several empirical results that support our theoretical analysis, and demonstrate the superior performance of DPP-based landmark selection compared with existing approaches.
ER  -

APA


Li, C., Jegelka, S. & Sra, S. (2016). Fast DPP Sampling for Nystrom with Application to Kernel Methods. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:2061-2070. Available from https://proceedings.mlr.press/v48/lih16.html.
