Improved Bound on Generalization Error of Compressed KNN Estimator
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:7578-7593, 2023.
Abstract
This paper studies the generalization capability of the compressed $k$-nearest neighbor (KNN) estimator, in which randomly projected low-dimensional data, rather than the high-dimensional raw data, are fed into the KNN estimator. Considering both regression and classification, we give improved bounds on its generalization errors, specifically the $\ell_2$ error for regression and the misclassification rate for classification. As a byproduct of our analysis, we prove that ordered distances are almost preserved under random projections, which we believe is the first result of this kind. In addition, we provide numerical experiments on various public datasets to verify our theorems.
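For intuition, the snippet below is a minimal sketch (not the authors' implementation) of a compressed KNN regressor: the data are first mapped to a lower dimension by a random Gaussian projection, and a standard KNN model is then fit on the projected features. The projection dimension, $k$, and the synthetic data are illustrative choices only.

```python
# Sketch of a compressed k-NN estimator: random Gaussian projection
# followed by ordinary k-NN regression on the projected features.
# Illustrative only; dimensions, k, and data are arbitrary choices.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Synthetic high-dimensional regression data (n samples, d features),
# compressed to m dimensions; k nearest neighbors.
n, d, m, k = 500, 200, 20, 5
X = rng.normal(size=(n, d))
y = X[:, 0] + 0.1 * rng.normal(size=n)

# Random projection with i.i.d. N(0, 1/m) entries, so pairwise distances
# are approximately preserved (Johnson-Lindenstrauss type behavior).
R = rng.normal(scale=1.0 / np.sqrt(m), size=(d, m))

X_train, X_test = X[:400] @ R, X[400:] @ R   # project both splits
y_train, y_test = y[:400], y[400:]

knn = KNeighborsRegressor(n_neighbors=k).fit(X_train, y_train)
mse = np.mean((knn.predict(X_test) - y_test) ** 2)  # l2 generalization error
print(f"compressed k-NN test MSE: {mse:.4f}")
```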