Improved Bound on Generalization Error of Compressed KNN Estimator

Hang Zhang, Ping Li
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:7578-7593, 2023.

Abstract

This paper studies the generalization capability of the compressed $k$-nearest neighbor (KNN) estimator, in which randomly projected low-dimensional data are fed into the KNN estimator in place of the high-dimensional raw data. Considering both regression and classification, we give improved bounds on its generalization error, namely, the $\ell_2$ error for regression and the misclassification rate for classification. As a byproduct of our analysis, we prove that ordered distances are almost preserved under random projection, which we believe is the first result of its kind. In addition, we provide numerical experiments on various public datasets to verify our theorems.
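
For concreteness, the estimator described above can be sketched in a few lines: a dense Gaussian random projection compresses the data from d to m dimensions, and an off-the-shelf KNN regressor is then fit on the projected points. This is a minimal illustrative sketch, not the authors' code; the toy dataset, the dimensions, and the 1/sqrt(m) scaling are assumptions, the scaling chosen so that squared distances are preserved in expectation.

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(0)

    # Toy high-dimensional regression data (a hypothetical stand-in for a real dataset).
    n, d, m = 500, 1000, 50            # samples, ambient dimension, projected dimension
    X = rng.standard_normal((n, d))
    y = X[:, :5].sum(axis=1) + 0.1 * rng.standard_normal(n)

    # Dense Gaussian random projection, scaled by 1/sqrt(m) so that
    # E[||x R - x' R||^2] = ||x - x'||^2 (Johnson-Lindenstrauss-style).
    R = rng.standard_normal((d, m)) / np.sqrt(m)
    X_proj = X @ R

    # Fit KNN on the compressed data instead of the raw high-dimensional data.
    knn = KNeighborsRegressor(n_neighbors=5).fit(X_proj, y)

    # Queries are projected with the same matrix before prediction.
    x_new = rng.standard_normal((1, d))
    y_hat = knn.predict(x_new @ R)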

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-zhang23j,
  title     = {Improved Bound on Generalization Error of Compressed KNN Estimator},
  author    = {Zhang, Hang and Li, Ping},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {7578--7593},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/zhang23j/zhang23j.pdf},
  url       = {https://proceedings.mlr.press/v206/zhang23j.html},
  abstract  = {This paper studies the generalization capability of the compressed $k$-nearest neighbor (KNN) estimator, in which randomly projected low-dimensional data are fed into the KNN estimator in place of the high-dimensional raw data. Considering both regression and classification, we give improved bounds on its generalization error, namely, the $\ell_2$ error for regression and the misclassification rate for classification. As a byproduct of our analysis, we prove that ordered distances are almost preserved under random projection, which we believe is the first result of its kind. In addition, we provide numerical experiments on various public datasets to verify our theorems.}
}
APA
Zhang, H. & Li, P. (2023). Improved Bound on Generalization Error of Compressed KNN Estimator. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:7578-7593. Available from https://proceedings.mlr.press/v206/zhang23j.html.
