Two-Sample Test with Kernel Projected Wasserstein Distance

Jie Wang, Rui Gao, Yao Xie
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:8022-8055, 2022.

Abstract

We develop a kernel projected Wasserstein distance for the two-sample test, an essential building block in statistics and machine learning: given two sets of samples, determine whether they are drawn from the same distribution. This method operates by finding the nonlinear mapping in the data space which maximizes the distance between projected distributions. In contrast to existing work on the projected Wasserstein distance, the proposed method circumvents the curse of dimensionality more efficiently. We present practical algorithms for computing this distance function together with non-asymptotic uncertainty quantification of empirical estimates. Numerical examples validate our theoretical results and demonstrate good performance of the proposed method.
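The core idea described above — map samples through a nonlinear projection and compare the resulting one-dimensional distributions — can be illustrated with a simplified sketch. Note this is not the authors' algorithm: the paper optimizes the kernel projection, while the sketch below searches over random Fourier features as a crude stand-in; the function names and the bandwidth parameter `gamma` are hypothetical choices for illustration.

```python
# Illustrative sketch of a projected Wasserstein distance between two sample
# sets. Assumption: we replace the paper's optimized kernel projection with a
# search over random Fourier features (a crude surrogate, not the method itself).
import numpy as np

def wasserstein_1d(u, v):
    """Exact 1-D Wasserstein-1 distance between equal-size empirical samples."""
    return np.mean(np.abs(np.sort(u) - np.sort(v)))

def projected_wasserstein(X, Y, n_projections=200, gamma=0.1, seed=0):
    """Max, over random nonlinear 1-D projections, of the Wasserstein distance
    between the projected samples of X and Y."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    best = 0.0
    for _ in range(n_projections):
        w = rng.normal(scale=np.sqrt(2 * gamma), size=d)  # random Fourier direction
        b = rng.uniform(0, 2 * np.pi)                     # random phase
        u = np.cos(X @ w + b)  # nonlinear 1-D feature of each sample in X
        v = np.cos(Y @ w + b)  # same feature applied to Y
        best = max(best, wasserstein_1d(u, v))
    return best

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
Y = rng.normal(loc=0.5, size=(500, 5))  # mean-shifted distribution
Z = rng.normal(size=(500, 5))           # same distribution as X
print(f"shifted: {projected_wasserstein(X, Y):.3f}  "
      f"same: {projected_wasserstein(X, Z):.3f}")
```

A two-sample test built on such a statistic would compare the distance for the observed samples against a permutation-based null threshold; the maximizing projection found here is only as good as the random search, which is why the paper's optimized mapping matters.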

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-wang22f,
  title     = {Two-Sample Test with Kernel Projected Wasserstein Distance},
  author    = {Wang, Jie and Gao, Rui and Xie, Yao},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {8022--8055},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/wang22f/wang22f.pdf},
  url       = {https://proceedings.mlr.press/v151/wang22f.html},
  abstract  = {We develop a kernel projected Wasserstein distance for the two-sample test, an essential building block in statistics and machine learning: given two sets of samples, to determine whether they are from the same distribution. This method operates by finding the nonlinear mapping in the data space which maximizes the distance between projected distributions. In contrast to existing works about projected Wasserstein distance, the proposed method circumvents the curse of dimensionality more efficiently. We present practical algorithms for computing this distance function together with the non-asymptotic uncertainty quantification of empirical estimates. Numerical examples validate our theoretical results and demonstrate good performance of the proposed method.}
}
Endnote
%0 Conference Paper
%T Two-Sample Test with Kernel Projected Wasserstein Distance
%A Jie Wang
%A Rui Gao
%A Yao Xie
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-wang22f
%I PMLR
%P 8022--8055
%U https://proceedings.mlr.press/v151/wang22f.html
%V 151
%X We develop a kernel projected Wasserstein distance for the two-sample test, an essential building block in statistics and machine learning: given two sets of samples, to determine whether they are from the same distribution. This method operates by finding the nonlinear mapping in the data space which maximizes the distance between projected distributions. In contrast to existing works about projected Wasserstein distance, the proposed method circumvents the curse of dimensionality more efficiently. We present practical algorithms for computing this distance function together with the non-asymptotic uncertainty quantification of empirical estimates. Numerical examples validate our theoretical results and demonstrate good performance of the proposed method.
APA
Wang, J., Gao, R. & Xie, Y. (2022). Two-Sample Test with Kernel Projected Wasserstein Distance. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:8022-8055. Available from https://proceedings.mlr.press/v151/wang22f.html.