Efficient Statistical Tests: A Neural Tangent Kernel Approach

Sheng Jia, Ehsan Nezhadarya, Yuhuai Wu, Jimmy Ba
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:4893-4903, 2021.

Abstract

For machine learning models to make reliable predictions in deployment, one needs to ensure that previously unseen test samples are sufficiently similar to the training data. Commonly used shift-invariant kernels lack compositionality and fail to capture the invariances present in high-dimensional computer vision data. We propose an outlier detector and maximum mean discrepancy (MMD) two-sample tests based on a shift-invariant convolutional neural tangent kernel (SCNTK), which run in O(n) time in the number of samples thanks to a random feature approximation. On MNIST and CIFAR10 with various types of dataset shift, we empirically show that statistical tests with such compositional kernels, inherited from infinitely wide neural networks, achieve higher detection accuracy than existing non-parametric methods. Our method also provides a competitive alternative to adapted kernel methods that require a training phase.
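As a minimal illustrative sketch of why a random feature approximation yields an O(n) test (this is not the authors' SCNTK construction, which is not reproduced here), the snippet below estimates MMD^2 as the squared distance between mean embeddings of generic random Fourier features for a shift-invariant RBF kernel; the function names, the feature dimension D, and the bandwidth are hypothetical choices.

import numpy as np

def random_fourier_features(X, W, b):
    # Map samples X of shape (n, d) to D random Fourier features that
    # approximate a shift-invariant RBF kernel; W (d, D) and b (D,)
    # are drawn once and shared by both samples.
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def mmd2_random_features(X, Y, D=2048, bandwidth=1.0, seed=0):
    # Linear-time MMD^2 estimate: squared distance between the mean
    # feature embeddings of the two samples, O(n) in the sample size.
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / bandwidth, size=(X.shape[1], D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    mu_x = random_fourier_features(X, W, b).mean(axis=0)
    mu_y = random_fourier_features(Y, W, b).mean(axis=0)
    return float(np.sum((mu_x - mu_y) ** 2))

# Toy usage: a mean-shifted sample gives a larger MMD^2 than a sample
# drawn from the same distribution as X.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 32))
Y_same = rng.normal(size=(500, 32))
Y_shift = rng.normal(loc=0.5, size=(500, 32))
print(mmd2_random_features(X, Y_same), mmd2_random_features(X, Y_shift))

In practice one would swap the random Fourier features for the paper's SCNTK random features and calibrate the test threshold (e.g., by permutation), but the O(n) scaling comes from the same mean-embedding computation.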

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-jia21a,
  title     = {Efficient Statistical Tests: A Neural Tangent Kernel Approach},
  author    = {Jia, Sheng and Nezhadarya, Ehsan and Wu, Yuhuai and Ba, Jimmy},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {4893--4903},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/jia21a/jia21a.pdf},
  url       = {https://proceedings.mlr.press/v139/jia21a.html},
  abstract  = {For machine learning models to make reliable predictions in deployment, one needs to ensure the previously unknown test samples need to be sufficiently similar to the training data. The commonly used shift-invariant kernels do not have the compositionality and fail to capture invariances in high-dimensional data in computer vision. We propose a shift-invariant convolutional neural tangent kernel (SCNTK) based outlier detector and two-sample tests with maximum mean discrepancy (MMD) that is O(n) in the number of samples due to using the random feature approximation. On MNIST and CIFAR10 with various types of dataset shifts, we empirically show that statistical tests with such compositional kernels, inherited from infinitely wide neural networks, achieve higher detection accuracy than existing non-parametric methods. Our method also provides a competitive alternative to adapted kernel methods that require a training phase.}
}
Endnote
%0 Conference Paper
%T Efficient Statistical Tests: A Neural Tangent Kernel Approach
%A Sheng Jia
%A Ehsan Nezhadarya
%A Yuhuai Wu
%A Jimmy Ba
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-jia21a
%I PMLR
%P 4893--4903
%U https://proceedings.mlr.press/v139/jia21a.html
%V 139
%X For machine learning models to make reliable predictions in deployment, one needs to ensure the previously unknown test samples need to be sufficiently similar to the training data. The commonly used shift-invariant kernels do not have the compositionality and fail to capture invariances in high-dimensional data in computer vision. We propose a shift-invariant convolutional neural tangent kernel (SCNTK) based outlier detector and two-sample tests with maximum mean discrepancy (MMD) that is O(n) in the number of samples due to using the random feature approximation. On MNIST and CIFAR10 with various types of dataset shifts, we empirically show that statistical tests with such compositional kernels, inherited from infinitely wide neural networks, achieve higher detection accuracy than existing non-parametric methods. Our method also provides a competitive alternative to adapted kernel methods that require a training phase.
APA
Jia, S., Nezhadarya, E., Wu, Y., & Ba, J. (2021). Efficient Statistical Tests: A Neural Tangent Kernel Approach. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:4893-4903. Available from https://proceedings.mlr.press/v139/jia21a.html.
