Analyzing and Improving Representations with the Soft Nearest Neighbor Loss

Nicholas Frosst, Nicolas Papernot, Geoffrey Hinton
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2012-2020, 2019.

Abstract

We explore and expand the Soft Nearest Neighbor Loss to measure the entanglement of class manifolds in representation space: i.e., how close pairs of points from the same class are relative to pairs of points from different classes. We demonstrate several use cases of the loss. As an analytical tool, it provides insights into the evolution of class similarity structures during learning. Surprisingly, we find that maximizing the entanglement of representations of different classes in the hidden layers is beneficial for discrimination in the final layer, possibly because it encourages representations to identify class-independent similarity structures. Maximizing the soft nearest neighbor loss in the hidden layers leads not only to better-calibrated estimates of uncertainty on outlier data but also marginally improved generalization. Data that is not from the training distribution can be recognized by observing that in the hidden layers, it has fewer than the normal number of neighbors from the predicted class.
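For readers who want to compute the entanglement measure directly, the sketch below implements the soft nearest neighbor loss for a batch of representations in NumPy. It follows the standard formulation (for each point, the log of the fraction of its neighbor mass that comes from same-class points, averaged over the batch and negated); the temperature argument and the small epsilon added for numerical stability are implementation choices for illustration, not details taken from this page.

import numpy as np

def soft_nearest_neighbor_loss(x, y, temperature=1.0):
    """Soft nearest neighbor loss of a batch of representations x with labels y.

    x: (b, d) array, one row per example (e.g. activations of a hidden layer).
    y: (b,) array of integer class labels.
    temperature: scales the pairwise distances; it can be fixed, annealed,
        or optimized alongside the network weights.
    """
    # Pairwise squared Euclidean distances between all points in the batch.
    dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    # Gaussian-kernel similarities; a point is never its own neighbor.
    sims = np.exp(-dists / temperature)
    np.fill_diagonal(sims, 0.0)
    # Indicator of pairs that share a class label.
    same_class = (y[:, None] == y[None, :]).astype(x.dtype)
    eps = 1e-12  # numerical safety for points with no nearby neighbors
    # Fraction of each point's neighbor mass contributed by its own class.
    same_class_mass = np.sum(sims * same_class, axis=1)
    total_mass = np.sum(sims, axis=1)
    # A small ratio (large loss) means the class manifolds are entangled.
    return -np.mean(np.log(same_class_mass / (total_mass + eps) + eps))

As the abstract describes, this quantity can be used either as an analytical probe of a trained network's hidden layers or, with its sign flipped and scaled, as a regularizer added to the cross-entropy objective so that hidden-layer entanglement is maximized during training.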

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-frosst19a,
  title     = {Analyzing and Improving Representations with the Soft Nearest Neighbor Loss},
  author    = {Frosst, Nicholas and Papernot, Nicolas and Hinton, Geoffrey},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {2012--2020},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/frosst19a/frosst19a.pdf},
  url       = {https://proceedings.mlr.press/v97/frosst19a.html}
}
Endnote
%0 Conference Paper
%T Analyzing and Improving Representations with the Soft Nearest Neighbor Loss
%A Nicholas Frosst
%A Nicolas Papernot
%A Geoffrey Hinton
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-frosst19a
%I PMLR
%P 2012--2020
%U https://proceedings.mlr.press/v97/frosst19a.html
%V 97
APA
Frosst, N., Papernot, N. & Hinton, G. (2019). Analyzing and Improving Representations with the Soft Nearest Neighbor Loss. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:2012-2020. Available from https://proceedings.mlr.press/v97/frosst19a.html.