Generalized Aitchison Embeddings for Histograms

Tam Le; Marco Cuturi

Proceedings of the 5th Asian Conference on Machine Learning, PMLR 29:293-308, 2013.

Abstract

Learning distances that are specifically designed to compare histograms in the probability simplex has recently attracted the attention of the community. Learning such distances is important because most machine learning problems involve bags of features rather than simple vectors. Ample empirical evidence suggests that the Euclidean distance in general and Mahalanobis metric learning in particular may not be suitable to quantify distances between points in the simplex. We propose in this paper a new contribution to address this problem by generalizing a family of embeddings proposed by Aitchison (1982) to map the probability simplex onto a suitable Euclidean space. We provide algorithms to estimate the parameters of such maps, and show that these algorithms lead to representations that outperform alternative approaches to compare histograms.
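As a point of reference for the kind of map the abstract mentions, the sketch below shows the classical centered log-ratio (clr) transform from Aitchison's compositional data analysis, followed by a plain Euclidean distance in the embedded space. It is only an illustration of the fixed, non-learned baseline idea, not the generalized, parameterized embedding that the paper estimates. The function names and the `pseudo_count` smoothing value are assumptions introduced here.

```python
import math

def clr(histogram, pseudo_count=1e-6):
    """Centered log-ratio map of a histogram (classical Aitchison transform).

    pseudo_count is an illustrative assumption: it keeps log() finite
    when a bin is empty. It is not a value taken from the paper.
    """
    smoothed = [h + pseudo_count for h in histogram]
    total = sum(smoothed)
    probs = [s / total for s in smoothed]          # renormalize onto the simplex
    logs = [math.log(p) for p in probs]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]    # coordinates sum to zero

def euclidean(u, v):
    """Ordinary Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two hypothetical 3-bin histograms.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]

# Distance measured after embedding, rather than directly in the simplex.
d = euclidean(clr(p), clr(q))
```

Comparing histograms after a log-ratio map makes the distance sensitive to ratios between bins rather than to absolute differences, which is one common motivation for leaving the raw simplex.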

Cite this Paper

BibTeX

@InProceedings{pmlr-v29-Le13,
  title = 	 {Generalized Aitchison Embeddings for Histograms},
  author = 	 {Le, Tam and Cuturi, Marco},
  booktitle = 	 {Proceedings of the 5th Asian Conference on Machine Learning},
  pages = 	 {293--308},
  year = 	 {2013},
  editor = 	 {Ong, Cheng Soon and Ho, Tu Bao},
  volume = 	 {29},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Australian National University, Canberra, Australia},
  month = 	 {13--15 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v29/Le13.pdf},
  url = 	 {https://proceedings.mlr.press/v29/Le13.html},
  abstract = 	 {Learning distances that are specifically designed to compare histograms in the probability simplex has recently attracted the attention of the community. Learning such distances is important because most machine learning problems involve bags of features rather than simple vectors. Ample empirical evidence suggests that the Euclidean distance in general and Mahalanobis metric learning in particular may not be suitable to quantify distances between points in the simplex. We propose in this paper a new contribution to address this problem by generalizing a family of embeddings proposed by Aitchison (1982) to map the probability simplex onto a suitable Euclidean space. We provide algorithms to estimate the parameters of such maps, and show that these algorithms lead to representations that outperform alternative approaches to compare histograms.}
}

Endnote

%0 Conference Paper
%T Generalized Aitchison Embeddings for Histograms
%A Tam Le
%A Marco Cuturi
%B Proceedings of the 5th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Cheng Soon Ong
%E Tu Bao Ho
%F pmlr-v29-Le13
%I PMLR
%P 293--308
%U https://proceedings.mlr.press/v29/Le13.html
%V 29
%X Learning distances that are specifically designed to compare histograms in the probability simplex has recently attracted the attention of the community. Learning such distances is important because most machine learning problems involve bags of features rather than simple vectors. Ample empirical evidence suggests that the Euclidean distance in general and Mahalanobis metric learning in particular may not be suitable to quantify distances between points in the simplex. We propose in this paper a new contribution to address this problem by generalizing a family of embeddings proposed by Aitchison (1982) to map the probability simplex onto a suitable Euclidean space. We provide algorithms to estimate the parameters of such maps, and show that these algorithms lead to representations that outperform alternative approaches to compare histograms.

RIS

TY  - CPAPER
TI  - Generalized Aitchison Embeddings for Histograms
AU  - Tam Le
AU  - Marco Cuturi
BT  - Proceedings of the 5th Asian Conference on Machine Learning
DA  - 2013/10/21
ED  - Cheng Soon Ong
ED  - Tu Bao Ho
ID  - pmlr-v29-Le13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 29
SP  - 293
EP  - 308
L1  - http://proceedings.mlr.press/v29/Le13.pdf
UR  - https://proceedings.mlr.press/v29/Le13.html
AB  - Learning distances that are specifically designed to compare histograms in the probability simplex has recently attracted the attention of the community. Learning such distances is important because most machine learning problems involve bags of features rather than simple vectors. Ample empirical evidence suggests that the Euclidean distance in general and Mahalanobis metric learning in particular may not be suitable to quantify distances between points in the simplex. We propose in this paper a new contribution to address this problem by generalizing a family of embeddings proposed by Aitchison (1982) to map the probability simplex onto a suitable Euclidean space. We provide algorithms to estimate the parameters of such maps, and show that these algorithms lead to representations that outperform alternative approaches to compare histograms.
ER  -

APA

Le, T. & Cuturi, M. (2013). Generalized Aitchison Embeddings for Histograms. Proceedings of the 5th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 29:293-308. Available from https://proceedings.mlr.press/v29/Le13.html.
