Generative Modeling for Maximizing Precision and Recall in Information Visualization

Jaakko Peltonen, Samuel Kaski
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:579-587, 2011.

Abstract

Information visualization has recently been formulated as an information retrieval problem, where the goal is to find similar data points based on the visualized nonlinear projection, and the visualization is optimized to maximize a compromise between (smoothed) precision and recall. We turn the visualization into a generative modeling task where a simple user model parameterized by the data coordinates is optimized, neighborhood relations are the observed data, and straightforward maximum likelihood estimation corresponds to Stochastic Neighbor Embedding (SNE). While SNE maximizes pure recall, adding a mixture component that “explains away” misses allows our generative model to focus on maximizing precision as well. The resulting model is a generative solution to maximizing tradeoffs between precision and recall. The model outperforms earlier models in terms of precision and recall and in external validation by unsupervised classification.

Cite this Paper


BibTeX
@InProceedings{pmlr-v15-peltonen11a, title = {Generative Modeling for Maximizing Precision and Recall in Information Visualization}, author = {Peltonen, Jaakko and Kaski, Samuel}, booktitle = {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics}, pages = {579--587}, year = {2011}, editor = {Gordon, Geoffrey and Dunson, David and Dudík, Miroslav}, volume = {15}, series = {Proceedings of Machine Learning Research}, address = {Fort Lauderdale, FL, USA}, month = {11--13 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v15/peltonen11a/peltonen11a.pdf}, url = {https://proceedings.mlr.press/v15/peltonen11a.html}, abstract = {Information visualization has recently been formulated as an information retrieval problem, where the goal is to find similar data points based on the visualized nonlinear projection, and the visualization is optimized to maximize a compromise between (smoothed) precision and recall. We turn the visualization into a generative modeling task where a simple user model parameterized by the data coordinates is optimized, neighborhood relations are the observed data, and straightforward maximum likelihood estimation corresponds to Stochastic Neighbor Embedding (SNE). While SNE maximizes pure recall, adding a mixture component that “explains away” misses allows our generative model to focus on maximizing precision as well. The resulting model is a generative solution to maximizing tradeoffs between precision and recall. The model outperforms earlier models in terms of precision and recall and in external validation by unsupervised classification.} }
Endnote
%0 Conference Paper %T Generative Modeling for Maximizing Precision and Recall in Information Visualization %A Jaakko Peltonen %A Samuel Kaski %B Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2011 %E Geoffrey Gordon %E David Dunson %E Miroslav Dudík %F pmlr-v15-peltonen11a %I PMLR %P 579--587 %U https://proceedings.mlr.press/v15/peltonen11a.html %V 15 %X Information visualization has recently been formulated as an information retrieval problem, where the goal is to find similar data points based on the visualized nonlinear projection, and the visualization is optimized to maximize a compromise between (smoothed) precision and recall. We turn the visualization into a generative modeling task where a simple user model parameterized by the data coordinates is optimized, neighborhood relations are the observed data, and straightforward maximum likelihood estimation corresponds to Stochastic Neighbor Embedding (SNE). While SNE maximizes pure recall, adding a mixture component that “explains away” misses allows our generative model to focus on maximizing precision as well. The resulting model is a generative solution to maximizing tradeoffs between precision and recall. The model outperforms earlier models in terms of precision and recall and in external validation by unsupervised classification.
RIS
TY - CPAPER TI - Generative Modeling for Maximizing Precision and Recall in Information Visualization AU - Jaakko Peltonen AU - Samuel Kaski BT - Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics DA - 2011/06/14 ED - Geoffrey Gordon ED - David Dunson ED - Miroslav Dudík ID - pmlr-v15-peltonen11a PB - PMLR DP - Proceedings of Machine Learning Research VL - 15 SP - 579 EP - 587 L1 - http://proceedings.mlr.press/v15/peltonen11a/peltonen11a.pdf UR - https://proceedings.mlr.press/v15/peltonen11a.html AB - Information visualization has recently been formulated as an information retrieval problem, where the goal is to find similar data points based on the visualized nonlinear projection, and the visualization is optimized to maximize a compromise between (smoothed) precision and recall. We turn the visualization into a generative modeling task where a simple user model parameterized by the data coordinates is optimized, neighborhood relations are the observed data, and straightforward maximum likelihood estimation corresponds to Stochastic Neighbor Embedding (SNE). While SNE maximizes pure recall, adding a mixture component that “explains away” misses allows our generative model to focus on maximizing precision as well. The resulting model is a generative solution to maximizing tradeoffs between precision and recall. The model outperforms earlier models in terms of precision and recall and in external validation by unsupervised classification. ER -
APA
Peltonen, J. & Kaski, S.. (2011). Generative Modeling for Maximizing Precision and Recall in Information Visualization. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 15:579-587 Available from https://proceedings.mlr.press/v15/peltonen11a.html.

Related Material