A Sub-Quadratic Exact Medoid Algorithm

James Newling, Francois Fleuret
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:185-193, 2017.

Abstract

We present a new algorithm, ‘trimed’ for obtaining the medoid of a set, that is the element of the set which minimises the mean distance to all other elements. The algorithm is shown to have, under certain assumptions, expected run time $O(N^(3/2))$ in $R^d$ where N is the set size, making it the first sub-quadratic exact medoid algorithm for $d > 1$. Experiments show that it performs very well on spatial network data, frequently requiring two orders of magnitude fewer distance calculations than state-of-the-art approximate algorithms. As an application, we show how trimed can be used as a component in an accelerated K-medoids algorithm, and then how it can be relaxed to obtain further computational gains with only a minor loss in cluster quality.

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-newling17a, title = {{A Sub-Quadratic Exact Medoid Algorithm}}, author = {Newling, James and Fleuret, Francois}, booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics}, pages = {185--193}, year = {2017}, editor = {Singh, Aarti and Zhu, Jerry}, volume = {54}, series = {Proceedings of Machine Learning Research}, month = {20--22 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v54/newling17a/newling17a.pdf}, url = {https://proceedings.mlr.press/v54/newling17a.html}, abstract = {We present a new algorithm, ‘trimed’ for obtaining the medoid of a set, that is the element of the set which minimises the mean distance to all other elements. The algorithm is shown to have, under certain assumptions, expected run time $O(N^(3/2))$ in $R^d$ where N is the set size, making it the first sub-quadratic exact medoid algorithm for $d > 1$. Experiments show that it performs very well on spatial network data, frequently requiring two orders of magnitude fewer distance calculations than state-of-the-art approximate algorithms. As an application, we show how trimed can be used as a component in an accelerated K-medoids algorithm, and then how it can be relaxed to obtain further computational gains with only a minor loss in cluster quality. } }
Endnote
%0 Conference Paper %T A Sub-Quadratic Exact Medoid Algorithm %A James Newling %A Francois Fleuret %B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2017 %E Aarti Singh %E Jerry Zhu %F pmlr-v54-newling17a %I PMLR %P 185--193 %U https://proceedings.mlr.press/v54/newling17a.html %V 54 %X We present a new algorithm, ‘trimed’ for obtaining the medoid of a set, that is the element of the set which minimises the mean distance to all other elements. The algorithm is shown to have, under certain assumptions, expected run time $O(N^(3/2))$ in $R^d$ where N is the set size, making it the first sub-quadratic exact medoid algorithm for $d > 1$. Experiments show that it performs very well on spatial network data, frequently requiring two orders of magnitude fewer distance calculations than state-of-the-art approximate algorithms. As an application, we show how trimed can be used as a component in an accelerated K-medoids algorithm, and then how it can be relaxed to obtain further computational gains with only a minor loss in cluster quality.
APA
Newling, J. & Fleuret, F.. (2017). A Sub-Quadratic Exact Medoid Algorithm. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:185-193 Available from https://proceedings.mlr.press/v54/newling17a.html.

Related Material