Competitive Classification and Closeness Testing

Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, Shengjun Pan, Ananda Suresh
Proceedings of the 25th Annual Conference on Learning Theory, PMLR 23:22.1-22.18, 2012.

Abstract

We study the problems of classification and closeness testing. A classifier associates a test sequence with the one of two training sequences that was generated by the same distribution. A closeness test determines whether two sequences were generated by the same or by different distributions. For both problems, all natural algorithms are symmetric: they make the same decision under all symbol relabelings. With no assumptions on the distributions' support size or relative distance, we construct a classifier and a closeness test that require at most O(n^{3/2}) samples to attain the n-sample accuracy of the best symmetric classifier or closeness test designed with knowledge of the underlying distributions. Both algorithms run in time linear in the number of samples. Conversely, we also show that for any classifier or closeness test, there are distributions that require Ω(n^{7/6}) samples to achieve the n-sample accuracy of the best symmetric algorithm that knows the underlying distributions.
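To make the notion of a symmetric (relabeling-invariant) algorithm concrete, below is a minimal sketch in Python. It is purely illustrative and is not the paper's competitive construction: the statistic is in the spirit of the count-based closeness statistic of Batu et al., and the function names, the threshold parameter, and the toy distributions are assumptions made for the example.

# Illustrative sketch only: a simple symmetric (relabeling-invariant)
# closeness statistic and the classifier it induces. This is NOT the
# competitive algorithm of Acharya et al.; it only shows that a decision
# depending solely on symbol counts is unchanged by relabeling the alphabet.
import random
from collections import Counter

def closeness_statistic(x, y):
    """Sum over symbols a of (X_a - Y_a)^2 - X_a - Y_a, where X_a and Y_a are
    the counts of a in x and y. The value depends only on the counts, so it
    is invariant under any relabeling of the symbols."""
    cx, cy = Counter(x), Counter(y)
    return sum((cx[a] - cy[a]) ** 2 - cx[a] - cy[a] for a in set(cx) | set(cy))

def closeness_test(x, y, threshold):
    """Declare 'same distribution' when the statistic falls below a threshold.
    Picking the threshold normally requires assumptions on support size or
    distance; avoiding exactly that tuning is the point of the paper."""
    return closeness_statistic(x, y) <= threshold

def classify(test, train1, train2):
    """Associate the test sequence with the training sequence whose counts it
    matches more closely (smaller statistic). Returns 1 or 2."""
    s1 = closeness_statistic(test, train1)
    s2 = closeness_statistic(test, train2)
    return 1 if s1 <= s2 else 2

if __name__ == "__main__":
    random.seed(0)
    p = ["a"] * 9 + ["b"]      # distribution heavily biased toward "a"
    q = ["a", "b"]             # uniform over {"a", "b"}
    train1 = random.choices(p, k=2000)
    train2 = random.choices(q, k=2000)
    test = random.choices(p, k=2000)
    print(classify(test, train1, train2))  # typically prints 1

Because both the statistic and the induced decisions use only the multiset of per-symbol count pairs, permuting the symbol labels of all three sequences leaves the output unchanged, which is the symmetry property the abstract refers to.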

Cite this Paper


BibTeX
@InProceedings{pmlr-v23-acharya12,
  title     = {Competitive Classification and Closeness Testing},
  author    = {Acharya, Jayadev and Das, Hirakendu and Jafarpour, Ashkan and Orlitsky, Alon and Pan, Shengjun and Suresh, Ananda},
  booktitle = {Proceedings of the 25th Annual Conference on Learning Theory},
  pages     = {22.1--22.18},
  year      = {2012},
  editor    = {Mannor, Shie and Srebro, Nathan and Williamson, Robert C.},
  volume    = {23},
  series    = {Proceedings of Machine Learning Research},
  address   = {Edinburgh, Scotland},
  month     = {25--27 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v23/acharya12/acharya12.pdf},
  url       = {https://proceedings.mlr.press/v23/acharya12.html}
}
APA
Acharya, J., Das, H., Jafarpour, A., Orlitsky, A., Pan, S. & Suresh, A. (2012). Competitive Classification and Closeness Testing. Proceedings of the 25th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 23:22.1-22.18. Available from https://proceedings.mlr.press/v23/acharya12.html.
