Efficient PAC Learning from the Crowd with Pairwise Comparisons

Shiwei Zeng, Jie Shen
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:25973-25993, 2022.

Abstract

We study crowdsourced PAC learning of threshold functions, where the labels are gathered from a pool of annotators, some of whom may behave adversarially. This is a challenging problem, and only recently have computationally and query-efficient PAC learning algorithms been established by Awasthi et al. (2017). In this paper, we show that by leveraging the more easily acquired pairwise comparison queries, it is possible to exponentially reduce the label complexity while retaining the overall query complexity and runtime. Our main algorithmic contributions are a comparison-equipped labeling scheme that can faithfully recover the true labels of a small set of instances, and a label-efficient filtering process that, in conjunction with the small labeled set, can reliably infer the true labels of a large instance set.
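
To give some intuition for why comparison queries can cut the number of label queries so sharply, the sketch below handles a deliberately simplified case. It is not the algorithm from the paper: it assumes a one-dimensional threshold, distinct instance values, and noise-free oracles, whereas the paper deals with annotators who may be adversarial. The names `label_with_comparisons`, `compare`, `query_label`, and `demo` are hypothetical and chosen only for illustration. The idea is to sort the instances using comparison queries alone, then binary-search the sorted order for the threshold using label queries.

```python
from functools import cmp_to_key


def label_with_comparisons(instances, compare, query_label):
    """Label all instances of a 1-D threshold concept with few label queries.

    Assumptions (illustrative only): compare(a, b) is a noise-free comparison
    oracle returning a negative, zero, or positive number, and query_label(x)
    is a noise-free label oracle returning -1 or +1. Instances must be
    distinct and hashable.
    """
    # Sorting uses comparison queries only, on the order of n log n of them.
    ordered = sorted(instances, key=cmp_to_key(compare))

    # Binary search for the first positive instance: about log2(n) label queries.
    lo, hi = 0, len(ordered)
    while lo < hi:
        mid = (lo + hi) // 2
        if query_label(ordered[mid]) > 0:
            hi = mid
        else:
            lo = mid + 1

    # Instances before index lo are negative; the rest are positive.
    return {x: (-1 if i < lo else 1) for i, x in enumerate(ordered)}


def demo(n=1024, threshold=0.37):
    """Run the sketch with counting oracles and return (labels, counts)."""
    counts = {"compare": 0, "label": 0}

    def compare(a, b):
        counts["compare"] += 1
        return (a > b) - (a < b)

    def query_label(x):
        counts["label"] += 1
        return 1 if x >= threshold else -1

    # Distinct values in a scrambled order (7919 is coprime to 1024).
    instances = [((i * 7919) % n) / n for i in range(n)]
    return label_with_comparisons(instances, compare, query_label), counts
```

Calling `demo()` labels all 1024 instances while issuing only about a dozen label queries; the bulk of the work is shifted to comparison queries. With unreliable annotators, neither the sort nor the binary search can be trusted as written, which is the setting the paper's labeling scheme and filtering process are designed for.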

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-zeng22b,
  title = {Efficient {PAC} Learning from the Crowd with Pairwise Comparisons},
  author = {Zeng, Shiwei and Shen, Jie},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {25973--25993},
  year = {2022},
  editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {17--23 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v162/zeng22b/zeng22b.pdf},
  url = {https://proceedings.mlr.press/v162/zeng22b.html},
  abstract = {We study crowdsourced PAC learning of threshold functions, where the labels are gathered from a pool of annotators, some of whom may behave adversarially. This is a challenging problem, and only recently have computationally and query-efficient PAC learning algorithms been established by Awasthi et al. (2017). In this paper, we show that by leveraging the more easily acquired pairwise comparison queries, it is possible to exponentially reduce the label complexity while retaining the overall query complexity and runtime. Our main algorithmic contributions are a comparison-equipped labeling scheme that can faithfully recover the true labels of a small set of instances, and a label-efficient filtering process that, in conjunction with the small labeled set, can reliably infer the true labels of a large instance set.}
}
Endnote
%0 Conference Paper
%T Efficient PAC Learning from the Crowd with Pairwise Comparisons
%A Shiwei Zeng
%A Jie Shen
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-zeng22b
%I PMLR
%P 25973--25993
%U https://proceedings.mlr.press/v162/zeng22b.html
%V 162
%X We study crowdsourced PAC learning of threshold functions, where the labels are gathered from a pool of annotators, some of whom may behave adversarially. This is a challenging problem, and only recently have computationally and query-efficient PAC learning algorithms been established by Awasthi et al. (2017). In this paper, we show that by leveraging the more easily acquired pairwise comparison queries, it is possible to exponentially reduce the label complexity while retaining the overall query complexity and runtime. Our main algorithmic contributions are a comparison-equipped labeling scheme that can faithfully recover the true labels of a small set of instances, and a label-efficient filtering process that, in conjunction with the small labeled set, can reliably infer the true labels of a large instance set.
APA
Zeng, S. & Shen, J. (2022). Efficient PAC Learning from the Crowd with Pairwise Comparisons. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:25973-25993. Available from https://proceedings.mlr.press/v162/zeng22b.html.
