Learning from Human-Generated Lists

Kwang-Sung Jun, Jerry Zhu, Burr Settles, Timothy Rogers
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):181-189, 2013.

Abstract

Human-generated lists are a form of non-iid data with important applications in machine learning and cognitive psychology. We propose a generative model, sampling with reduced replacement (SWIRL), for such lists. We discuss SWIRL’s relation to standard sampling paradigms, provide the maximum likelihood estimate for learning, and demonstrate its value with two real-world applications: (i) In a "feature volunteering" task where non-experts spontaneously generate feature=>label pairs for text classification, SWIRL improves the accuracy of state-of-the-art feature-learning frameworks. (ii) In a "verbal fluency" task where brain-damaged patients generate word lists when prompted with a category, SWIRL parameters align well with existing psychological theories, and our model can classify healthy people vs. patients from the lists they generate.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-jun13,
  title = {Learning from Human-Generated Lists},
  author = {Jun, Kwang-Sung and Zhu, Jerry and Settles, Burr and Rogers, Timothy},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages = {181--189},
  year = {2013},
  editor = {Dasgupta, Sanjoy and McAllester, David},
  volume = {28},
  number = {3},
  series = {Proceedings of Machine Learning Research},
  address = {Atlanta, Georgia, USA},
  month = {17--19 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v28/jun13.pdf},
  url = {https://proceedings.mlr.press/v28/jun13.html},
  abstract = {Human-generated lists are a form of non-iid data with important applications in machine learning and cognitive psychology. We propose a generative model, sampling with reduced replacement (SWIRL), for such lists. We discuss SWIRL’s relation to standard sampling paradigms, provide the maximum likelihood estimate for learning, and demonstrate its value with two real-world applications: (i) In a "feature volunteering" task where non-experts spontaneously generate feature=>label pairs for text classification, SWIRL improves the accuracy of state-of-the-art feature-learning frameworks. (ii) In a "verbal fluency" task where brain-damaged patients generate word lists when prompted with a category, SWIRL parameters align well with existing psychological theories, and our model can classify healthy people vs. patients from the lists they generate.}
}
Endnote
%0 Conference Paper
%T Learning from Human-Generated Lists
%A Kwang-Sung Jun
%A Jerry Zhu
%A Burr Settles
%A Timothy Rogers
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-jun13
%I PMLR
%P 181--189
%U https://proceedings.mlr.press/v28/jun13.html
%V 28
%N 3
%X Human-generated lists are a form of non-iid data with important applications in machine learning and cognitive psychology. We propose a generative model, sampling with reduced replacement (SWIRL), for such lists. We discuss SWIRL’s relation to standard sampling paradigms, provide the maximum likelihood estimate for learning, and demonstrate its value with two real-world applications: (i) In a "feature volunteering" task where non-experts spontaneously generate feature=>label pairs for text classification, SWIRL improves the accuracy of state-of-the-art feature-learning frameworks. (ii) In a "verbal fluency" task where brain-damaged patients generate word lists when prompted with a category, SWIRL parameters align well with existing psychological theories, and our model can classify healthy people vs. patients from the lists they generate.
RIS
TY  - CPAPER
TI  - Learning from Human-Generated Lists
AU  - Kwang-Sung Jun
AU  - Jerry Zhu
AU  - Burr Settles
AU  - Timothy Rogers
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-jun13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 181
EP  - 189
L1  - http://proceedings.mlr.press/v28/jun13.pdf
UR  - https://proceedings.mlr.press/v28/jun13.html
AB  - Human-generated lists are a form of non-iid data with important applications in machine learning and cognitive psychology. We propose a generative model, sampling with reduced replacement (SWIRL), for such lists. We discuss SWIRL’s relation to standard sampling paradigms, provide the maximum likelihood estimate for learning, and demonstrate its value with two real-world applications: (i) In a "feature volunteering" task where non-experts spontaneously generate feature=>label pairs for text classification, SWIRL improves the accuracy of state-of-the-art feature-learning frameworks. (ii) In a "verbal fluency" task where brain-damaged patients generate word lists when prompted with a category, SWIRL parameters align well with existing psychological theories, and our model can classify healthy people vs. patients from the lists they generate.
ER  -
APA
Jun, K., Zhu, J., Settles, B. & Rogers, T. (2013). Learning from Human-Generated Lists. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):181-189. Available from https://proceedings.mlr.press/v28/jun13.html.
