CURI: A Benchmark for Productive Concept Learning Under Uncertainty

Ramakrishna Vedantam, Arthur Szlam, Maximillian Nickel, Ari Morcos, Brenden M Lake
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:10519-10529, 2021.

Abstract

Humans can learn and reason under substantial uncertainty in a space of infinitely many compositional, productive concepts. For example, if a scene with two blue spheres qualifies as “daxy,” one can reason that the underlying concept may require scenes to have “only blue spheres” or “only spheres” or “only two objects.” In contrast, standard benchmarks for compositional reasoning do not explicitly capture a notion of reasoning under uncertainty or evaluate compositional concept acquisition. We introduce a new benchmark, Compositional Reasoning Under Uncertainty (CURI), that instantiates a series of few-shot, meta-learning tasks in a productive concept space to evaluate different aspects of systematic generalization under uncertainty, including splits that test abstract understanding of disentangling, productive generalization, learning Boolean operations, variable binding, etc. Importantly, we also contribute a model-independent “compositionality gap” to evaluate the difficulty of generalizing out-of-distribution along each of these axes, allowing objective comparison of the difficulty of each compositional split. Evaluations across a range of modeling choices and splits reveal substantial room for improvement on the proposed benchmark.
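To make the reasoning-under-uncertainty setting concrete, the following is a minimal, illustrative Python sketch (not code from the paper or the released benchmark) of the “daxy” example above: given a single positive scene and a uniform prior, several compositional hypotheses remain consistent, so a learner must spread posterior mass over all of them rather than commit to one. The hypothesis names and the uniform-prior scoring are assumptions made purely for illustration.

    # Illustrative sketch of concept learning under uncertainty (assumed example, not the CURI code).
    from collections import namedtuple

    Obj = namedtuple("Obj", ["color", "shape"])
    scene = [Obj("blue", "sphere"), Obj("blue", "sphere")]  # a positive example of "daxy"

    # Candidate compositional concepts that might explain the scene.
    hypotheses = {
        "only blue spheres": lambda s: all(o.color == "blue" and o.shape == "sphere" for o in s),
        "only spheres":      lambda s: all(o.shape == "sphere" for o in s),
        "only two objects":  lambda s: len(s) == 2,
    }

    # With one positive example and a uniform prior, every consistent hypothesis
    # keeps nonzero posterior mass, so the learner must reason under uncertainty.
    consistent = [name for name, h in hypotheses.items() if h(scene)]
    posterior = {name: 1.0 / len(consistent) for name in consistent}
    print(posterior)  # all three hypotheses remain equally plausible here

Running this prints a uniform posterior over the three hypotheses, which is exactly the ambiguity the benchmark's few-shot episodes are designed to probe.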

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-vedantam21a,
  title     = {CURI: A Benchmark for Productive Concept Learning Under Uncertainty},
  author    = {Vedantam, Ramakrishna and Szlam, Arthur and Nickel, Maximillian and Morcos, Ari and Lake, Brenden M},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {10519--10529},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/vedantam21a/vedantam21a.pdf},
  url       = {https://proceedings.mlr.press/v139/vedantam21a.html}
}
Endnote
%0 Conference Paper
%T CURI: A Benchmark for Productive Concept Learning Under Uncertainty
%A Ramakrishna Vedantam
%A Arthur Szlam
%A Maximillian Nickel
%A Ari Morcos
%A Brenden M Lake
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-vedantam21a
%I PMLR
%P 10519--10529
%U https://proceedings.mlr.press/v139/vedantam21a.html
%V 139
APA
Vedantam, R., Szlam, A., Nickel, M., Morcos, A. & Lake, B.M. (2021). CURI: A Benchmark for Productive Concept Learning Under Uncertainty. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:10519-10529. Available from https://proceedings.mlr.press/v139/vedantam21a.html.

Related Material

Download PDF: http://proceedings.mlr.press/v139/vedantam21a/vedantam21a.pdf