It’s Not What Machines Can Learn, It’s What We Cannot Teach

Gal Yehuda; Moshe Gabel; Assaf Schuster

It’s Not What Machines Can Learn, It’s What We Cannot Teach

Gal Yehuda, Moshe Gabel, Assaf Schuster

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:10831-10841, 2020.

Abstract

Can deep neural networks learn to solve any task, and in particular problems of high complexity? This question attracts a lot of interest, with recent works tackling computationally hard tasks such as the traveling salesman problem and satisfiability. In this work we offer a different perspective on this question. Given the common assumption that NP != coNP we prove that any polynomial-time sample generator for an NP-hard problem samples, in fact, from an easier sub-problem. We empirically explore a case study, Conjunctive Query Containment, and show how common data generation techniques generate biased data-sets that lead practitioners to over-estimate model accuracy. Our results suggest that machine learning approaches that require training on a dense uniform sampling from the target distribution cannot be used to solve computationally hard problems, the reason being the difficulty of generating sufficiently large and unbiased training sets.

Cite this Paper

BibTeX

@InProceedings{pmlr-v119-yehuda20a,
  title = 	 {It’s Not What Machines Can Learn, It’s What We Cannot Teach},
  author =       {Yehuda, Gal and Gabel, Moshe and Schuster, Assaf},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {10831--10841},
  year = 	 {2020},
  editor = 	 {III, Hal Daumé and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/yehuda20a/yehuda20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/yehuda20a.html},
  abstract = 	 {Can deep neural networks learn to solve any task, and in particular problems of high complexity? This question attracts a lot of interest, with recent works tackling computationally hard tasks such as the traveling salesman problem and satisfiability. In this work we offer a different perspective on this question. Given the common assumption that NP != coNP we prove that any polynomial-time sample generator for an NP-hard problem samples, in fact, from an easier sub-problem. We empirically explore a case study, Conjunctive Query Containment, and show how common data generation techniques generate biased data-sets that lead practitioners to over-estimate model accuracy. Our results suggest that machine learning approaches that require training on a dense uniform sampling from the target distribution cannot be used to solve computationally hard problems, the reason being the difficulty of generating sufficiently large and unbiased training sets.}
}

Endnote

%0 Conference Paper
%T It’s Not What Machines Can Learn, It’s What We Cannot Teach
%A Gal Yehuda
%A Moshe Gabel
%A Assaf Schuster
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh	
%F pmlr-v119-yehuda20a
%I PMLR
%P 10831--10841
%U https://proceedings.mlr.press/v119/yehuda20a.html
%V 119
%X Can deep neural networks learn to solve any task, and in particular problems of high complexity? This question attracts a lot of interest, with recent works tackling computationally hard tasks such as the traveling salesman problem and satisfiability. In this work we offer a different perspective on this question. Given the common assumption that NP != coNP we prove that any polynomial-time sample generator for an NP-hard problem samples, in fact, from an easier sub-problem. We empirically explore a case study, Conjunctive Query Containment, and show how common data generation techniques generate biased data-sets that lead practitioners to over-estimate model accuracy. Our results suggest that machine learning approaches that require training on a dense uniform sampling from the target distribution cannot be used to solve computationally hard problems, the reason being the difficulty of generating sufficiently large and unbiased training sets.

APA

Yehuda, G., Gabel, M. & Schuster, A.. (2020). It’s Not What Machines Can Learn, It’s What We Cannot Teach. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:10831-10841 Available from https://proceedings.mlr.press/v119/yehuda20a.html.

It’s Not What Machines Can Learn, It’s What We Cannot Teach

Abstract

Cite this Paper

Related Material