Near-Optimal Machine Teaching via Explanatory Teaching Sets
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:1970-1978, 2018.
Modern applications of machine teaching for humans often involve domain-specific, non-trivial target hypothesis classes. To facilitate understanding of the target hypothesis, it is crucial for the teaching algorithm to use examples that are interpretable to the human learner. In this paper, we propose NOTES, a principled framework for constructing interpretable teaching sets, utilizing explanations to accelerate the teaching process. Our algorithm is built upon a natural stochastic model of learners and a novel submodular surrogate objective function, which it uses to greedily select interpretable teaching examples. We prove that NOTES is competitive with the optimal explanation-based teaching strategy. We further instantiate NOTES with a specific hypothesis class, which can be viewed as an interpretable approximation of any hypothesis class, allowing us to handle complex hypotheses in practice. We demonstrate the effectiveness of NOTES on several image classification tasks, for both simulated and real human learners. Our experimental results suggest that by leveraging explanations, one can significantly speed up teaching.
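To make the idea of greedy selection on a submodular objective concrete, the following is a minimal sketch and not the NOTES objective itself. It assumes a simplified, deterministic setting in which each teaching example rules out a known set of incorrect hypotheses, and it uses the number of eliminated hypotheses (a coverage function, which is monotone submodular) as a stand-in surrogate. The function name `greedy_teaching_set`, the `eliminates` mapping, and the example identifiers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def greedy_teaching_set(eliminates: Dict[str, Set[str]], budget: int) -> List[str]:
    """Greedily pick examples that rule out the most not-yet-eliminated hypotheses.

    `eliminates` maps an example id to the set of incorrect hypotheses that
    the example rules out (an assumed, simplified learner model).
    """
    chosen: List[str] = []
    covered: Set[str] = set()
    for _ in range(budget):
        best, best_gain = None, 0
        # Iterate in sorted order so that ties are broken deterministically.
        for example in sorted(eliminates):
            if example in chosen:
                continue
            gain = len(eliminates[example] - covered)  # marginal gain
            if gain > best_gain:
                best, best_gain = example, gain
        if best is None:  # no remaining example eliminates anything new
            break
        chosen.append(best)
        covered |= eliminates[best]
    return chosen


if __name__ == "__main__":
    # Hypothetical toy instance: three examples, five incorrect hypotheses.
    toy = {
        "x1": {"h1", "h2"},
        "x2": {"h2", "h3"},
        "x3": {"h3", "h4", "h5"},
    }
    print(greedy_teaching_set(toy, budget=2))
```

For monotone submodular objectives under a cardinality constraint, this kind of greedy rule carries the classical (1 - 1/e) approximation guarantee, which is the general reason submodular surrogates are attractive for selecting teaching sets.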