Active Learning with Logged Data

Songbai Yan, Kamalika Chaudhuri, Tara Javidi
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5521-5530, 2018.

Abstract

We consider active learning with logged data, where labeled examples are drawn conditioned on a predetermined logging policy, and the goal is to learn a classifier on the entire population, not just conditioned on the logging policy. Prior work addresses this problem either when only logged data is available, or purely in a controlled random experimentation setting where the logged data is ignored. In this work, we combine both approaches to provide an algorithm that uses logged data to bootstrap and inform experimentation, thus achieving the best of both worlds. Our work is inspired by a connection between controlled random experimentation and active learning, and modifies existing disagreement-based active learning algorithms to exploit logged data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-yan18a,
  title = {Active Learning with Logged Data},
  author = {Yan, Songbai and Chaudhuri, Kamalika and Javidi, Tara},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {5521--5530},
  year = {2018},
  editor = {Dy, Jennifer and Krause, Andreas},
  volume = {80},
  series = {Proceedings of Machine Learning Research},
  month = {10--15 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v80/yan18a/yan18a.pdf},
  url = {https://proceedings.mlr.press/v80/yan18a.html},
  abstract = {We consider active learning with logged data, where labeled examples are drawn conditioned on a predetermined logging policy, and the goal is to learn a classifier on the entire population, not just conditioned on the logging policy. Prior work addresses this problem either when only logged data is available, or purely in a controlled random experimentation setting where the logged data is ignored. In this work, we combine both approaches to provide an algorithm that uses logged data to bootstrap and inform experimentation, thus achieving the best of both worlds. Our work is inspired by a connection between controlled random experimentation and active learning, and modifies existing disagreement-based active learning algorithms to exploit logged data.}
}
Endnote
%0 Conference Paper
%T Active Learning with Logged Data
%A Songbai Yan
%A Kamalika Chaudhuri
%A Tara Javidi
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-yan18a
%I PMLR
%P 5521--5530
%U https://proceedings.mlr.press/v80/yan18a.html
%V 80
%X We consider active learning with logged data, where labeled examples are drawn conditioned on a predetermined logging policy, and the goal is to learn a classifier on the entire population, not just conditioned on the logging policy. Prior work addresses this problem either when only logged data is available, or purely in a controlled random experimentation setting where the logged data is ignored. In this work, we combine both approaches to provide an algorithm that uses logged data to bootstrap and inform experimentation, thus achieving the best of both worlds. Our work is inspired by a connection between controlled random experimentation and active learning, and modifies existing disagreement-based active learning algorithms to exploit logged data.
APA
Yan, S., Chaudhuri, K., & Javidi, T. (2018). Active Learning with Logged Data. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5521-5530. Available from https://proceedings.mlr.press/v80/yan18a.html.
